The popular open source narrative game development framework, Yarn Spinner, which is maintained by Secret Lab and a fabulous community, has reached version 1.0. As part of our 1.0 release, we’ve debuted 5 exciting new features:
Compiled Scripts — Yarn Spinner now compiles to a binary format.
Automatic Compiling — In Unity, your Yarn scripts will automatically be compiled when they change.
Line Tagging — You can automatically add unique tags to lines of dialogue, and generate a .csv file to send to translators with the click of a button.
pie_price = 3.14
num_pies = int(input("How many pies?"))
pie_owing = pie_price * num_pies
if pie_owing > 10:
    print("You're over the pie budget")
How do you test that the line that prints “you’re over the pie budget” can run? One way is to just run the program, type in a large number, and verify that you see it.
But what if you couldn’t ask for input? For example, maybe this part of the code is buried deep within a larger process, and reaching it is tricky; maybe the code under test is operating in a continuous integration environment, and no user input is available. What do you do then to ensure that this line is reachable?
Why, producing a formal proof, of course. It’s the only sensible way.
In this post, we’ll walk through the theory and practice of using symbolic execution, a static analysis technique, for bug discovery. In particular, we’ll focus on a specific type of bug: how can we prove that a line of code is, or is not, reachable?
How to solve it
Let’s start by reframing the question into something more formal:
Or, to put it another way:
Let’s work the problem by doing it by hand. Here’s the code again:
pie_price = 3.14
num_pies = int(input("How many pies?"))
pie_owing = pie_price * num_pies
if pie_owing > 10:
    print("You're over the pie budget")
We know from the first line that pie_price is 3.14. However, we don’t know the value of num_pies, because it depends upon user input. In order for any of the rest of the code to work, though, we need to have a label for the value stored in num_pies.
This is where the symbol in symbolic execution comes in: we’ll introduce a symbolic value – let’s call it 🥧 – and declare that the variable num_pies contains 🥧. We don’t know anything about what’s stored in 🥧, but we do know some facts about it.
Specifically, we know a single fact about it right now: 🥧 is an integer, which means that it supports any operation that other integers support: addition, multiplication, comparison, and so on.
Our next line, pie_owing = pie_price * num_pies, has a similar problem: we can’t know the value of pie_owing, because it’s the result of a multiplication between a known (or concrete) value and the symbolic value 🥧. So, what do we store in pie_owing? We’ll store the entire expression pie_price * 🥧 in there.
The final line of code before the print statement is a conditional: if pie_owing > 10. If we proceed on to the next line, then it follows that the value of pie_owing – whatever it is – is greater than 10.
We now have enough information to put together a collection of logical assertions that must be true in order to reach the print statement. They are:
pie_price = 3.14
num_pies = 🥧
pie_owing = pie_price × num_pies
pie_owing > 10
Great. Our next question is: can we demonstrate that these equations can all be true at the same time?
To answer our question, we’ll construct several equations that represent the constraints on the input that are in place when the print line is reached, and feed them into a solver; we can then ask the solver to check to see if they can be true at the same time.
from z3 import *
# Create the solver
s = Solver()
# Declare our variables: "pie_price", which we know the
# value of, "num_pies", which we don't, and "pies_owing", which depends upon the values of the other two
pie_price = Real('pie_price')
num_pies = Int('num_pies')
pies_owing = pie_price * num_pies
# Assert that pie_price is equal to 3.14
s.add(pie_price == 3.14)
# Assert that pies_owing is greater than 10
s.add(pies_owing > 10)
# Ask if these can be true at the same time
s.check() # returns "sat" - they can be!!
We’ve now demonstrated that in order to reach the line, print("You're over the pie budget"), a set of equations must be true at the same time; additionally, Z3 indicates that they can indeed be. Therefore, we’ve proved that the line is reachable, and we never needed to ask the user for input.
Incidentally, we can ask Z3 to produce a model of its solution, which means it will produce a value for all of the variables in question, including num_pies – the value we’d ordinarily get from the user. That is, Z3 can produce a value for num_pies that would cause the print statement to run.
s.model()[num_pies] # 4
Generating the Equations Automatically
In the previous example, we had to read through the code and manually produce the equations that are in place. Wouldn’t it be nicer, though, if we could have a system do this for us?
To do this, we’ll take advantage of the fact that Python code is very easy to disassemble into byte code. Using the dis module, we can take any Python function, and produce a listing of the byte code that represents it. Working with byte code is important, because byte code is significantly simpler than source code, and easier to analyse.
Once we have the byte code, we need to find a way to determine the possible paths through the code that execution can take, depending on the inputs given to the program. We then need to determine the constraints on the variables that affect the path; if at any point the constraints are not compatible with each other, the path is impossible. If all paths that reach a line of code are impossible, then the line of code is unreachable under any circumstance.
For example, consider this snippet of code:
i = 1
if i == 0:
There is theoretically a path of execution that goes from line 1, through line 2, and ends at line 3, but if you think about it, it would require the variable i to be equal to 0 and also to 1. This is impossible, and as a result, the path is impossible; because this is the only path that reaches line 3, that line is unreachable.
This means that our next problem is: given a block of code, how do we calculate the possible paths through that code?
Basic Blocks and Control Flow Graphs
As before, let’s start with a chunk of code, which we’ll use as our example.
def test(a):
    x = 0
    if a > 0 and a < 5:
        x = 1
    b = a + 1
    if x == 1 and b > 6:
        print("Hello")
Our question for this code is: can the final line of code, print("Hello"), ever be reached? And can the process of discovering this be automated?
Let’s start by asking dis for the byte code.
This produces something like this (I’ve truncated it to the first few lines):
Each one of these lines is a low-level instruction to the Python virtual machine. The Python VM is a stack machine, which means that the instructions work by pushing and popping values on a stack. For example, the LOAD_CONST and LOAD_FAST operations push values onto the stack (either a constant value or a value stored in a variable), while the COMPARE_OP operation pops two values off the stack, compares them, and pushes the result back onto the stack. Additionally, certain instructions are responsible for controlling the flow of execution: the POP_JUMP_IF_FALSE instruction pops a value off the stack, and if it evaluates to False, jumps to a numbered instruction; if it evaluates to True, it proceeds to the next instruction instead.
How, then, can we find the possible paths through the code? One popular approach is to decompose the stream of instructions into basic blocks: runs of instructions that are only ever entered at the start, and only ever exit at the end (that is, it is impossible for the program to jump to a point that’s in the middle of a basic block).
To determine these basic blocks, the instructions are scanned, and certain instructions are marked as leaders:
The first instruction is a leader.
Instructions that are the destination of a jump are leaders.
Instructions following a conditional jump are leaders.
Instructions following a ‘stop’ instruction are leaders.
Once you know the leaders, you can then group up the instructions according to the most recent leader.
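The leader rules above can be sketched in a few lines. The instruction stream here is a toy: the opcode names are borrowed from CPython, but the (opname, argument) encoding, with jump arguments given as list indices, is a simplification invented for this example.

```python
# Toy instruction stream; jump arguments are indices into this list
CODE = [
    ("LOAD_CONST", 0),          # 0
    ("STORE_FAST", "x"),        # 1
    ("LOAD_FAST", "a"),         # 2
    ("POP_JUMP_IF_FALSE", 6),   # 3
    ("LOAD_CONST", 1),          # 4
    ("STORE_FAST", "x"),        # 5
    ("LOAD_FAST", "x"),         # 6
    ("RETURN_VALUE", None),     # 7
]

CONDITIONAL_JUMPS = {"POP_JUMP_IF_FALSE", "POP_JUMP_IF_TRUE"}
UNCONDITIONAL_JUMPS = {"JUMP_FORWARD", "JUMP_ABSOLUTE"}
STOPS = {"RETURN_VALUE"}

def find_leaders(code):
    leaders = {0}  # the first instruction is a leader
    for index, (op, arg) in enumerate(code):
        if op in CONDITIONAL_JUMPS or op in UNCONDITIONAL_JUMPS:
            leaders.add(arg)  # the destination of a jump is a leader
        if op in CONDITIONAL_JUMPS or op in STOPS:
            if index + 1 < len(code):
                leaders.add(index + 1)  # so is the instruction that follows
    return sorted(leaders)

def split_blocks(code):
    leaders = find_leaders(code)
    # Each block runs from one leader up to, but not including, the next
    ends = leaders[1:] + [len(code)]
    return [list(range(start, end)) for start, end in zip(leaders, ends)]

leaders = find_leaders(CODE)
blocks = split_blocks(CODE)
print(leaders)
print(blocks)
```

Each block is represented as the list of instruction indices it contains, which keeps the grouping step easy to inspect.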
Next up, you form the connections between the blocks. Blocks have successors (blocks they lead to), and predecessors (blocks that lead to them.)
Blocks that end in an unconditional jump have one successor – the target of the jump.
Blocks that end in a conditional jump have two successors – the target of the jump, and the next instruction.
Blocks that end in a ‘stop’ instruction have no successors.
All other blocks have a single successor: the following instruction’s block.
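These four successor rules translate fairly directly into code. In this sketch, blocks are stored in a dictionary keyed by the index of their leader, and jump arguments name the leader of the target block; that layout is an assumption made for illustration, not how CPython encodes jumps.

```python
CONDITIONAL_JUMPS = {"POP_JUMP_IF_FALSE", "POP_JUMP_IF_TRUE"}
UNCONDITIONAL_JUMPS = {"JUMP_FORWARD", "JUMP_ABSOLUTE"}
STOPS = {"RETURN_VALUE"}

# Hypothetical blocks: leader index -> list of (opname, argument)
BLOCKS = {
    0: [("LOAD_FAST", "a"), ("POP_JUMP_IF_FALSE", 4)],
    2: [("LOAD_CONST", 1), ("STORE_FAST", "x")],
    4: [("LOAD_FAST", "x"), ("POP_JUMP_IF_FALSE", 8)],
    6: [("LOAD_CONST", "Hello"), ("JUMP_FORWARD", 8)],
    8: [("LOAD_CONST", None), ("RETURN_VALUE", None)],
}

def successors(blocks):
    order = sorted(blocks)
    result = {}
    for position, leader in enumerate(order):
        op, arg = blocks[leader][-1]  # only the final instruction matters
        following = order[position + 1] if position + 1 < len(order) else None
        if op in STOPS:
            result[leader] = []
        elif op in UNCONDITIONAL_JUMPS:
            result[leader] = [arg]
        elif op in CONDITIONAL_JUMPS:
            result[leader] = sorted({arg, following} - {None})
        else:
            result[leader] = [] if following is None else [following]
    return result

def predecessors(succ):
    # Invert the successor map: who leads to each block?
    preds = {leader: [] for leader in succ}
    for source, targets in succ.items():
        for target in targets:
            preds[target].append(source)
    return preds

succ = successors(BLOCKS)
preds = predecessors(succ)
print(succ)
print(preds)
```

Predecessors are derived by inverting the successor map, so the two views of the graph cannot drift out of sync.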
With these rules in mind, we can take the byte code for our example function, and figure out the blocks.
Given these blocks and the way they link together, we can generate the control flow graph of the program. This graph shows how the blocks connect, and allows us to find the paths that execution can take through the program.
We’re now ready to start testing the paths that lead to the print("Hello") function call, which is the second-to-last basic block (it’s the blue block, second from the right of the above image.) For the purposes of this article, we’ll select one of them arbitrarily, and prove that the path is valid or not; the same steps apply for testing any path.
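Enumerating the candidate paths to a block is a graph walk. The sketch below uses a small hand-written graph (block labels and edges are made up for illustration, not taken from the real disassembly) and a depth-first search that refuses to revisit a block, which is one simple way to keep loops from recursing forever.

```python
# Hypothetical control flow graph: block label -> successor labels
CFG = {0: [2, 4], 2: [4], 4: [6, 8], 6: [8], 8: []}

def paths_to(cfg, start, target, prefix=()):
    path = prefix + (start,)
    if start == target:
        yield path
        return
    for nxt in cfg[start]:
        if nxt not in path:  # skip back edges so loops terminate
            yield from paths_to(cfg, nxt, target, path)

# Every acyclic route from the entry block to block 6
paths = list(paths_to(CFG, 0, 6))
print(paths)
```

Each resulting tuple is one path whose constraints can then be collected and handed to the solver.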
Finding impossible paths
In order to perform normal execution of the code, Python steps through each instruction, and performs them as normal – loading data into memory, requesting that the system get input, and so on. However, this only works when we’re running the entire program, which includes all of the work done to decide what parameters to use when calling the function test. When we perform symbolic execution, and are examining only portions of the program, we no longer have the ability to know concrete values for every variable.
Let’s take a closer look at the first basic block:
The third instruction in this disassembly loads the contents of the variable a, and pushes it onto the stack. However, a is a parameter to the function, which means it’s not possible to get a concrete value for the variable when considering the code in isolation.
This means that when we encounter the instruction LOAD_FAST a, we need to introduce a new symbolic value. That’s not the only symbolic value we need to track, though: on lines 1 and 2, we load the number 0, and store it in the variable x. This means that we need to declare to Z3 that the variable x exists, and assert that it is equal to 0.
Additionally, if we’re testing a specific path through the code, we already know whether the POP_JUMP_IF_FALSE will jump or not. In the case of our selected path, if we’re proceeding from the first block to the second, it means that we’re taking the path in which the value on the stack is True. This means that we also assert that the result of comparing whether a is greater than 0 is True.
In effect, setting a variable now means creating and recording an assertion that the variable contains a certain value, and when encountering a conditional jump, we assert that its condition is true (if we’re taking the true path), or false (if we’re not).
We continue this execution, creating additional constraints on the values as we encounter instructions that interact with them, and at the end of each block, we feed them into Z3 and ask if the assertions are compatible with each other. If they’re not, then the path is impossible, and we try again with a different path. If all of the paths that reach a block are impossible, then that block is unreachable under any circumstances.
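In the specific case of our example, the line print("Hello") is unreachable. Reaching it requires x to equal 1, which only happens when a is less than 5, and also requires b to be greater than 6, which means a must be greater than 5. No value of a is both less than 5 and greater than 5 at the same time.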
Symbolic execution is really fun and useful, but it isn’t without its drawbacks. In particular:
It’s vulnerable to an explosion in the number of paths, especially when looping (and especially if the code can potentially loop infinitely)
If the same region of memory is referred to by two variables, it can be challenging for the analyser to detect this condition
Elements in a collection require special handling; do you treat the collection itself as a value, or the values in the collection as individual values?
It’s a lot more challenging in dynamically typed situations, where you don’t necessarily know the operations that can be performed on the values that you receive.
Nonetheless, it’s a fascinating field to play in, and can be tremendously rewarding. The video of the talk that I gave at PyCon AU 2019 is embedded below.
There’s a few reasons why I’ve been streaming my code. The field that I work in, independent game development, can be a pretty personality-oriented area. Because of this, it’s often important to develop the 😎 personal brand 😎. Videos are great at this, because it’s an opportunity to have your face and voice attached to the cool things you’re working on.
Streaming your code is also an excellent way to stay very, very focused on a single task. If you’re coding as part of a performance – and live streaming is very much a performance – you’re a lot less likely to get distracted and look at the internet for four hours.
Finally, having an audience of people looking at your code means you can do something I like to think of as multicore pair programming: you often get great feedback and advice from people watching you code. I’ve solved a number of bugs thanks to input from people who are watching me work.
Where Should You Stream?
There’s a number of different options for streaming sites. The best-known sites for the kind of streaming that I do are:
Twitch: Very games focused, and a very large population. (I do my streams here.)
Mixer: Microsoft’s streaming site. Also games focused, but a smaller population; designed for very low latency.
YouTube Live: General video focused, and seems to be more designed for ‘event’-style broadcasts.
I use Twitch, largely because I work in games, so I piggy-back on the existing topic interest. It’s also very well supported by the various streaming tools and services, and brand recognition is high – if someone describes themselves as a streamer, it’s likely that they stream on Twitch.
How Do You Stream?
You don’t need a huge amount of software to stream; at minimum, you just need something that can upload a stream to your platform. The software that I use is OBS, which is a very nice (and very free) package that:
Captures your display and webcam
Composes it into a scene
Compresses and uploads the stream to your platform.
As far as gear goes, you also don’t need much. It’s very tempting to assume that you need lots of expensive equipment in order to be professional, but you really don’t – at minimum, all you need is your computer, and an internet connection.
If you have a webcam, that’s great! If you have a good microphone, that’s also great! But you don’t need it, and I want to be clear that you should pointedly ignore anyone trying to convince you that you do.
When I stream from my office, I happen to use a decent headset mic, so that I don’t have to think about it as much, plus a USB audio interface that lets me connect it to my computer. When I’m feeling ~fancy~, I connect a camera via an HDMI-USB interface, so that I can show my phone. That’s really it!
Because the content that I stream doesn’t have its own soundtrack, I play music while I work. This is for two reasons: it shows off my frankly exquisite taste, and also means that there’s no dead air when I’m not speaking.
However, when you’re doing broadcast work, you can’t just stream your music library – you don’t have the license for it, your videos will get muted, and you run the risk of your account being banned.
This post doesn’t exist without Suz Hinton’s write-up of her live coding setup. It’s got specific advice on setup, performance, and management of live coding, and was instrumental in getting me started. Go read it!
One of the reasons why we think Yarn Spinner is powerful is that it’s designed to focus on as little as possible. There’s literally only three things that Yarn Spinner can do with your game: it can send lines of dialogue, ask for a selection among a group of options, and send a command for your game to interpret.
The idea behind this is that your game can add what it needs to on top, rather than being forced to fit inside the ideas that we had when we first wrote the system. There are several excellent dialogue systems that are designed to be very good at operating within the structure of a role-playing game, or a choose-your-own-adventure system (Twine is a great example of this last one), but for Yarn Spinner, we wanted the system to be more generalised, and able to be applied to a wide variety of games systems.
The consequence of doing that, however, is that a game needs to do more work to add the features that it needs. While we built Yarn Spinner with Night in the Woods (NITW) in mind, there are several features that are quite specific to the game.
In this post, we’ll highlight some of the cooler things that Alec Holowka, the lead developer of Night in the Woods, built on top of the Yarn Spinner system to support its needs.
Here’s an example of the kind of dialogue that exists in Night in the Woods. Here’s Mae and Gregg, planning on going to Donut Wolf:
Gregg: They got pancakes now! :)
//angus walks across the screen and off the left//
<<walk Angus AngusOffLeft>>
Gregg: \o/ D:
Gregg: RIDE THE CHARIOT!
<<dilate Mae .85 .5>>
Mae: TO DONUT HELL!!! \o/
This dialogue is part of the raw source code of Night in the Woods. When the scene loads, the dialogue attached to the characters is parsed; first, it’s parsed into a data structure called a parse tree, which is then converted into a simple binary representation. This process is quite similar to how other code gets compiled into a binary that can be run on a machine.
At the end of this process, the resulting bytecode for the above dialogue snippet is this:
Each line of this bytecode is then executed, one after another, by Yarn Spinner, which sends along the lines, options and commands that it encounters. This is part of the standard behaviour of how Yarn Spinner works in any game; where Night in the Woods differs is what it does with its lines.
Night in the Woods shows its dialogue in speech balloons that are attached to the characters. In order to correctly position them, NITW needs to know which character a line should be shown attached to; to figure that out, NITW looks at the start of each line, and figures out if it begins with a name followed by a colon.
If it does, then the game checks to see if that character is in the scene; if they are, the speech balloon for the line is attached to that character.
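The check described above might look something like the following. This is a Python sketch of the idea only, not the game's actual code; the function name, the regular expression, and the assumption that names are made of letters are all invented for illustration.

```python
import re

# Assumption: a speaker name is a run of letters at the very start of
# the line, immediately followed by a colon.
SPEAKER = re.compile(r"^(?P<name>[A-Za-z]+):\s*(?P<text>.*)$")

def split_speaker(line, characters_in_scene):
    """Return (speaker, text); speaker is None if no balloon owner is found."""
    match = SPEAKER.match(line)
    if match and match.group("name") in characters_in_scene:
        return match.group("name"), match.group("text")
    return None, line

scene = {"Mae", "Gregg"}
print(split_speaker("Gregg: RIDE THE CHARIOT!", scene))
print(split_speaker("Angus: Hey.", scene))
```

A line whose prefix names a character that is not in the scene is returned unchanged, leaving the game free to decide how to display it.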
Emoticons, or “what we used to use before emoji were a thing”, are those little faces that are composed out of plain text – stuff like :) and :(. Night in the Woods uses these to control player expressions, by looking for any emoticons that it recognises. When it encounters one, it triggers a corresponding animation action, such as animating from the neutral expression to a smile.
There are several different kinds of emoticons. In addition to the smileys that control facial expressions, the game also includes several gestures as well; for example, whenever Mae puts her hands on her hips, it’s because the line included the emoticon “<o>”; when she throws her hands up into the air, it’s triggered by “\o/”.
This is true for other characters as well. When Gregg starts flailing his arms after Mae meets him in the Snack Falcon at the start of the game, it’s the result of using the emoticon sequence “:) \o/”.
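As a rough illustration of the idea, a line can be scanned for known emoticons, with each one mapped to an animation. Everything here is hypothetical: the emoticons come from the examples above, but the animation names, the table, and the choice to strip emoticons from the displayed text are assumptions made for this Python sketch.

```python
# Hypothetical emoticon-to-animation table; the animation names are invented
EMOTICONS = {
    ":)": "smile",
    ":(": "frown",
    "D:": "shock",
    "<o>": "hands_on_hips",
    "\\o/": "hands_in_air",
}

def extract_emoticons(text):
    """Split a line into the words to display and the animations to play."""
    words = text.split()
    animations = [EMOTICONS[word] for word in words if word in EMOTICONS]
    spoken = " ".join(word for word in words if word not in EMOTICONS)
    return spoken, animations

print(extract_emoticons("They got pancakes now! :)"))
print(extract_emoticons("\\o/ D:"))
```

A line made up entirely of emoticons yields no displayed text and only animations, which matches lines in the sample dialogue that consist of gestures alone.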
Night in the Woods responds to custom syntax within lines that lets a writer mark up the dialogue for visual effects. For example, when a character needs to shout, the line contains markup like this:
This causes the specific range of letters to be delivered in a shaky style. The markup also supports effects like character size, color, and balloon position, as well as fun effects like making a line of text wave.
Finally, Night in the Woods makes extensive use of commands. In Yarn Spinner, if you wrap some text in <<angle brackets>>, it won’t be sent to the game like a line of dialogue. Instead, it’s delivered as a command; the intent is that the game will manually parse its contents, and perform some action in the game.
Every time you see a character do anything besides talk, it’s the result of a command. There are a huge number of possible commands in the game; on top of the usually expected actions like sit, stand, walk, and jump, commands can also be used to control where and how a character is looking, the dilation of their eyes, and whether they’re visible or not. This last command is extremely important, since it’s used to control whether a character is present in a scene on a given day.
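A game-side command handler can be quite small. The Python sketch below assumes the command text reaches the game as a single string with the angle brackets already removed; the handler table and its function names are hypothetical, and the numeric arguments to dilate are passed through as opaque strings because their meaning is not described here.

```python
def parse_command(text):
    """Split command text into a command name and its arguments."""
    parts = text.split()
    if not parts:
        raise ValueError("empty command")
    return parts[0], parts[1:]

# Hypothetical handlers; a real game would move characters here.
performed = []

def walk(character, destination):
    performed.append(("walk", character, destination))

def dilate(character, *values):
    performed.append(("dilate", character, values))

HANDLERS = {"walk": walk, "dilate": dilate}

def run_command(text):
    name, args = parse_command(text)
    HANDLERS[name](*args)

run_command("walk Angus AngusOffLeft")
run_command("dilate Mae .85 .5")
print(performed)
```

Keeping the dispatch table separate from the parser means new commands can be added without touching the parsing code.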
What’s interesting about NITW is that just about all of the gameplay logic is driven through Yarn Spinner. The language wasn’t designed to be a general game logic control system but it turns out it’s pretty good at this. That’s cool.
Taking it further
Several of these are very specific to NITW and its systems, while others are things that could probably apply to most games. As we continue to work on Yarn Spinner, we’re looking forward to making the project the best dialogue system it can be. For more information on Yarn Spinner, visit the project’s page on GitHub at https://github.com/thesecretlab/YarnSpinner.