Analysing Performance Problems with Xcode Performance Tools, Part 1

This is the first post in a two-part series. Stay tuned for the second part.

One of the projects that we’re most excited about is our ongoing work to bring Night in the Woods to iOS. This has been an enormous amount of technical effort, and has led to a bunch of very cool spin-off projects – for example, Yarn Spinner only exists because we built it for Night in the Woods, and the research we’ve been doing in sprite compression to fit the game into a tiny amount of memory has been a blast.

A screenshot of Night in the Woods.
Night in the Woods.

One of the things we’re doing right now is improving the performance of the game on the device. Just like when you’re building a game for PCs or consoles, you want your game to be running at a high, stable frame rate. For mobile devices, this is a little more complex – you also have to consider the impact that the game has on the phone’s battery, because nobody wants to play a game and then find that their phone’s about to die.

Another important element of performance for mobile games is thermal impact. When the phone’s running a game, it’s making the hardware do a lot of work, and this makes it heat up. Modern systems – that is, anything made in the last 40 years – detect when they’re under thermal stress, and reduce their performance to avoid damage. This means that you might be able to achieve a solid 60 frames per second when you start playing, but that might get laggy in a couple minutes (or seconds!) of play. (Thermal stress is also a consideration for PCs and consoles, but it’s more critical for phones due to the reduced amount of space inside the chassis, the lack of active cooling, and the fact that you’re holding it in your hands.)

Spotting the Problem

The first thing that we noticed was that the reported frames per second in the game was low.

(We use an FPS counter called Graphy to visualise FPS. Graphy’s great – it’s easy to set up, reliable, low-impact, and open source. You should use it!)

The FPS counter was reporting about 20 to 40 frames per second on my iPhone X, depending on the scene. On top of that, the frame rate was inconsistent.

Inconsistent frame rates in the Towne Centre East scene. Note the slightly jerky camera movement.

What was going on? Night in the Woods, despite its gorgeous visual style, doesn’t actually have hugely complex scenes. When diagnosing a problem like this, the first thing to do is to figure out where the bottleneck is – on the CPU, or the GPU.

Finding the Bottleneck

Xcode has a very simple tool for checking which part of the system is under the heaviest load. When running the game from Xcode, you can just click on the Debug navigator, which will show you a summary of the app’s performance.

A screenshot of Xcode's debug view. It shows the CPU at 79%, memory usage at 859MB, energy usage as Very High, disk usage at 80 KB/s, network usage at 2.9MB/s, and a frames per second count of 37.

The CPU usage and energy impact here are pretty high. Not enormously high, considering it’s a game, but still high. As the game itself reports, it’s barely achieving a 30 FPS frame rate. When we select the FPS element, we get some more detail:

A screenshot of the GPU load level. It shows the tiler's load at 7%, renderer at 86%, device utilisation at 100%, and CPU and GPU both taking 28.4 milliseconds to render.

The FPS report is showing that the device is 100% utilised, and that the CPU and GPU are both taking a full 28 milliseconds to produce the frames. The renderer is under significantly more load than the tiler, which effectively means that most of the work is being done by the fragment shaders, and not by having to deal with a lot of geometry (something that we’d expect, given that the polygon count of Night in the Woods, like most other 2D games, is not terribly high.)
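
As a quick sanity check on those numbers, a frame time converts to a frame rate by dividing it into 1000 milliseconds. The helper names below are ours, made up for illustration; they aren’t part of Xcode or Instruments.

```python
def fps_from_frame_time(frame_time_ms):
    """Frames per second achievable if every frame takes frame_time_ms."""
    return 1000.0 / frame_time_ms


def frame_budget_ms(target_fps):
    """Time available to produce each frame at a given target frame rate."""
    return 1000.0 / target_fps


measured_ms = 28.4  # CPU and GPU frame time reported by Xcode above

print(round(fps_from_frame_time(measured_ms), 1))  # 35.2
print(round(frame_budget_ms(60), 1))  # 16.7
print(round(frame_budget_ms(30), 1))  # 33.3
```

A 28.4 millisecond frame is well over the budget for 60 frames per second, but fits inside the budget for 30.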

Even though the CPU and GPU times were the same, this doesn’t necessarily mean that they’re under the same degree of load. Switching over to the CPU report showed that the CPU wasn’t really at maximum load (though this can be deceiving, since it may have been the case that a single core was at max load.)

The CPU load indicator, showing 87%.

The analysis that we’d seen so far seemed to suggest that the GPU might be the best place to look, so we pulled out one of the two biggest and baddest tools in the chest: Instruments.

The application icon for Instruments.

Instruments can show a huge amount of data about the performance of an app. Happily, recent versions of the app come with the Game Performance template, which sets up a number of recorders that relate to how a game performs.

The instruments template selector. The "game performance" template is selected.

So, we ran our scene while recording data, and got back a simply enormous amount of data. It’s actually rather pretty.

There’s a lot of charts here, but the key thing to look at is at the bottom of the window, where the GPU state and display information is shown. Let’s zoom in on that.

A close-up view showing performance data from Instruments.

The purple bar at the top shows when the GPU was busy doing something. The bar is entirely filled, which means that there was no point in the run when the GPU was allowed to be idle. That’s bad, because when the GPU is active, it’s pulling energy out of the battery, and heating up the phone. We already knew this, because Xcode’s report was showing us that the device utilisation was 100%, but it’s nice to see this in a little more detail.

Where we start to see more useful information is in the ‘Display’ instrument. The Display instrument shows four charts:

  1. The current frame shown on the screen
  2. When a new frame was delivered to the screen, ready to be shown
  3. When the screen vsync’ed (that is, when it attempted to swap to the next available frame)
  4. Any frame stutters that Instruments found. There aren’t any in this screenshot, but that doesn’t mean that the display isn’t stuttering.

Take a close look at the top row of the Display instrument. It’s showing the duration of each frame on screen, and they’re not all the same. Some are 33 milliseconds, some are 16. That’s what’s causing the janky appearance.

What’s interesting about this is that the frames are taking the same amount of time to be produced! We know this because the second row, labelled ‘scaler’, shows that each frame is being submitted to the GPU after about the same amount of time. The reason why some frames last for 33 milliseconds and some last for 16 is due to the fixed rate at which the display is swapping to the next available frame.

Understanding Frame Judder

To figure this out, let’s look at the timing information for four frames.

A close-in screenshot of Instruments, showing the timing of four frames, coloured blue, green, orange and blue again.

Here’s what’s happening at each of these four steps.

  1. The blue frame is on screen. The next frame, shown in green, is submitted to the display.
  2. The display hits its next vsync point. The green frame is ready, so it’s shown to the user.
  3. The orange frame is submitted to the screen, but it’s just too late for vsync. The screen therefore has no new frame to show, so it keeps showing the green frame for another sync interval. The orange frame is eventually shown at the next vsync.
  4. Meanwhile, the blue frame was being drawn, and it’s ready to go right away. It’s submitted to the display, and drawn right after that. The orange frame was only on screen for a single sync interval.

The problem here is not that the frames are taking too long to draw. The problem is that the game is trying to send them to the screen at too high a rate. Ironically, in order to make the game smoother, we need to reduce the frame rate.

This was a one line change:

Application.targetFrameRate = 30;

The result was this:

A screenshot of Instruments, showing frame timing data. The timing of the frames is more consistent.

There are two things to note here.

First, the purple area now has gaps in it, indicating periods of time when the GPU was not active. This is a good thing! Less energy being drawn from the battery, and less heat.

Secondly, all of the frames are on screen for a consistent amount of time. They’re never missing a vsync, because the game isn’t trying to get them onto the screen as fast as it can. The game can now focus on operating at a consistent 30 frames per second.

The result is lower power consumption, and a smoother gameplay experience.
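
To see why capping the frame rate evens things out, here’s a small simulation. It’s a simplified model, not a reproduction of what Instruments measures: it assumes a 60 Hz display, a game that finishes a frame at a perfectly steady interval, and that each frame is shown at the first vsync at or after it’s ready. The function name is made up for this sketch.

```python
from fractions import Fraction
from math import ceil

VSYNC_MS = Fraction(1000, 60)  # assumed 60 Hz display refresh interval


def on_screen_intervals(frame_interval_ms, frame_count):
    """Return how many vsync intervals each frame stays on screen.

    Frame k is ready at (k + 1) * frame_interval_ms, and is shown at the
    first vsync at or after that moment.
    """
    shown_at = [
        ceil((k + 1) * frame_interval_ms / VSYNC_MS)
        for k in range(frame_count + 1)
    ]
    return [later - earlier for earlier, later in zip(shown_at, shown_at[1:])]


# Uncapped: a new frame is ready every 28.4 ms.
uncapped = on_screen_intervals(Fraction(284, 10), 8)
# Capped at 30 FPS: a new frame is submitted every 33.3 ms.
capped = on_screen_intervals(Fraction(1000, 30), 8)

print(uncapped)  # [2, 2, 1, 2, 2, 1, 2, 2]
print(capped)    # [2, 2, 2, 2, 2, 2, 2, 2]
```

In the uncapped run, frames alternate between two vsync intervals (about 33 milliseconds) and one (about 16), which is the uneven pattern from the first trace. In the capped run, every frame gets exactly two intervals.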

We’re Not Done Yet

But this was only part of the problem. As I mentioned earlier, the scenes in Night in the Woods are not incredibly complex; why was it taking so long to produce the frames? To solve that question, we needed to take a much closer look at the internals of how the GPU was drawing each frame.

We’ll look at how we achieved even better results in the next post. In the meantime, I’ll tease you with this:

A screenshot of Instruments, showing significantly less GPU usage.

Coding in Public

Recently, I’ve been live-streaming development sessions of Night in the Woods. I’m really enjoying it, and I thought I’d write up some notes on how I’ve done it, and give some tips I’ve picked up on how to get the most out of it.

Why Should You Code In Public?

There’s a few reasons why I’ve been streaming my code. The field that I work in, independent game development, can be a pretty personality-oriented area. Because of this, it’s often important to develop the 😎 personal brand 😎. Videos are great at this, because it’s an opportunity to have your face and voice attached to the cool things you’re working on.

Streaming your code is also an excellent way to stay very, very focused on a single task. If you’re coding as part of a performance – and live streaming is very much a performance – you’re a lot less likely to get distracted and look at the internet for four hours.

Finally, having an audience of people looking at your code means you can do something I like to think of as multicore pair programming: you often get great feedback and advice from people watching you code. I’ve solved a number of bugs thanks to input from people who are watching me work.

Where Should You Stream?

There’s a number of different options for streaming sites. The best-known sites for the kind of streaming that I do are:

  • Twitch: Very games focused, and a very large population. (I do my streams here.)
  • Mixer: Microsoft’s streaming site. Also games focused, but a smaller population; designed for very low latency.
  • YouTube Live: General video focused, and seems to be more designed for ‘event’-style broadcasts.

I use Twitch, largely because I work in games, so I piggy-back on the existing topic interest. It’s also very well supported by the various streaming tools and services, and brand recognition is high – if someone describes themselves as a streamer, it’s likely that they stream on Twitch.

How Do You Stream?

You don’t need a huge amount of software to stream; at minimum, you just need something that can upload a stream to your platform. The software that I use is OBS, which is a very nice (and very free) package that:

  • Captures your display and webcam
  • Composes it into a scene
  • Compresses and uploads the stream to your platform.

As far as gear goes, you also don’t need much. It’s very tempting to assume that you need lots of expensive equipment in order to be professional, but you really don’t – at minimum, all you need is your computer, and an internet connection.

If you have a webcam, that’s great! If you have a good microphone, that’s also great! But you don’t need it, and I want to be clear that you should pointedly ignore anyone trying to convince you that you do.

When I stream from my office, I happen to use a decent headset mic, so that I don’t have to think about it as much, plus a USB audio interface that lets me connect it to my computer. When I’m feeling ~fancy~, I connect a camera via an HDMI-USB interface, so that I can show my phone. That’s really it!

Because the content that I stream doesn’t have its own soundtrack, I play music while I work. This is for two reasons: it shows off my frankly exquisite taste, and also means that there’s no dead air when I’m not speaking.

However, when you’re doing broadcast work, you can’t just stream your music library – you don’t have the license for it, your videos will get muted, and you run the risk of your account being banned.

Instead, stream music that is licensed for broadcast. I happen to play music that I’ve received direct permission from the composer to play (such as Alec Holowka’s superb soundtrack to Night in the Woods), or Pretzel, a streaming service that plays rather good licensed-for-broadcast music.

Where To Learn More

This post doesn’t exist without Suz Hinton’s write-up of her live coding setup. It’s got specific advice on setup, performance, and management of live coding, and was instrumental in getting me started. Go read it!

I hope this has gotten you interested in this, and if you start streaming yourself, I’d be delighted if you let me know!

How Night in the Woods Uses Yarn Spinner

We recently announced that we’re building Night in the Woods for mobile! We’re super excited about this, so we thought that we’d share a bunch of technical behind-the-scenes stuff on our blog over the coming weeks and months. This is the first of those posts! 

Yarn Spinner is the dialogue engine that we wrote, and was used in Night in the Woods. It’s open source, freely available, and we think it’s pretty powerful.

One of the reasons why we think Yarn Spinner is powerful is that it’s designed to focus on as little as possible. There’s literally only three things that Yarn Spinner can do with your game: it can send lines of dialogue, ask for a selection among a group of options, and send a command for your game to interpret.

The idea behind this is that your game can add what it needs to on top, rather than being forced to fit inside the ideas that we had when we first wrote the system. There are several excellent dialogue systems that are designed to be very good at operating within the structure of a role-playing game, or a choose-your-own-adventure system (Twine is a great example of this last one), but for Yarn Spinner, we wanted the system to be more generalised, and able to be applied to a wide variety of games systems.

The consequence of doing that, however, is that a game needs to do more work to add the features that it needs. While we built Yarn Spinner with NITW in mind, there are several features that are quite specific to the game. 

In this post, we’ll highlight some of the cooler things that Alec Holowka, the lead developer of Night in the Woods, built on top of the Yarn Spinner system to support its needs.

Example Dialog

Here’s an example of the kind of dialogue that exists in Night in the Woods. Here’s Mae and Gregg, planning on going to Donut Wolf:

Gregg: They got pancakes now! :)
<<close>>
//angus walks across the screen and off the left//
<<walk Angus AngusOffLeft>>
<<wait 3>>
Angus: fine.
<<lookUp Mae>>
<<lookUp Gregg>>
Gregg: \o/ D:
Gregg: RIDE THE CHARIOT!
<<dilate Mae .85 .5>>
Mae: TO DONUT HELL!!! \o/
<<runNextLinesTogether 2>>
Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]
Gregg: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]

This dialogue is part of the raw source code of Night in the Woods. When the scene loads, the dialogue attached to the characters is parsed; first, it’s parsed into a data structure called a parse tree, which is then converted into a simple binary representation. This process is quite similar to how other code gets compiled into a binary that can be run on a machine.

At the end of this process, the resulting bytecode for the above dialogue snippet is this:

Node GreggFQ4Intro:
     0   L0:
             RunLine         GreggFQ4Intro-0           
             RunCommand      close                           
             RunCommand      walk Angus AngusOffLeft                      
             RunCommand      wait 3                          
     5       RunLine         GreggFQ4Intro-1          
             RunCommand      lookUp Mae                      
             RunCommand      lookUp Gregg                      
             RunLine         GreggFQ4Intro-2         
             RunLine         GreggFQ4Intro-3        
    10       RunCommand      dilate Mae .85 .5                      
             RunLine         GreggFQ4Intro-4       
             RunCommand      runNextLinesTogether 2                      
             RunLine         GreggFQ4Intro-5      
             RunLine         GreggFQ4Intro-6     
    15       RunCommand      close                           
             RunCommand      irisOut 1 wait                      
             RunCommand      sectionTitle GreggFQ4Intro BeaCar                      
    18       Stop                                            

String table:
GreggFQ4Intro-0: 
    Gregg: They got pancakes now! :) (GreggFQ4Intro:1)
GreggFQ4Intro-1: 
    Angus: fine. (GreggFQ4Intro:6)
GreggFQ4Intro-2: 
    Gregg: \o/ D: (GreggFQ4Intro:9)
GreggFQ4Intro-3: 
    Gregg: RIDE THE CHARIOT! (GreggFQ4Intro:10)
GreggFQ4Intro-4: 
    Mae: TO DONUT HELL!!! \o/ (GreggFQ4Intro:12)
GreggFQ4Intro-5: 
    Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOOOOOOO!![/shake] (GreggFQ4Intro:14)
GreggFQ4Intro-6: 
    Gregg: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOOOOOOO!![/shake] (GreggFQ4Intro:15)
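
To make the shape of that output a little more concrete, here’s a rough sketch of how a script could be turned into instructions and a string table. It’s a deliberate simplification and not Yarn Spinner’s actual compiler: it skips the parse tree entirely and translates line by line, and the function name and tuple layout are made up for illustration.

```python
def compile_node(node_name, source):
    """Translate dialogue source into (instructions, string_table)."""
    instructions = []
    strings = {}
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        if line.startswith("<<") and line.endswith(">>"):
            # Text in angle brackets becomes a command.
            instructions.append(("RunCommand", line[2:-2].strip()))
        else:
            # Everything else is a line of dialogue, stored in the string
            # table and referred to by a key.
            key = f"{node_name}-{len(strings)}"
            strings[key] = line
            instructions.append(("RunLine", key))
    instructions.append(("Stop", None))
    return instructions, strings


source = r"""
Gregg: They got pancakes now! :)
<<close>>
//angus walks across the screen and off the left//
<<walk Angus AngusOffLeft>>
<<wait 3>>
Angus: fine.
"""

instructions, strings = compile_node("GreggFQ4Intro", source)
for opcode, operand in instructions:
    print(opcode, operand)
```

Keeping the text in a separate table, and referring to it by key from the instructions, is the same split you can see in the listing above.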

Each line of this bytecode is then executed, one after another, by Yarn Spinner, which sends along the lines, options and commands that it encounters. This is part of the standard behaviour of how Yarn Spinner works in any game; where Night in the Woods differs is what it does with its lines.
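
As an illustration of that execution loop, here’s a minimal sketch. It only covers the three opcodes that appear in the listing above (the real system also has to handle options, which this ignores), and the function and callback names are hypothetical rather than Yarn Spinner’s API.

```python
def run_program(instructions, strings, on_line, on_command):
    """Execute instructions in order until Stop, reporting what we find."""
    for opcode, operand in instructions:
        if opcode == "RunLine":
            on_line(strings[operand])  # look the text up in the string table
        elif opcode == "RunCommand":
            on_command(operand)  # the game decides what the text means
        elif opcode == "Stop":
            return
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("RunLine", "GreggFQ4Intro-1"),
    ("RunCommand", "lookUp Mae"),
    ("RunCommand", "lookUp Gregg"),
    ("Stop", None),
]
strings = {"GreggFQ4Intro-1": "Angus: fine."}

events = []
run_program(
    program,
    strings,
    on_line=lambda text: events.append(("line", text)),
    on_command=lambda text: events.append(("command", text)),
)
print(events)
```

The interpreter has no idea what “lookUp Mae” means. It just hands the text over, and everything interesting happens in the callbacks that the game provides.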

Character Names

Night in the Woods shows its dialogue in speech balloons that are attached to the characters. In order to correctly position them, NITW needs to know which character a line should be shown attached to; to figure that out, NITW looks at the start of each line, and figures out if it begins with a name followed by a colon.

If it does, then the game checks to see if that character is in the scene; if they are, the speech balloon for the line is attached to that character.
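
That check might look something like the sketch below. The function name is made up, and what it does when the named character isn’t in the scene (treating the whole line as unattributed text) is our assumption, not a description of the game’s code.

```python
def split_speaker(line, characters_in_scene):
    """Return (speaker, text); speaker is None if no known name is found."""
    name, separator, rest = line.partition(":")
    name = name.strip()
    if separator and name in characters_in_scene:
        return name, rest.strip()
    return None, line


scene = {"Mae", "Gregg", "Angus"}
print(split_speaker("Angus: fine.", scene))       # ('Angus', 'fine.')
print(split_speaker("RIDE THE CHARIOT!", scene))  # (None, 'RIDE THE CHARIOT!')
```

Splitting on the first colon only matters for lines like “Gregg: \o/ D:”, where a later colon is part of an emoticon.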

Emoticons

Emoticons, or “what we used to use before emoji were a thing”, are those little faces that are composed out of plain text – stuff like :) and :(. Night in the Woods uses these to control player expressions, by looking for any emoticons that it recognises. When it encounters one, it triggers a corresponding animation action, such as animating from the neutral expression to a smile.

There are several different kinds of emoticons. In addition to the smileys that control facial expressions, the game also includes several gestures; for example, whenever Mae puts her hands on her hips, it’s because the line included the emoticon “<o>”; when she throws her hands up into the air, it’s triggered by “\o/”.

This is true for other characters as well. When Gregg starts flailing his arms after Mae meets him in the Snack Falcon at the start of the game, it’s the result of using the emoticon sequence “:) \o/”.
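
A simple way to picture this is a lookup table from emoticons to animations. In the sketch below, the animation names are invented, the table is nowhere near complete, and removing the emoticons from the displayed text is our assumption about how you’d want this to behave.

```python
# Hypothetical mapping; the animation names are made up for illustration.
EMOTICON_ACTIONS = {
    ":)": "smile",
    ":(": "frown",
    "<o>": "hands_on_hips",
    r"\o/": "arms_up",
}


def extract_emoticons(text):
    """Return (text_without_emoticons, animations_to_trigger)."""
    words = []
    actions = []
    for token in text.split():
        if token in EMOTICON_ACTIONS:
            actions.append(EMOTICON_ACTIONS[token])
        else:
            words.append(token)
    return " ".join(words), actions


print(extract_emoticons(r"TO DONUT HELL!!! \o/"))
# ('TO DONUT HELL!!!', ['arms_up'])
```

Because the lookup works on whitespace-separated tokens, an emoticon only counts when it stands on its own, so ordinary punctuation inside a word is left alone.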

Markup

Night in the Woods responds to custom syntax within lines that lets a writer mark up the dialog for visual effects. For example, when a character needs to shout, the line contains markup like this: 

Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]

This causes the specific range of letters to be delivered in a shaky style. The markup also supports effects like character size, color, and balloon position, as well as fun effects like making a line of text wave.
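
Here’s a rough sketch of how markup like this could be separated from the text it decorates. It’s not the game’s parser: it assumes that {key=value} applies to the whole line, that [tag=value]…[/tag] applies to a range of letters, and that tags are never nested. The function name and return shape are made up.

```python
import re

ATTRIBUTE = re.compile(r"\{(\w+)=([^}]*)\}")
SPAN = re.compile(r"\[(\w+)=([^\]]*)\](.*?)\[/\1\]")


def parse_markup(text):
    """Return (plain_text, line_attributes, spans).

    Each span is (tag, value, start, end), where start and end index into
    plain_text.
    """
    attributes = {}

    def take_attribute(match):
        attributes[match.group(1)] = match.group(2)
        return ""  # remove the attribute from the text

    text = ATTRIBUTE.sub(take_attribute, text)

    pieces = []
    spans = []
    position = 0  # how far through the marked-up text we are
    length = 0    # how much plain text we have produced so far
    for match in SPAN.finditer(text):
        before = text[position:match.start()]
        inner = match.group(3)
        start = length + len(before)
        spans.append((match.group(1), match.group(2), start, start + len(inner)))
        pieces.extend([before, inner])
        length = start + len(inner)
        position = match.end()
    pieces.append(text[position:])
    return "".join(pieces), attributes, spans


print(parse_markup("{width=8}[shake=.05]AWOO!![/shake]"))
# ('AWOO!!', {'width': '8'}, [('shake', '.05', 0, 6)])
```

Recording each effect as a range over the plain text means the renderer can lay out the letters first, and then apply the shake to just the letters in that range.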

Commands

Finally, Night in the Woods makes extensive use of commands. In Yarn Spinner, if you wrap some text in <<angle brackets>>, it won’t be sent to the game like a line of dialogue. Instead, it’s delivered as a command; the intent is that the game will manually parse its contents, and perform some action in the game.

Every time you see a character do anything besides talk, it’s the result of a command. There are a huge number of possible commands in the game; on top of the usually expected actions like sit, stand, walk, and jump, commands can also be used to control where and how a character is looking, the dilation of their eyes, and whether they’re visible or not. This last command is extremely important, since it’s used to control whether a character is present in a scene on a given day.
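
On the game’s side, handling a command mostly comes down to splitting the text into a name and its arguments, and then looking up what to do. The sketch below shows one way that could work; the dispatcher and the handlers are hypothetical, and the real game’s handlers drive animations rather than appending to a list.

```python
def make_dispatcher(handlers):
    """Build a function that parses command text and calls the right handler."""

    def dispatch(command_text):
        parts = command_text.split()
        if not parts:
            raise ValueError("empty command")
        name, *arguments = parts
        handler = handlers.get(name)
        if handler is None:
            raise KeyError(f"unknown command: {name}")
        return handler(*arguments)

    return dispatch


log = []
dispatch = make_dispatcher({
    "walk": lambda character, target: log.append(("walk", character, target)),
    "wait": lambda seconds: log.append(("wait", float(seconds))),
})

dispatch("walk Angus AngusOffLeft")
dispatch("wait 3")
print(log)  # [('walk', 'Angus', 'AngusOffLeft'), ('wait', 3.0)]
```

Adding a new command is then just a matter of adding another entry to the table, which is part of why it’s easy to end up with a huge number of them.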

What’s interesting about NITW is that just about all of the gameplay logic is driven through Yarn Spinner. The language wasn’t designed to be a general game logic control system but it turns out it’s pretty good at this. That’s cool.

Taking it further

Several of these are very specific to NITW and its systems, while others are things that could probably apply to most games. As we continue to work on Yarn Spinner, we’re looking forward to making the project the best dialogue system it can be. For more information on Yarn Spinner, visit the project’s page on GitHub at https://github.com/thesecretlab/YarnSpinner.

If you’re interested in more information on how we made Night in the Woods better with our open source software, check out Jon’s talk from GDC 2017 🎬.

Follow us on Twitter for more news and updates: @thesecretlab