Installing Unity ML-Agents

⚠️ At OSCON, attending our tutorial? 🔗 Also open the docs!

Want to explore the Unity Machine Learning Agents Toolkit (“ML-Agents”)? Here’s the easiest way to get up and running on Windows or macOS.

Unity ML-Agents is a great way to explore machine learning, whether you’re interested in building AI for games or simulating an environment to solve a broader ML problem.

We’ll be posting a variety of guides and material covering various aspects of Unity’s ML-Agents, but we thought we’d start with an installation guide!


Interested in a quick introduction to Unity and ML-Agents? Check out the video of the talk we delivered at The AI Conference in New York City!


To use ML-Agents, you’ll need to install three things:

  1. Unity
  2. Python and ML-Agents (and associated environment and support)
  3. The ML-Agents Unity project

Unity

Installing Unity is the easiest bit. We recommend downloading and using the official Unity Hub to manage your installs of Unity.

➡️ Download the Unity Hub for Windows or macOS

The Unity Hub allows you to manage multiple installs of different versions of Unity, and lets you select which version of Unity you open and create projects with.

If you don’t want to use the Unity Hub, you can download different versions of Unity for your platform manually:

➡️ Download a specific version of Unity for Windows or macOS

We strongly recommend that you use the Unity Hub, as it’s the easiest way to stick to a specific version of Unity and to manage your installs.

If you like using command line tools, you can also try the U3d tool to download and manage Unity installs from the terminal.

Python and ML-Agents

Our preferred way of installing and managing Python, particularly for machine learning tasks, is to use the Anaconda Environment.

⚠️ Anaconda’s environments don’t quite work like virtualenv, or other Python environment systems that you might be familiar with. They don’t store things in the location you specify; they store things in the system-wide Anaconda directory (e.g. on macOS in “/Users/USER/anaconda3/envs/”). Just remember that once you activate an environment, every command you run in that terminal uses it.

Anaconda bundles a package manager, an environment manager, and a variety of other tools that make using and managing Python environments easier.

➡️ Download the Anaconda installer for Windows or macOS

Once you’ve installed Anaconda, follow these instructions to make an Anaconda Environment to use with Unity ML-Agents.

➡️ First, download 🔗 this yaml file, and execute the following command (pointing to the yaml file you just downloaded):

conda env create -f /path/to/unity_ml.yaml

➡️ Once the new Anaconda Environment (named UnityML) has been created, activate it using the following command in your terminal:

conda activate UnityML

The yaml file we provided specifies all the Python packages, from both Anaconda’s package manager and pip (the Python package manager), that you need to make an environment that will work with ML-Agents.

Doing it manually

You can also do this manually (instead of asking Anaconda to create an environment based on our environment file).

⚠️ You do not need to do this if you created the environment with the yaml file, as above. If you did that, go straight to “Testing the environment”, below.

➡️ Create a new Anaconda Environment named UnityML and running Python 3.6 (the version of Python you need to be running to work with TensorFlow at the moment):

conda create -n UnityML python=3.6

➡️ Activate the Conda environment:

conda activate UnityML

➡️ Install TensorFlow 1.7.1 (the version of TensorFlow you need to be running to work with ML-Agents):

pip install tensorflow==1.7.1

➡️ Once TensorFlow is installed, install the Unity ML-Agents package:

pip install mlagents

Testing the environment

➡️ To check everything is installed properly, run the following command:

mlagents-learn --help

You should see something that looks like the following image. This shows that everything is installed properly.
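If you’d rather check from Python itself, here’s a minimal sketch that reports which packages can’t be found in the active environment. It assumes the import names are tensorflow and mlagents; the helper name missing_packages is ours, not part of ML-Agents.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages installed in the steps above.
    missing = missing_packages(["tensorflow", "mlagents"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the UnityML environment activated; if anything is reported missing, revisit the install steps above.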

The ML-Agents Unity Project

The best way to start exploring ML-Agents is to use their provided Unity project. To get it, you’ll need a copy of the Unity ML-Agents repository.

➡️ Clone the Unity ML-Agents repository to your system (see the note below if you’re coming to our OSCON tutorial!):

git clone https://github.com/Unity-Technologies/ml-agents.git

⚠️ If you’re coming to our OSCON session, please clone this repository instead: https://github.com/thesecretlab/OSCON-2019-Unity-ML-Agents

You should now have a directory called ml-agents. This directory contains the source code for ML-Agents, a whole lot of useful configuration files, as well as starting-point Unity projects for you to use.

➡️ You’re ready to go! If you’re coming to our OSCON tutorial, you’ll need a slightly different project which we’ll help you out with on the day!

We’ll have another article on getting started (now that you’ve got it installed) next week!


In Portland? At OSCON?
Attend our OSCON 2019 session on 15 July 2019 to learn more!

Power-Saving in Unity

Learn how to reduce the power consumption of a non-game Unity application, for mobile.

Recently, we were asked to build some software using Unity. That is, we weren’t asked to build a game, but instead, the client wanted an app.

There are a few reasons why you’d want to use Unity to build non-game software.

  1. Cross-platform support. One of Unity’s biggest selling points is the fact that you can write your code once, and Unity makes it easier to bring that code over to multiple platforms, like iOS and Android, as well as desktop platforms.
  2. Graphics support. Being a game engine, Unity is very good at tasks that involve processing either 2D or 3D graphics. In our case, we were asked to build an app for building comic book pages, and that means working with lots of sprites.
  3. C# coding environment. It’s almost always better to write your code in the native language for your chosen platform, but in cases where that’s not feasible, C# is quite good for most platforms. Unity provides a good, performant, and featureful implementation of the language, as well as the .NET runtime.

However, there are a few things that keep Unity from being a great tool for making non-game apps. In this post, we’ll look at one of them, and how to reduce its impact: power consumption in Unity-based apps.

This post is largely written with mobile in mind, and with a particular focus on iOS. However, the technique is pretty broadly applicable.

Reducing Power Consumption

The most pressing issue is that Unity, like all game engines, re-draws its content every frame. That’s not something that you need to do in an app, where most of the frames are going to be identical to the previous one. Most of that work is going to waste, and that means wasted power. Wasted power is particularly bad on mobile devices, since it means a needless drain on the device’s battery.

This is particularly striking when you see that an empty scene – one with nothing more than a camera, rendering the skybox – consumes significant amounts of CPU resources. On my iPhone X, for example, rendering the empty scene at 30 frames per second consumes about 20% of the CPU.

To reduce this issue, you can reduce the rate at which Unity updates, by reducing the target framerate. This can be done in a single line of code:

// Request that Unity only update every 1/10th of a second
Application.targetFrameRate = 10; 

This will reduce the amount of work that Unity does, but it has a downside: Unity will only run the Update method on your scripts once per frame, which means it will take up to 100 milliseconds for your code to notice that the user pressed a button. Additionally, setting the framerate to a fixed rate like this means that any moving content on your screen will always visibly lag. On top of this, we still haven’t really solved the original problem: the screen is still updating, many times a second, and each time it does, there’s only a small chance that anything that the user sees will have changed.
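The latency figure above is just the frame interval. As a quick worked example (in Python, purely for the arithmetic; frame_interval_ms is a name we made up), here’s the worst-case delay before an Update call can notice input at a few frame rates:

```python
def frame_interval_ms(target_fps):
    """Time between consecutive frames, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


for fps in (60, 30, 10):
    interval = frame_interval_ms(fps)
    print(f"{fps} fps -> up to {interval:.1f} ms before input is noticed")
```

Lowering the frame rate trades responsiveness for power in direct proportion, which is why it only goes so far.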

The solution, then, is to find a way to make Unity never re-draw the screen unless something happens. What that “something” is depends upon the details of your app, but generally always includes things like user input: you want to re-draw the screen when the user taps the screen, or drags their finger over it, because that’s highly likely to mean they want to press a button or drag an object around.

Hacking the Render Loop

Unity provides a way to control the player loop – the set of things that Unity does every frame. This includes re-rendering the scene, but also covers tasks like clearing the render buffers, collecting input, and – most importantly – running the Update methods on scripts. Using the PlayerLoop class, you can inspect the contents of the player loop, remove certain subsystems, and add some of your own as well.

Or, you could blow the whole thing away.

using UnityEngine.Experimental.LowLevel;

// Get the default loop
var playerLoop = PlayerLoop.GetDefaultPlayerLoop();
            
// Remove _everything_ from it!! There are no rules! Unlimited power!!
playerLoop.subSystemList = null;

// Apply this new "player loop" - the game will immediately stop
PlayerLoop.SetPlayerLoop(playerLoop);

If you remove all subsystems from the player loop, you effectively remove almost all of the work that Unity does each frame. There’s still some overhead that can’t be disabled, like the code that actually invokes the player loop, but by doing this, we’re getting rid of almost all of Unity’s work.

Disabling the Renderer

One of the things that emptying the player loop doesn’t directly control is the fact that Unity will attempt to run parts of the render loop as long as a camera is active in the scene.

To work around this, we can just disable the camera. However, if you do this in an Update function, the screen’s display will have already been cleared at the start of the frame. As a workaround to this, we can disable the camera, and then immediately tell the camera to render the frame. Because we won’t be clearing the display at the start of the next frame (there won’t even be a next frame), the frame will remain on screen.

As a result, the amount of CPU usage is dropped significantly. In the following image, I’ve dropped the CPU down to 3%. It’s not zero, but it’s very close; in fact, at this level of usage, the biggest power drain on the device is the screen.

Putting it All Back

So, we’ve now managed to completely stop the player loop, at a tremendous energy saving. But now we have another problem: how do we wake the app back up again when the user interacts with the screen, if all of the code that checks for input is no longer being run?

The solution is to look for input events that come from the system, and use that to wake up the application. To do this, we’ll need the following things:

  1. A way to run native code when touch events occur
  2. A way to run C# code from that native code, which restores the player loop

Everything up until this point has been entirely cross-platform, and should work on all platforms. However, because we’re now looking at native code, we need to focus on the native code implementation details for a single one. In this post, we’ll look at iOS. If you’re an Android developer and want to contribute how you’d do this in Android, let us know!

To detect any touches, there are two places we could put our code: we could override the view controller that Unity places its view in, or we could go one level lower and detect all touches that the app receives. It’s actually simpler to do the latter, so that’s the approach we’ll take.

First, we’ll need to create a new subclass of UIApplication. This is different to the similarly-named UIApplicationDelegate; the UIApplication class is an object that represents the entire application, while the delegate is simply an object that responds to events that happen to the application.

You typically never need to create your own subclass of UIApplication, and Apple doesn’t recommend that you do it, unless it’s for the single specific purpose that we’re about to do here: intercept and process UI events, before forwarding them to the rest of the application.

So, let’s get started. First, we’ll create a new file in our Unity project, called TouchHandler.mm, and add the following code to it:

#import <UIKit/UIKit.h>

@interface TouchDetectingApplication : UIApplication

- (void)sendEvent:(UIEvent *)event;

@end

@implementation TouchDetectingApplication

- (void)sendEvent:(UIEvent *)event {
    
    // Handle touch event here!
    
    [super sendEvent:event];
}

@end

The sendEvent method will be run on every input event. It’s very important that we call the super implementation, since without doing that, no input will ever be delivered to the app. We’ll come back to this method in a bit.

Next, we need a way to notify our C# code that a touch has occurred. To do this, we’ll send a pointer to a C# method into the native code at game start; this method will restore the game loop, and resume rendering.

We’ll do all of this in a MonoBehaviour, which you can attach to an object in the scene. The following code also contains an example of how to stop and resume the camera, too.

using System.Runtime.InteropServices;
using UnityEngine;
using UnityEngine.Experimental.LowLevel;

public class FramerateManager : MonoBehaviour
{
    
    // A singleton instance of this class. Necessary, because the 
    // callback must itself be static; there are other ways to do 
    // this, but this serves fine for this example.
    private static FramerateManager instance;

    // The type of the C# callback method that the native code will 
    // call.
    public delegate void EventDelegate();

    // This method will be called from native code when a touch 
    // input arrives. This method must be static.
    [AOT.MonoPInvokeCallback(typeof(EventDelegate))]
    public static void TouchEventReceivedFromNativeCode()
    {
        // Restore the original player loop.
        instance.LeaveLowPowerMode();
    }

  #if UNITY_IOS && !UNITY_EDITOR
    // This is a native function that we'll call, and pass the 
    // TouchEventReceivedFromNativeCode method to as a parameter.
    [DllImport("__Internal")]
    private static extern void _AttachEventHook(EventDelegate actionDelegate);
  #endif

    // The number of frames remaining before we stop the loop. We 
    // leave a little buffer time after the last touch.
    private const int framesBeforeStopping = 5;
    private int framesRemaining;

    // A cached reference to the main camera. Necessary, because 
    // Camera.main only returns a valid value when there's an 
    // enabled camera in the scene.
    private Camera mainCamera;    

    void Awake()
    {
        // On game start, call the native method, and pass it the 
        // method we want it to call when touches occur. The native
        // code will keep a reference to this method as a function
        // pointer, and call it when it needs to.
#if UNITY_IOS && !UNITY_EDITOR
        _AttachEventHook(TouchEventReceivedFromNativeCode);
#endif

        // Set up our singleton instance.
        instance = this;

        framesRemaining = framesBeforeStopping;
    }

    private void Update()
    {
        // Count down until we're out of time.
        framesRemaining -= 1;

        // Time to stop.
        if (framesRemaining == 0)
        {
            EnterLowPowerMode();
        }
    }

    private void EnterLowPowerMode()
    {
        // Remove everything from the player loop
        var playerLoop = PlayerLoop.GetDefaultPlayerLoop();
            
        playerLoop.subSystemList = null;

        PlayerLoop.SetPlayerLoop(playerLoop);
    
        // Cache a reference to the camera
        mainCamera = Camera.main;

        if (mainCamera != null)
        {
            // Disable it!
            mainCamera.enabled = false;

            // We just disabled the camera, but if we called this 
            // in an Update function, it cleared the frame buffer 
            // before this frame started. To prevent being left 
            // with a blank screen, we manually re-render the 
            // camera right now; this image will remain on screen 
            // until normal rendering resumes.

            mainCamera.Render();

        }

      // We're done! The game will stop after this frame, and will 
      // wake back up when TouchEventReceivedFromNativeCode is 
      // called.

    }

    private void LeaveLowPowerMode()
    {
        // Restore the number of remaining frames before we stop 
        // again
        framesRemaining = framesBeforeStopping;
        
        // Re-enable the camera so we resume interactive framerates
        if (mainCamera != null)
        {
            mainCamera.enabled = true;
        }
        
        // Restore the default player loop; the game will resume.
        PlayerLoop.SetPlayerLoop(PlayerLoop.GetDefaultPlayerLoop());
    }
}
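
The countdown-and-wake logic in this script can be hard to picture without running it on a device. Here’s a toy model of it in plain Python. It isn’t Unity code: the class and method names are invented for illustration, and tick simply stands in for one pass of the player loop.

```python
class FramerateModel:
    """Toy model of the countdown-and-wake logic; not Unity code."""

    FRAMES_BEFORE_STOPPING = 5

    def __init__(self):
        self.frames_remaining = self.FRAMES_BEFORE_STOPPING
        self.low_power = False
        self.frames_rendered = 0

    def tick(self):
        """Stand-in for one pass of the player loop."""
        if self.low_power:
            return  # the loop is empty, so no work happens
        self.frames_rendered += 1
        self.frames_remaining -= 1
        if self.frames_remaining == 0:
            self.low_power = True

    def on_touch(self):
        """Stand-in for the callback invoked from native code."""
        self.frames_remaining = self.FRAMES_BEFORE_STOPPING
        self.low_power = False


model = FramerateModel()
for _ in range(20):
    model.tick()
print(model.frames_rendered)  # 5

model.on_touch()
for _ in range(20):
    model.tick()
print(model.frames_rendered)  # 10
```

Each touch buys another five frames of work, after which the model goes quiet again until the next touch.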

We now need a way to receive a pointer to the C# method TouchEventReceivedFromNativeCode. This is done in the _AttachEventHook function, which is defined in native code and called from C#.

Add this to your .mm file:

// Declare the C++ type of the function pointer we'll receive from 
// C#.
typedef void (*EventDelegate)();

// Create a variable to store that function pointer.
static EventDelegate _eventDelegate = NULL;

// Declare that this function is a C function, and its name should not be mangled by the C++ compiler.

extern "C" {
  void _AttachEventHook(EventDelegate actionDelegate);    
}

// Finally, write the method that receives and stores the event 
// handling function pointer.
void _AttachEventHook(EventDelegate actionDelegate) {

    // Just keep it around.
    _eventDelegate = actionDelegate;
    
    // Log that it happened, too.
    NSLog(@"Event logging hook attached.");
}

We’re almost done. We now need to call the _eventDelegate function whenever a touch event lands.

Replace the sendEvent method in the TouchDetectingApplication class with this:

- (void)sendEvent:(UIEvent *)event {
    
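    // Notify the C# side, if a callback has been attached. Events can
    // arrive before _AttachEventHook has been called.
    if (_eventDelegate != NULL) {
        _eventDelegate();
    }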
    
    [super sendEvent:event];
}

Finally, we need to tell the system to use the TouchDetectingApplication class, instead of the default UIApplication class.

Important note: While Unity will automatically copy any .mm and .h files into your project when you build the player, it will not do this step for you. Additionally, when you choose to do a build that Replaces the player (as opposed to Appending it), it will blow this change away! Fortunately, it’s a single code change, but you do need to remember to do it.

Open the main.mm file, and find the following line of code:

UIApplicationMain(argc, argv, nil, [NSString stringWithUTF8String: AppControllerClassName]);

Replace it with this:

UIApplicationMain(argc, argv, @"TouchDetectingApplication", [NSString stringWithUTF8String: AppControllerClassName]);

The application will now use that class for its UIApplication, and it will send wake-up prompts when touch events occur!

Wrapping up

This technique is extremely useful for building apps that don’t need to re-draw the screen all the time. If you use it, let us know!

📚 Unity Game Development Cookbook (1st Edition)

Our latest book covers everything you need to know about building games with Unity.

The book is available online, and in good bookstores. It was originally released in April 2019, and we consider it to still be current.

We really hope that you enjoy it! Please contact us if you have any questions or need a little help. We’ll try our best to get back to you.

It might take a few days sometimes, but if you get stuck get in touch. We also try and keep the code that’s available up to date.

Buy the book

Download code

You can download the resources for the Unity Game Development Cookbook (1st Edition) from GitHub:

How Night in the Woods Uses Yarn Spinner

We recently announced that we’re building Night in the Woods for mobile! We’re super excited about this, so we thought that we’d share a bunch of technical behind-the-scenes stuff on our blog over the coming weeks and months. This is the first of those posts! 

Yarn Spinner is the dialogue engine that we wrote, and was used in Night in the Woods. It’s open source, freely available, and we think it’s pretty powerful.

One of the reasons why we think Yarn Spinner is powerful is that it’s designed to focus on as little as possible. There are only three things that Yarn Spinner can do with your game: it can send lines of dialogue, ask for a selection among a group of options, and send a command for your game to interpret.

The idea behind this is that your game can add what it needs to on top, rather than being forced to fit inside the ideas that we had when we first wrote the system. There are several excellent dialogue systems that are designed to be very good at operating within the structure of a role-playing game, or a choose-your-own-adventure system (Twine is a great example of this last one), but for Yarn Spinner, we wanted the system to be more generalised, and able to be applied to a wide variety of games systems.

The consequence of doing that, however, is that a game needs to do more work to add the features that it needs. While we built Yarn Spinner with NITW in mind, there are several features that are quite specific to the game. 

In this post, we’ll highlight some of the cooler things that Alec Holowka, the lead developer of Night in the Woods, built on top of the Yarn Spinner system to support its needs.

Example Dialog

Here’s an example of the kind of dialogue that exists in Night in the Woods. Here’s Mae and Gregg, planning on going to Donut Wolf:

Gregg: They got pancakes now! :)
<<close>>
//angus walks across the screen and off the left//
<<walk Angus AngusOffLeft>>
<<wait 3>>
Angus: fine.
<<lookUp Mae>>
<<lookUp Gregg>>
Gregg: \o/ D:
Gregg: RIDE THE CHARIOT!
<<dilate Mae .85 .5>>
Mae: TO DONUT HELL!!! \o/
<<runNextLinesTogether 2>>
Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]
Gregg: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]

This dialogue is part of the raw source code of Night in the Woods. When the scene loads, the dialogue attached to the characters is parsed; first, it’s parsed into a data structure called a parse tree, which is then converted into a simple binary representation. This process is quite similar to how other code gets compiled into a binary that can be run on a machine.

At the end of this process, the resulting bytecode for the above dialogue snippet is this:

Node GreggFQ4Intro:
     0   L0:
             RunLine         GreggFQ4Intro-0           
             RunCommand      close                           
             RunCommand      walk Angus AngusOffLeft                      
             RunCommand      wait 3                          
     5       RunLine         GreggFQ4Intro-1          
             RunCommand      lookUp Mae                      
             RunCommand      lookUp Gregg                      
             RunLine         GreggFQ4Intro-2         
             RunLine         GreggFQ4Intro-3        
    10       RunCommand      dilate Mae .85 .5                      
             RunLine         GreggFQ4Intro-4       
             RunCommand      runNextLinesTogether 2                      
             RunLine         GreggFQ4Intro-5      
             RunLine         GreggFQ4Intro-6     
    15       RunCommand      close                           
             RunCommand      irisOut 1 wait                      
             RunCommand      sectionTitle GreggFQ4Intro BeaCar                      
    18       Stop                                            

String table:
GreggFQ4Intro-0: 
    Gregg: They got pancakes now! :) (GreggFQ4Intro:1)
GreggFQ4Intro-1: 
    Angus: fine. (GreggFQ4Intro:6)
GreggFQ4Intro-2: 
    Gregg: \o/ D: (GreggFQ4Intro:9)
GreggFQ4Intro-3: 
    Gregg: RIDE THE CHARIOT! (GreggFQ4Intro:10)
GreggFQ4Intro-4: 
    Mae: TO DONUT HELL!!! \o/ (GreggFQ4Intro:12)
GreggFQ4Intro-5: 
    Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOOOOOOO!![/shake] (GreggFQ4Intro:14)
GreggFQ4Intro-6: 
    Gregg: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOOOOOOO!![/shake] (GreggFQ4Intro:15)

Each line of this bytecode is then executed, one after another, by Yarn Spinner, which sends along the lines, options and commands that it encounters. This is part of the standard behaviour of how Yarn Spinner works in any game; where Night in the Woods differs is what it does with its lines.
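
To make that execution step concrete, here’s a heavily simplified sketch in Python. It only handles the three instructions that appear in the listing above (RunLine, RunCommand and Stop), and the function and callback names are our own invention; the real Yarn Spinner virtual machine does a good deal more, such as handling options.

```python
def run_program(instructions, string_table, on_line, on_command):
    """Execute instructions in order until a Stop is reached."""
    for opcode, operand in instructions:
        if opcode == "RunLine":
            # Lines are stored by ID; look the text up in the string table.
            on_line(string_table[operand])
        elif opcode == "RunCommand":
            on_command(operand)
        elif opcode == "Stop":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# A short excerpt of the program shown above.
program = [
    ("RunLine", "GreggFQ4Intro-1"),
    ("RunCommand", "lookUp Mae"),
    ("RunCommand", "lookUp Gregg"),
    ("RunLine", "GreggFQ4Intro-2"),
    ("Stop", None),
]
strings = {
    "GreggFQ4Intro-1": "Angus: fine.",
    "GreggFQ4Intro-2": "Gregg: \\o/ D:",
}

events = []
run_program(
    program,
    strings,
    on_line=lambda text: events.append(("line", text)),
    on_command=lambda text: events.append(("command", text)),
)
for event in events:
    print(event)
```

The interpreter knows nothing about balloons or animation; everything interesting happens in the two callbacks, which is where the game plugs in.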

Character Names

Night in the Woods shows its dialogue in speech balloons that are attached to the characters. In order to correctly position them, NITW needs to know which character a line should be shown attached to; to figure that out, NITW looks at the start of each line, and figures out if it begins with a name followed by a colon.

If it does, then the game checks to see if that character is in the scene; if they are, the speech balloon for the line is attached to that character.
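
As a rough sketch of that check (in Python, with function names we’ve made up; the game’s actual code isn’t shown here), splitting the speaker off a line and deciding where the balloon goes might look like this:

```python
def split_speaker(line):
    """Return (speaker, text); speaker is None if there's no name prefix."""
    name, separator, rest = line.partition(":")
    # Only treat the prefix as a name if it is a single alphanumeric word.
    if separator and name.isalnum():
        return name, rest.strip()
    return None, line


def balloon_target(line, characters_in_scene):
    """Return the character to attach the balloon to, or None."""
    speaker, _ = split_speaker(line)
    return speaker if speaker in characters_in_scene else None


print(split_speaker("Angus: fine."))  # ('Angus', 'fine.')
print(balloon_target("Angus: fine.", {"Mae", "Gregg"}))  # None
```

The single-word rule is an assumption that keeps a stray colon later in a sentence from being mistaken for a name.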

Emoticons

Emoticons, or “what we used to use before emoji were a thing”, are those little faces that are composed out of plain text – stuff like :) and :(. Night in the Woods uses these to control character expressions, by looking for any emoticons that it recognises. When it encounters one, it triggers a corresponding animation action, such as animating from the neutral expression to a smile.

There are several different kinds of emoticons. In addition to the smileys that control facial expressions, the game also includes several gestures; for example, whenever Mae puts her hands on her hips, it’s because the line included the emoticon “<o>”; when she throws her hands up into the air, it’s triggered by “\o/”.

This is true for other characters as well. When Gregg starts flailing his arms after Mae meets him in the Snack Falcon at the start of the game, it’s the result of using the emoticon sequence “:) \o/”.
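
Here’s a minimal sketch of how that kind of lookup could work, written in Python. The emoticons come from the examples above, but the action names and the function are hypothetical; they’re not the identifiers the game uses.

```python
# Emoticon -> action name. The action names are invented for this sketch.
EMOTICON_ACTIONS = {
    ":)": "smile",
    "<o>": "hands_on_hips",
    "\\o/": "arms_up",
}


def find_emoticon_actions(text):
    """Return the actions for recognised emoticons, in order of appearance."""
    hits = []
    for emoticon, action in EMOTICON_ACTIONS.items():
        start = text.find(emoticon)
        while start != -1:
            hits.append((start, action))
            start = text.find(emoticon, start + 1)
    # Sort by position so the actions play in the order they were written.
    return [action for _, action in sorted(hits)]


print(find_emoticon_actions("Gregg: :) \\o/"))  # ['smile', 'arms_up']
```

Keeping the table as plain data means a writer can ask for a new gesture without anyone touching the dialogue system itself.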

Markup

Night in the Woods responds to custom syntax within lines that lets a writer mark up the dialog for visual effects. For example, when a character needs to shout, the line contains markup like this:

Mae: {width=8}[shake=.05]AWOOOOOOOOOOOOOOOOO!![/shake]

This causes the specific range of letters to be delivered in a shaky style. The markup also supports effects like character size, color, and balloon position, as well as fun effects like making a line of text wave.
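
To give a feel for what parsing this involves, here’s a small Python sketch based only on the example line above. It assumes that {name=value} sets an attribute for the whole line and that [name=value]...[/name] wraps a range of text; both readings, and all of the names in the code, are our guesses for illustration rather than a description of the game’s parser.

```python
import re

ATTRIBUTE = re.compile(r"\{(\w+)=([^}]*)\}")
SPAN = re.compile(r"\[(\w+)=([^\]]*)\](.*?)\[/\1\]")


def parse_markup(text):
    """Return (plain_text, line_attributes, spans).

    Each span is (name, value, start, length), measured in the plain text.
    """
    attributes = dict(ATTRIBUTE.findall(text))
    text = ATTRIBUTE.sub("", text)

    spans = []
    pieces = []
    position = 0      # index into the marked-up text
    plain_length = 0  # length of the plain text built so far
    for match in SPAN.finditer(text):
        before = text[position:match.start()]
        pieces.append(before)
        plain_length += len(before)
        inner = match.group(3)
        spans.append((match.group(1), match.group(2), plain_length, len(inner)))
        pieces.append(inner)
        plain_length += len(inner)
        position = match.end()
    pieces.append(text[position:])
    return "".join(pieces), attributes, spans


print(parse_markup("{width=8}[shake=.05]AWOO!![/shake]"))
```

Recording each span as a start and a length against the plain text lets the renderer draw the letters normally and then apply effects to ranges.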

Commands

Finally, Night in the Woods makes extensive use of commands. In Yarn Spinner, if you wrap some text in <<angle brackets>>, it won’t be sent to the game like a line of dialogue. Instead, it’s delivered as a command; the intent is that the game will manually parse its contents, and perform some action in the game.

Every time you see a character do anything besides talk, it’s the result of a command. There are a huge number of possible commands in the game; on top of the usually expected actions like sit, stand, walk, and jump, commands can also be used to control where and how a character is looking, the dilation of their eyes, and whether they’re visible or not. This last command is extremely important, since it’s used to control whether a character is present in a scene on a given day.
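
Since Yarn Spinner hands the game the raw command text, the game has to split it up and route it somewhere. Here’s a minimal Python sketch of that step, using the walk and wait commands from the example dialogue. The handler bodies and function names are invented for illustration, and the real game will do far more than append to a log.

```python
def parse_command(text):
    """Split command text into a name and a list of string arguments."""
    parts = text.split()
    if not parts:
        raise ValueError("empty command")
    return parts[0], parts[1:]


def make_dispatcher(handlers):
    """Build a function that routes command text to the matching handler."""
    def dispatch(text):
        name, arguments = parse_command(text)
        handler = handlers.get(name)
        if handler is None:
            raise KeyError(f"no handler for command: {name}")
        return handler(*arguments)
    return dispatch


log = []
dispatch = make_dispatcher({
    "walk": lambda character, destination: log.append(
        f"{character} walks to {destination}"),
    "wait": lambda seconds: log.append(f"wait {float(seconds)} seconds"),
})

dispatch("walk Angus AngusOffLeft")
dispatch("wait 3")
print(log)
```

Arguments arrive as strings, so each handler decides how to interpret its own; here, wait converts its argument to a number.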

What’s interesting about NITW is that just about all of the gameplay logic is driven through Yarn Spinner. The language wasn’t designed to be a general game logic control system but it turns out it’s pretty good at this. That’s cool.

Taking it further

Several of these features are very specific to NITW and its systems, while others are things that could probably apply to most games. As we continue to work on Yarn Spinner, we’re looking forward to making the project the best dialogue system it can be. For more information on Yarn Spinner, visit the project’s page on GitHub at https://github.com/thesecretlab/YarnSpinner.

If you’re interested in more information on how we made Night in the Woods better with our open source software, check out Jon’s talk from GDC 2017 🎬.

Follow us on Twitter for more news and updates: @thesecretlab