It’s a bit tricky to see where to begin with OpenAI’s (in)famous GPT-2 model. This blog post is our first in a small series about NLP. We hope it helps!
Getting Python
Our preferred way of installing and managing Python, particularly for machine learning tasks, is to use Anaconda environments.
⚠️ Anaconda’s environments don’t quite work like virtualenv, or other Python environment systems that you might be familiar with. They don’t store things in the location you specify, they store things in the system-wide Anaconda directory (e.g. on macOS in “/Users/USER/anaconda3/envs/”). Just remember that once you activate them, all commands are inside the environment.
Anaconda bundles a package manager, an environment manager, and a variety of other tools that make using and managing Python environments easier.
Once you’ve installed Anaconda, follow these instructions to make an Anaconda Environment to use with GPT-2.
➡️ Create a new Anaconda Environment named GPT2 and running Python 3.x (the version of Python you need to be running to work with GPT-2 at the moment):
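A minimal sketch of the commands (we use Python 3.7 as an example; check the GPT-2 repository’s README for the exact 3.x version it currently supports):

```shell
conda create -n GPT2 python=3.7
conda activate GPT2
```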
➡️ While inside the activated Conda environment you’re using, install the requirements specified inside the GPT-2 repository:
pip install --upgrade -r requirements.txt
➡️ Use the model downloading script that comes with the GPT-2 repository to download a pre-trained model:
python3 download_model.py 774M
⚠️ You can specify which of the various GPT-2 models you want to download. Above, we’re requesting a download of the 774 million parameter model, which is about 3.1 gigabytes. You can also request the smaller 124M or 355M models, or the much larger 1.5B model.
You might need to wait a little while as the script downloads the model. You’ll see something like this:
➡️ You’ll then need to open the file src/interactive_conditional_samples.py using your favourite programming editor, and update the model_name to the one that you downloaded:
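In that file, the model name appears as a default parameter (something like model_name='124M'; the exact line may differ between repository versions), and you change it to match the model you downloaded. Then, from the root of the repository, run the script:

```shell
python3 src/interactive_conditional_samples.py
```

Because the script uses python-fire, you may instead be able to pass the model as a flag (python3 src/interactive_conditional_samples.py --model_name 774M) without editing the file; check the repository’s README.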
This will fire up the model, allowing you to enter some text. You’ll eventually (likely after some warnings that you can ignore about your CPU and the version of TensorFlow) see something like this:
Model prompt >>>
Enter some text and press return (we recommend only a sentence or two), then wait a bit and see what the model generates in response. This could take a while, depending on your computer’s capabilities.
We’ll be back with a follow-up article, exploring how you can actually use GPT-2 for something useful. Stay tuned!
For more AI content, check out our latest book, Practical Artificial Intelligence with Swift! It covers using Swift for AI in iOS applications, using Apple’s CreateML, CoreML, and Turi Create. If you like filling your brain with words, why not fill them with ours?
This post serves as both a follow-up to that session (which was recorded, and will be posted soon — we’ll update this post when that happens) and a standalone guide and tutorial to get started with Swift for TensorFlow.
We’ll be posting follow-up tutorials, which will get more advanced, over the coming weeks. (In the mean time, check out our new book on Practical Artificial Intelligence with Swift!)
Getting Swift for TensorFlow
There are two ways to get Swift for TensorFlow that we’d recommend right now. The first is to use Google’s Colaboratory (Colab), an online data science and experimentation platform, which you use via your browser in a Jupyter Notebook-like environment.
The second is to install it locally, using Docker.
If you use Windows, we recommend using Google Colab; if you use Linux or macOS, we recommend installing locally using the Docker image (it’s much easier than Docker’s reputation might suggest!).
Installing Swift for TensorFlow with Docker
➡️ First, make a folder on your local system in which to store your Swift Jupyter notebooks. For example, mine is located at /Users/parisba/S4TF/notebooks. You don’t need to put anything in there, just make sure you’ve created it.
We’re not going to explain this process much, because once it’s done you don’t need to think about Docker or any of this process again. If you want to learn how Docker works, there are plenty of sources online.
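The docker run command below assumes you have a local Docker image named swift-jupyter. If you don’t have one yet, building it from Google’s swift-jupyter repository typically looks like this (the repository URL and Dockerfile location are our assumptions; check the project’s README if they’ve moved):

```shell
git clone https://github.com/google/swift-jupyter.git
cd swift-jupyter
docker build -f docker/Dockerfile -t swift-jupyter .
```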
➡️ Then, to launch the Docker container and Jupyter notebooks, execute the following command:
docker run -p 8888:8888 --cap-add SYS_PTRACE -v /path/to/books:/notebooks swift-jupyter
⚠️ Note that you will need to replace the /path/to/books in the above with the path to the folder on your local system that you created earlier.
➡️ Open the URL that is displayed in your terminal, similar to the following:
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://0.0.0.0:8888/?token=6693795258c11e5f22280811ddebd714267e1e662d66068e
➡️ You should see something that looks like the following screenshot:
➡️ You’re ready to go!
Using Google Colaboratory
You don’t need to do much to use Google Colaboratory!
In this example, we assemble a multilayer perceptron network that can perform XOR.
It’s not very useful, but it showcases how you build up a model using layers, and how to execute training with that model. XOR was one of the first stumbling blocks of early work with artificial neural networks, which makes it a great example for the power of modern machine learning frameworks.
It’s simple enough that you know whether it’s correct… which is why we’re doing it!
➡️ Create a new notebook, and import the TensorFlow framework:
import TensorFlow
To represent our XOR neural network model, we need to create a struct adhering to the Layer protocol (which is part of Swift for TensorFlow’s API). Ours is called XORModel.
Inside the model, we want three layers:
an input layer, to take the input
a hidden layer
an output layer, to provide the output
All three layers should be a Dense layer (a densely-connected layer) that takes an inputSize and an outputSize.
The inputSize specifies how many values the layer takes as input; likewise, outputSize specifies how many values it produces as output.
Each layer will also have an activation function, which determines the output of each node in the layer. There are many available activation functions, but ReLU and sigmoid are common.
For our three layers, we’ll use sigmoid.
We’ll also need to provide a definition of our @differentiable function, callAsFunction(). In this case, we want it to return the input sequenced through (passed through) the three layers.
Helpfully, the Differentiable protocol that comes with Swift for TensorFlow has a method, sequenced(), that makes this trivial.
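Putting those pieces together, a sketch of the model might look like the following (the hidden layer size of 2 and the SGD learning rate of 0.1 are our choices for illustration, not prescribed values):

```swift
import TensorFlow

struct XORModel: Layer {
    // input layer: takes the two XOR operands
    var inputLayer = Dense<Float>(inputSize: 2, outputSize: 2, activation: sigmoid)
    // hidden layer
    var hiddenLayer = Dense<Float>(inputSize: 2, outputSize: 2, activation: sigmoid)
    // output layer: produces a single value, the XOR result
    var outputLayer = Dense<Float>(inputSize: 2, outputSize: 1, activation: sigmoid)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        // pass the input through all three layers, in order
        return input.sequenced(through: inputLayer, hiddenLayer, outputLayer)
    }
}

var model = XORModel()
// SGD optimiser; the learning rate here is an assumption
let optimiser = SGD(for: model, learningRate: 0.1)

// the four possible XOR inputs
let trainingData: Tensor<Float> = [[0, 0], [0, 1], [1, 0], [1, 1]]
```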
➡️ And we need to label the training data so that we know the correct outputs:
let trainingLabels: Tensor<Float> = [[0], [1], [1], [0]]
➡️ To train, we’ll need a hyperparameter for epochs:
let epochs = 100_000
Then we need a training loop. We train the model by iterating through our epochs, and each time update the gradient (the 𝛁 symbol, nabla, is often used to represent gradient). Our gradient is of type TangentVector, and represents a differentiable value’s derivatives.
Each epoch, we compute the predicted values from our training data, compare them against the expected training labels, and calculate the loss using meanSquaredError().
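For reference, the mean squared error over our $n = 4$ training examples is:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

where $\hat{y}_i$ is the model’s prediction for example $i$ and $y_i$ is its training label.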
Every so often we also want to print out the epoch we’re in, and the current loss, so we can watch the training. We also need to return loss.
Finally, we need to use our optimizer to update the differentiable variables, along the gradient.
➡️ To do this, add the following code:
for epoch in 0..<epochs {
    let 𝛁model = model.gradient { model -> Tensor<Float> in
        let ŷ = model(trainingData)
        let loss = meanSquaredError(predicted: ŷ, expected: trainingLabels)
        if epoch % 5000 == 0 {
            print("epoch: \(epoch) loss: \(loss)")
        }
        return loss
    }
    optimiser.update(&model, along: 𝛁model)
}
➡️ Run the notebook! You should see something resembling the following output:
➡️ Congratulations! You just trained a machine learning model that can, badly, perform XOR.
We’ll be posting more Swift for TensorFlow material in the coming weeks! 🚀
For more Swift AI content, check out our latest book, Practical Artificial Intelligence with Swift! It covers using Swift for AI in iOS applications, using Apple’s CreateML, CoreML, and Turi Create. If you like filling your brain with words, why not fill them with ours?
If you want to learn a little more about Swift for TensorFlow, we recommend this session from TensorFlow World as a great starting point:
Unity ML-Agents is a great way to explore machine learning, whether you’re interested in building AI for games or simulating an environment to solve a broader ML problem.
We’ll be posting a variety of guides and material covering various aspects of Unity’s ML-Agents, but we thought we’d start with an installation guide!
To use ML-Agents, you’ll need to install three things:
Unity
Python and ML-Agents (and associated environment and support)
The ML-Agents Unity project
Unity
Installing Unity is the easiest bit. We recommend downloading and using the official Unity Hub to manage your installs of Unity.
The Unity Hub allows you to manage multiple installs of different versions of Unity, and lets you select which version of Unity you open and create projects with.
⚠️ We recommend installing Unity 2019.2.4f1 for the tutorial at O’Reilly AI Conference. If you install a different version, we might not be able to help you.
If you don’t want to use the Unity Hub, you can download different versions of Unity for your platform manually:
We strongly recommend that you use the Unity Hub, as it’s the easiest way to stick to a specific version of Unity and manage your installs. It really makes things easier.
If you like using command line tools, you can also try the U3d tool to download and manage Unity installs from the terminal.
Python and ML-Agents
Our preferred way of installing and managing Python, particularly for machine learning tasks, is to use Anaconda environments.
⚠️ Anaconda’s environments don’t quite work like virtualenv, or other Python environment systems that you might be familiar with. They don’t store things in the location you specify, they store things in the system-wide Anaconda directory (e.g. on macOS in “/Users/USER/anaconda3/envs/”). Just remember that once you activate them, all commands are inside the environment.
Anaconda bundles a package manager, an environment manager, and a variety of other tools that make using and managing Python environments easier.
Once you’ve installed Anaconda, follow these instructions to make an Anaconda Environment to use with Unity ML-Agents.
➡️ First, download 🔗 this yaml file, and execute the following command (pointing to the yaml file you just downloaded):
conda env create -f /path/to/unity_ml.yaml
➡️ Once the new Anaconda Environment (named UnityML) has been created, activate it using the following command in your terminal:
conda activate UnityML
The yaml file we provided specifies all the Python packages, from both Anaconda’s package manager and pip (the Python package manager), that you need to make an environment that will work with ML-Agents.
Doing it manually
You can also do this manually (instead of asking Anaconda to create an environment based on our environment file).
⚠️ You do not need to do this if you created the environment with the yaml file, as above. If you did that, go straight to “Testing the environment”, below.
➡️ Create a new Anaconda Environment named UnityML and running Python 3.6 (the version of Python you need to be running to work with TensorFlow at the moment):
conda create -n UnityML python=3.6
➡️ Activate the Conda environment:
conda activate UnityML
➡️ Install TensorFlow 1.7.1 (the version of TensorFlow you need to be running to work with ML-Agents):
pip install tensorflow==1.7.1
➡️ Once TensorFlow is installed, install the Unity ML-Agents package:
pip install mlagents
Testing the environment
➡️ To check everything is installed properly, run the following command:
mlagents-learn --help
You should see something that looks like the following image. This shows that everything is installed properly.
If you’re coming to our conference tutorial, you’re now ready to go.
The ML-Agents Unity Project
The best way to start exploring ML-Agents is to use their provided Unity project. To get it, you’ll need a copy of the Unity ML-Agents repository.
⚠️ You do not need to do this bit if you’re coming to our tutorial at the O’Reilly AI Conference. We will provide a project on the day.
➡️ Clone the Unity ML-Agents repository to your system:
You should now have a directory called ml-agents. This directory contains the source code for ML-Agents, a whole lot of useful configuration files, and starting-point Unity projects for you to use.
➡️ You’re ready to go! If you’re coming to our tutorial, you’ll need a slightly different project which we’ll help you out with on the day!
We’ll have another article on getting started (now that you’ve got it installed) next week!