It’s been some time since Google DeepMind astounded the world with its computer program AlphaGo, which soundly defeated champion human Go players. AlphaGo, which was trained on data from high-level human-played games, was followed by AlphaGoZero, which became the world’s strongest Go player in 40 days and 40 nights (hmmm….) by learning the game entirely through self-play, without any human knowledge or prior input. The principles of AlphaGoZero were generalized in another engine, AlphaZero, which was applied to the games of chess and shogi.
Since then, a number of open source implementations of some or all of the techniques used in AlphaGoZero have been released to the public. These have been developed for a variety of reasons: to reproduce the original concept, for educational purposes, or to demonstrate some improvement or simplification of the original algorithm.
For AI scientists, experimenters, and gaming enthusiasts, these releases make it possible, to varying degrees, to evaluate or extend the core capabilities of AlphaGoZero-like software.
I’ll list the most prominent releases and briefly describe the capabilities offered. If there are others, please let me know and I will include them in an update to this page.
Build Your own AlphaZero
First off is the DIY effort from David Foster, one of the founders of Applied Data Science Partners. His Medium article describes how AlphaZero works (including an infographic) and walks through his code implementation, which is available on GitHub. His implementation learns and plays Connect-4 rather than Go. The code is written in Python 3 with TensorFlow and Keras and is presented in a Jupyter notebook, making it beginner friendly (not easy, mind you, just friendly). The code that implements Connect-4 is in a file of its own, making it possible to implement other games in its stead.
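To make the "game in a file of its own" idea concrete, here is a minimal sketch of what such a separation can look like. The `Game` interface, its method names, and the `random_playout` helper are hypothetical and chosen for illustration; they are not taken from the Applied Data Science repository. Tic-tac-toe stands in for Connect-4 to keep the example short, and random play stands in for the self-play and search loop.

```python
import random
from abc import ABC, abstractmethod


class Game(ABC):
    """Hypothetical game interface; method names are illustrative only."""

    @abstractmethod
    def legal_moves(self):
        """Return the moves available to the player whose turn it is."""

    @abstractmethod
    def play(self, move):
        """Return a new game state with `move` applied."""

    @abstractmethod
    def winner(self):
        """Return 1 or -1 for a win, 0 for a draw, None if unfinished."""


class TicTacToe(Game):
    """A small stand-in game; swapping this class swaps the game."""

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

    def __init__(self, board=None, player=1):
        self.board = board if board is not None else [0] * 9
        self.player = player  # 1 moves first, -1 moves second

    def legal_moves(self):
        return [i for i, cell in enumerate(self.board) if cell == 0]

    def play(self, move):
        board = list(self.board)  # copy so the old state is untouched
        board[move] = self.player
        return TicTacToe(board, -self.player)

    def winner(self):
        for a, b, c in self.LINES:
            if self.board[a] != 0 and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None if 0 in self.board else 0


def random_playout(game, rng):
    """Play random legal moves to the end of the game.

    This only touches the Game interface, so it works for any game
    that implements it.
    """
    while game.winner() is None:
        game = game.play(rng.choice(game.legal_moves()))
    return game.winner()


if __name__ == "__main__":
    print(random_playout(TicTacToe(), random.Random(0)))
```

Because the playout code only calls `legal_moves`, `play`, and `winner`, a different game can be dropped in by writing one new class. A real AlphaZero-style engine would also need the game to expose a board encoding for the neural network, which this sketch leaves out.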
LeelaZero is a community effort in which the work is distributed: users download the compiled software and a set of neural network weights, run it to play games against itself, and return the resulting self-play data to a central server, where new weights are trained. The source code, in C++, is freely available, so you could conceivably start from zero and train your own version, though that is so resource intensive as to be infeasible.
LeelaZero has been modified to play chess (called LeelaChessZero) by one of the developers of Stockfish, one of the strongest chess-playing programs available. Modifying LeelaZero to play any other game would require a deep dive into the C++ code on GitHub. There is little to no documentation on the code, so this requires expertise in both C++ and the AlphaZero theory of operation. There is a Discord server and a forum, so there may be a modicum of help from those sources. Training the engine to play a new game from scratch could be very resource intensive: the authors estimate that training LeelaZero to play Go at the level of AlphaGoZero would take 1700 years on commodity hardware.
Next up is KataGo, which “was trained using an AlphaZero-like process with many enhancements and improvements”. KataGo was designed to be an AlphaGoZero-like program with improvements that reduce training time and hardware requirements compared to AlphaGoZero. These improvements are described in an academic paper. KataGo is mostly C++ code and it plays Go using the Go Text Protocol (GTP), so adapting it to play other games will be tricky.
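To give a feel for why a GTP-based engine is tied to Go, here is a small sketch of the text format the protocol uses. The helper names (`to_vertex`, `from_vertex`, `format_command`, `parse_response`) are my own and do not come from KataGo; the sketch only covers the basic command and response shapes, assuming the standard GTP conventions: board columns are lettered with the letter I skipped, a successful response starts with `=`, and a failed one starts with `?`.

```python
import re

# GTP column letters for boards up to 19x19; the letter I is skipped.
GTP_COLUMNS = "ABCDEFGHJKLMNOPQRST"


def to_vertex(col, row):
    """Convert zero-based (col, row) to a GTP vertex such as 'D4'."""
    return f"{GTP_COLUMNS[col]}{row + 1}"


def from_vertex(vertex):
    """Convert a GTP vertex such as 'D4' to zero-based (col, row)."""
    return GTP_COLUMNS.index(vertex[0].upper()), int(vertex[1:]) - 1


def format_command(name, *args, command_id=None):
    """Build one command line, with an optional numeric id in front."""
    parts = [] if command_id is None else [str(command_id)]
    parts.append(name)
    parts.extend(str(arg) for arg in args)
    return " ".join(parts) + "\n"


def parse_response(text):
    """Split a response into (ok, command_id, payload).

    '=' marks success and '?' marks failure; the id is optional.
    """
    match = re.match(r"([=?])(\d*)\s*(.*)", text.strip(), re.S)
    if match is None:
        raise ValueError("not a GTP response")
    status, raw_id, payload = match.groups()
    command_id = int(raw_id) if raw_id else None
    return status == "=", command_id, payload


if __name__ == "__main__":
    print(format_command("boardsize", 19, command_id=1), end="")
    print(format_command("play", "b", to_vertex(3, 3), command_id=2), end="")
    print(parse_response("=2\n\n"))
    print(parse_response("? unknown command\n\n"))
```

Commands such as `boardsize`, `play`, and `genmove`, along with the vertex notation, assume a Go board, so a different game would need a different front end as well as different engine internals.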
Agogo is an implementation of AlphaZero written in Go (the programming language) by the team that created Gorgonia, a library for deep learning in Go. Documentation is pretty much non-existent (except for a video presentation), so any modifications will require some fluency in the Go language.
Galvanise Zero (aka gzero) is described as a General Game Player that drew inspiration and learning techniques from AlphaGoZero. The codebase is C++, Python 2.7, and TensorFlow. Since the game it trains for is described in the Game Description Language (GDL), any game that can be captured in GDL (that would be almost all combinatorial abstract games and many more) can be played by Galvanise Zero. Indeed, the engine has already been trained on a number of abstract games. You can play against the program on the Little Golem game site, where it has been posted as the gzero bot player.
The code is minimally documented, however, so if you want to train it to play your own games, your work is cut out for you. Nonetheless, the developer welcomes participation, so maybe you can convince him to give you a hand.
AlphaZero.jl is implemented in the Julia programming language. The software is the result of a government contract with HRL Laboratories for assured autonomy AI software. As such, it is well documented. The open source version is demonstrated using Connect-4 and claims to be faster than most Python implementations by virtue of the Julia language and some efficiency improvements made by HRL Laboratories. All the game code lives in a separate module, so a Julia-fluent programmer can swap in another game in place of the stock Connect-4.
Facebook ELF OpenGo
The Facebook AI Research arm has also joined the fray. They have released a reimplementation of AlphaGoZero called ELF OpenGo. It is based on their ELF (Extensive, Lightweight and Flexible) reinforcement learning platform. The code is a combination of C++, Python, and PyTorch. Documentation is minimal. Facebook AI is upfront about not guaranteeing that this software will work for anything other than the exact environment used to demonstrate it. Clearly, anyone who wishes to tinker is on their own.
My interest lies in extending the capability of the AlphaZero algorithm to other games, which is why the descriptions above note how easy (or not) that would be for each project.
As a first step, I plan to experiment with the Applied Data Science repository, and maybe others afterwards. I’ll report back with any findings. I’d love to hear from people who have used any of these, in particular if you were able to train the engine on a different game.