Gobble

When I discovered David Fotland 's The Many Faces of Go in the end of '92, I was utterly fascinated by the problem to program go. For example, why does an excellent program like Many Faces have problems doing local tree searches? So I wrote my own local problem solver, and now I know why: too many positions to consider.

Prompted by my work in physics, I have completed a program, Gobble, playing 9x9 go with minimal go knowledge using a Monte Carlo method called simulated annealing (PostScript format report and figures; or gzipped tar achive of both). Annealing is nature's trick to find extrema in very complicated situations. Simulated annealing has proven to be a simple but powerful stochastic method for large scale combinatorial optimization.

In brief, to find the minimum of some function, don't go exclusively downhill, which for a complicated function is likely to leave you at a useless local minimum. Rather, go uphill, too, from time to time, determined by a random number generator and a control parameter, the "temperature", therefore the label stochastic.

For practical purposes, simulated annealing has solved the famous traveling salesman problem: find the shortest of N! paths connecting N cities. Simulated annealing finds a very good approximation to the shortest path out of the huge number of all possible paths. It is now used to arrange connections on chips and switching devices in telephone networks.

So there is this method that is trumpeted as very simple but extremely good at large scale combinatorial problems. Want a really large optimization problem? How about applying simulated annealing to finding the most advantageous tree in the space of entire game trees of go?!

There are three elements in go (and chess ...) programming:

look-ahead
evaluation
rules to combine (1) and (2) (for selective tree search etc.)

Here is what Gobble does, which has to be contrasted with standard approaches:

play games to the very end (vs. a few halfmoves only)
evaluation is very simple: score the final position (vs. sophisticated expert systems requiring a lot of go skills which I don't have anyway as a 15k)
play games with random moves with probabilities determined by the simulated annealing algorithm depending on the final score of previous games (vs. local extension of search based on evaluation)

So, put simply, all that Gobble does is play a few thousand random games to the very end and determine the value of a first move from the average value it has in these random games.

There is one crucial point in this construction which requires real theoretical work and may determine the overall usefulness of this approach. Notice that trees are not paths, that to optimize is not to `minimax'. The salesman is just one player with one goal, to minimize his path length, while in go there are two players working against each other. Black wants to maximize his score while White wants to minimize Black's score. So at the core of my Monte Carlo go algorithm is a map from trees to paths, which is only approximate but with accuracy increasing with computing time. (This map seems to work for go, but I do not expect it to be useful for chess. I don't know of any similar work for other games.)

Let me emphasize that up to Gobble 1.0 (May 1993) no go heuristics has been included except that Gobble does not fill in empty fields which may become eyes, which is in stark contrast to the standard approach which relies heavily on go knowledge. Wouldn't it be nice if all you had to do is tell the computer the rules and it plays a perfect game?

I haven't seen any idea even remotely as crazy for go programming, and I haven't seen many go programs playing worse, either. Against an old public domain version of Many Faces (16k) at level 10 Gobble 1.0 needs at least a 2 stone handicap to play even, so I estimate Gobble 1.0 plays at around 25k on a 9x9 board.

Motivated by the newly setup computer go ladder, I have added two go heuristics to help Gobble with its two most glaring weaknesses. Gobble 1.5 (December 1994) has a simple influence function to help it claim more territory in the beginning of the game, and a library of 4x4 patterns derived from a few thousand >=1d games from IGS, so that Gobble's moves at least locally `look' better. Both these heuristics are still quite primitively implemented, but they help Gobble to choose between several moves that the simulated annealing evaluation declares to be good.

Whatever one can do deterministically will be much more efficient than stochastic methods, and I am certainly not a simulated-annealing / no-go-knowledge fanatic. Results count. Just remember, when you are seeing Gobble play on IGS in the future, it might have been one of your games that made Gobble play a particular (stupid or clever) pattern.