<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title type="text">Gábor Melis' blog</title>
  <subtitle type="text">[No_subtitle]</subtitle>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <id>http://quotenil.com/</id>
  <link rel="alternate" type="text/html" hreflang="en" href="http://quotenil.com/" />
  <link rel="self" type="application/atom+xml" href="http://quotenil.com/atom.xml" />
  <rights>Copyright (c) 2011 Gábor Melis</rights>
  <generator uri="http://www.cognition.ens.fr/~guerry/u/blorg.el" version="0.75e">
    Done with blorg 0.75e -- org-mode 6.33x and GNU Emacs 23.2.1
  </generator>

<entry>
  <title>Hung Connections</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Hung-Connections.html"/>
  <id>http://quotenil.com/Hung-Connections.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2011-02-27T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
My ISP replaced a Thomson modem with a Cisco EPC3925 modem-router to
fix the speed issue I was having. The good news is that the connection
now operates near its advertised bandwidth; the bad news is that TCP
connections started to hang. It didn't take long to <a href="http://hup.hu/node/98496">find out</a> that this
particular router drops "unused" TCP connections after five minutes.
</p>


<p>
The fix recommended in the linked topic (namely sysctl'ing
<code>net.ipv4.tcp_keepalive_time</code> &amp; co) was mostly effective, but I had to
lower the keepalive to one minute to keep my ssh sessions alive. The
trouble was that OfflineIMAP connections to the U.S. west coast still
hung intermittently, while it worked with Gmail just fine.
</p>


<p>
In the end, OfflineIMAP had to be <a href="http://permalink.gmane.org/gmane.mail.imap.offlineimap.general/2815">patched</a> to use the keepalive, and the
keepalive had to be lowered to 15 seconds:
</p>


<p>
<pre>
 sysctl -w net.ipv4.tcp_keepalive_time=15 \
           net.ipv4.tcp_keepalive_intvl=15 \
           net.ipv4.tcp_keepalive_probes=20
</pre>
</p>
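
<p>
The sysctl values only affect sockets that have keepalive switched on,
which is presumably why OfflineIMAP needed patching. Below is a minimal
Python sketch of enabling it for a single socket. The helper name
<code>enable_keepalive</code> is made up for illustration and this is not the
actual OfflineIMAP patch; the per-socket options are Linux names and are
only set when the Python build exposes them.
</p>

<p>
<pre>
```python
import socket


def enable_keepalive(sock, idle=15, interval=15, probes=20):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    The per-socket tuning options are platform specific, so each one is
    set only if this Python build defines it (they exist on Linux).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```
</pre>
</p>

<p>
On Linux the per-socket values take precedence over the sysctl defaults,
so a program that sets them does not depend on system-wide tuning.
</p>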



<p>
Oh, and always include <code>socktimeout</code> in the offlineimap config. That's
more important than keepalive unless you never have network issues.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>OfflineIMAP with Encrypted Authinfo</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/OfflineIMAP-with-Encrypted-Authinfo.html"/>
  <id>http://quotenil.com/OfflineIMAP-with-Encrypted-Authinfo.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2011-02-26T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
I've moved to an <a href="http://offlineimap.org/">OfflineIMAP</a> + <a href="http://gnus.org/">Gnus</a> setup that's outlined at <a href="http://sachachua.com/blog/2008/05/geek-how-to-use-offlineimap-and-the-dovecot-mail-server-to-read-your-gmail-in-emacs-efficiently/">various</a>
<a href="http://nakkaya.com/2010/04/10/using-offlineimap-with-gnus/">places</a>. Gnus can be configured to use <a href="http://www.emacswiki.org/emacs-en/GnusAuthinfo">~/.authinfo</a> as a netrc style of
file to read passwords from and can easily use <a href="http://www.emacswiki.org/emacs-en/GnusEncryptedAuthInfo">encrypted authinfo</a>
files as well. OfflineIMAP, on the other hand, offers no such support,
and passwords to the local and remote IMAP accounts are normally
stored in clear text in <code>.offlineimaprc</code>.
</p>


<p>
For the local account this can be overcome by not running a dovecot
server but making offlineimap spawn a dovecot process when needed:
</p>


<p>
<pre>
 [Repository LocalGmail]
 type = IMAP
 preauthtunnel = /usr/sbin/dovecot -c ~/.dovecot.conf --exec-mail imap
</pre>
</p>



<p>
For the remote connection, ideally it should read the password from
<code>.authinfo.gpg</code> that Gnus may also read if it's configured to access
the remote server directly. This can be pulled off rather easily. Add
an <i>include</i> to <code>.offlineimaprc</code> like this:
</p>


<p>
<pre>
 [general]
 pythonfile = ~/.offlineimap.py
</pre>
</p>



<p>
where <code>~/.offlineimap.py</code> just defines a single function called
<code>get_authinfo_password</code>:
</p>


<p>
<pre>
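 #!/usr/bin/python
 import os
 import re
 
 def get_authinfo_password(machine, login, port):
     # Escape the arguments so that dots in host names match literally.
     s = "machine %s login %s password ([^ ]*) port %s" % (
         re.escape(machine), re.escape(login), port)
     p = re.compile(s)
     # Decrypt with gpg; the passphrase comes from the agent.
     authinfo = os.popen("gpg -q --no-tty -d ~/.authinfo.gpg").read()
     m = p.search(authinfo)
     if m is None:
         raise KeyError("no authinfo entry for %s@%s" % (login, machine))
     return m.group(1)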
</pre>
</p>



<p>
Now, all that's left is to change remotepass to something like this:
</p>


<p>
<pre>
 remotepasseval = get_authinfo_password("imap.gmail.com", "username@gmail.com", 993)
</pre>
</p>



<p>
Of course, <code>.authinfo.gpg</code> should also have the corresponding entry:
</p>


<p>
<pre>
 machine imap.gmail.com login username@gmail.com password &lt;password&gt; port 993
</pre>
</p>
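
<p>
The regular expression expects the fields in exactly this order. If the
authinfo file is shared with other tools, the order may differ, so here is
a hedged sketch of an order-independent lookup. The function names
<code>parse_authinfo</code> and <code>find_password</code> are made up for
illustration, the decrypted text is passed in as a string, and passwords
containing spaces or quotes are not handled.
</p>

<p>
<pre>
```python
def parse_authinfo(text):
    """Parse netrc-style lines into a list of dicts, one per line."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        # Pair up "key value key value ..." tokens; skip malformed lines.
        if not tokens or len(tokens) % 2 != 0:
            continue
        entries.append(dict(zip(tokens[0::2], tokens[1::2])))
    return entries


def find_password(text, machine, login, port):
    """Return the password of the first matching entry, or None."""
    for entry in parse_authinfo(text):
        if (entry.get("machine") == machine
                and entry.get("login") == login
                and entry.get("port") == str(port)):
            return entry.get("password")
    return None
```
</pre>
</p>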



<p>
That's it, no more cleartext passwords.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>Alpha-beta</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Alpha-beta.html"/>
  <id>http://quotenil.com/Alpha-beta.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-12-27T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
It hasn't been a year yet since I first promised that alpha-beta
snippet and it is already added to micmac in all its <a href="http://quotenil.com/git/?p=micmac.git;a=blob;f=src/game-theory.lisp;h=9e38d1f09a1f443bc14e115f9abc68dd5f64f6f3;hb=e2de0e888ce103b026d725297f52f5710273a5c3#l71">35 line glory</a>.
The good thing about not rushing it out the door is that it saw a
bit more use. For a tutorialish tic-tac-toe example see
<a href="http://quotenil.com/git/?p=micmac.git;a=blob;f=test/test-game-theory.lisp;h=f6c77e7a3104993c5bc1e01b75a4c94a1d6489e9;hb=e2de0e888ce103b026d725297f52f5710273a5c3#l18">test/test-game-theory.lisp</a>.
</p>


<p>
The logging code in the example produces <a href="images/alpha-beta-log.png">output</a> suitable for cutting and
pasting into an org-mode buffer and exploring it by TABbing into
subtrees to answer the perpetual 'What the hell was it thinking?!'
question.
</p>
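
<p>
For readers who have not met the technique, here is a minimal negamax
style alpha-beta sketch in Python. It is not the micmac code: the
function signature and the toy game (take one to three stones, whoever
takes the last one wins) are assumptions chosen to keep the example short.
</p>

<p>
<pre>
```python
def alpha_beta(state, depth, alpha, beta, moves, play, evaluate):
    """Negamax search with alpha-beta pruning.

    Returns (score, best_move) from the point of view of the player
    to move; evaluate must score states from that same point of view.
    """
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state), None
    best_move = None
    for move in legal:
        # The opponent's best score, negated, is our score for the move.
        score = -alpha_beta(play(state, move), depth - 1,
                            -beta, -alpha, moves, play, evaluate)[0]
        if score > alpha:
            alpha, best_move = score, move
        if alpha >= beta:
            break  # the opponent will never allow this line
    return alpha, best_move


# Toy game (an assumption for illustration): players alternately take
# 1 to 3 stones and whoever takes the last stone wins.
def nim_moves(stones):
    return [n for n in (1, 2, 3) if stones >= n]


def nim_play(stones, n):
    return stones - n


def nim_evaluate(stones):
    # With no stones left, the player to move has already lost.
    return -1 if stones == 0 else 0
```
</pre>
</p>

<p>
Calling <code>alpha_beta(5, 10, float("-inf"), float("inf"), nim_moves, nim_play, nim_evaluate)</code>
searches the game from five stones; the returned move is only meaningful
when the score improved on the initial <code>alpha</code>.
</p>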
    </div>
  </content>
</entry>


<entry>
  <title>Nash equilibrium finder</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Nash-equilibrium-finder.html"/>
  <id>http://quotenil.com/Nash-equilibrium-finder.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-12-26T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
While I seem to be unable to make my mind up on a good interface to
alpha-beta with a few bells and whistles, I added a Nash equilibrium
finder to <a href="http://cliki.net/micmac">Micmac</a> that's becoming less statistics oriented. This was
one of the many things in Planet Wars that never really made it.
</p>


<p>
Let's consider the <a href="http://en.wikipedia.org/wiki/Matching_pennies">Matching pennies</a> game. The row player wins iff the
two pennies show the same side. The payoff matrix is:
</p>


<p>
<pre>
 |       | Heads | Tails |
 +-------+-------+-------+
 | Heads |     1 |    -1 |
 | Tails |    -1 |     1 |
</pre>
</p>



<p>
Find the mixed strategy equilibrium:
</p>


<p>
<pre>
 (find-nash-equilibrium '((-1 1) (1 -1)))
 =>
 #(49 51)
 #(50 50)
 -0.01
</pre>
</p>



<p>
That is, both players should choose heads 50% of the time, and the
expected payoff (for the row player) is zero, of which -0.01 is an
approximation. More iterations bring it closer:
</p>


<p>
<pre>
 (find-nash-equilibrium '((-1 1) (1 -1)) :n-iterations 1000)
 =>
 #(499 501)
 #(500 500)
 -0.001
</pre>
</p>
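
<p>
The count vectors in the output resemble what fictitious play produces,
where each side repeatedly best-responds to the other's play counts so
far. Whether Micmac works this way is an assumption on the part of this
sketch; the Python function below, with the made-up name
<code>fictitious_play</code>, only illustrates that technique for a zero-sum
payoff matrix.
</p>

<p>
<pre>
```python
def fictitious_play(payoffs, n_iterations=100):
    """Approximate a zero-sum game's equilibrium by fictitious play.

    payoffs[i][j] is what the row player wins (and the column player
    loses) when row plays i and column plays j.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    for _ in range(n_iterations):
        # Each side best-responds to the other's play counts so far.
        row = max(range(n_rows), key=lambda i: sum(
            payoffs[i][j] * col_counts[j] for j in range(n_cols)))
        col = min(range(n_cols), key=lambda j: sum(
            payoffs[i][j] * row_counts[i] for i in range(n_rows)))
        row_counts[row] += 1
        col_counts[col] += 1
    # Expected payoff when both sides play their empirical mixtures.
    value = sum(payoffs[i][j] * row_counts[i] * col_counts[j]
                for i in range(n_rows) for j in range(n_cols))
    return row_counts, col_counts, value / float(n_iterations ** 2)
```
</pre>
</p>

<p>
For matching pennies the counts drift towards an even split and the value
towards zero as the number of iterations grows, though convergence is slow.
</p>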
    </div>
  </content>
</entry>


<entry>
  <title>Planet Wars Post-Mortem</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Planet-Wars-Post-Mortem.html"/>
  <id>http://quotenil.com/Planet-Wars-Post-Mortem.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-12-01T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
I can't believe I <a href="http://ai-contest.com/rankings.php">won</a>.
</p>


<p>
I can't believe I won <i>decisively</i> at all.
</p>


<p>
The lead in the last month or so was an indicator of having good
chances, but there was a huge shuffling of ranks in the last week and
some last minute casualties.
</p>

<h3>Code</h3>


<p>
Note that the git repository is available at
<a href="http://quotenil.com/git/planet-wars.git">http://quotenil.com/git/planet-wars.git</a> (<a href="http://quotenil.com/git/?p=planet-wars.git">gitweb</a>).
</p>

<h3>Denial</h3>


<p>
I had promised myself not to enter this one and resisted for about two
weeks, until my defenses were worn away and I was drawn into the fray.
</p>


<p>
The game didn't look very exciting at first. I thought that the bots
would soon reach a point of near perfect tactics and the
rock-paper-scissors scenarios would dominate (more on this later).
</p>


<p>
That's enough of <a href="http://www.a1k0n.net/blah/archives/2010/03/index.html#e2010-03-04T14_00_21.txt">tribute</a>, let's steer off the trodden path.
</p>

<h3>Beginning</h3>


<p>
Driven by the first <a href="http://en.wikipedia.org/wiki/Larry_Wall#Virtues_of_a_programmer">virtue</a> of programmers I was going to approach the
game in a non-labor-intensive fashion leaving most of the hard work to
the machine. The second virtue was kept in check for a week while I
was working out how to do that exactly. In the meantime, the third
spurred me to take <a href="http://aerique.blogspot.com/">aerique's</a> starter pack and to make myself
comfortable with minor modifications to it.
</p>


<p>
As with tron, UCT was on my mind. However, I was keenly aware of
failing to make it work acceptably last time. No matter how cool UCT
was, it was hard to miss one important similarity to tron: the fitness
function is very jagged, one ship more or less can make all the
difference. Clearly, a naive random policy was not going to cut it.
</p>


<p>
Another problem was the practically unlimited branching factor.
Without a similarity function over moves it was hopeless to explore a
meaningful portion of the game tree.
</p>

<h3>Move generation</h3>


<p>
At this point I had to start getting my hands dirty. The first thing
was to implement simulating the future (see the <code>FUTURE</code> class). That was
trivial, except that I screwed up battle resolution, and for the longest
time that held results back. Think of a future as a vector of owner
and ship count over turns.
</p>
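
<p>
To make the idea of a future concrete, here is a small Python sketch for a
single planet. It is not the <code>FUTURE</code> class: the names are made up,
and it assumes that owned planets grow every turn, neutrals do not, the
largest force wins a battle keeping its margin over the second largest,
and a tie for first leaves the planet with its owner and zero ships.
</p>

<p>
<pre>
```python
NEUTRAL = 0


def resolve_battle(owner, ships, arrivals):
    """Return (owner, ships) after fleets in arrivals (player to ship
    count) meet the garrison."""
    forces = dict(arrivals)
    forces[owner] = forces.get(owner, 0) + ships
    ranked = sorted(forces.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0]
    (winner, top), (_, second) = ranked[0], ranked[1]
    if top == second:
        return owner, 0  # a tie for first: the owner keeps an empty planet
    return winner, top - second


def simulate_future(owner, ships, growth, arrivals_by_turn, horizon):
    """Return [(owner, ships)] for turns 0..horizon; turn 0 is now."""
    future = [(owner, ships)]
    for turn in range(1, horizon + 1):
        if owner != NEUTRAL:
            ships += growth
        owner, ships = resolve_battle(
            owner, ships, arrivals_by_turn.get(turn, {}))
        future.append((owner, ships))
    return future
```
</pre>
</p>

<p>
For example, <code>simulate_future(1, 5, 2, {2: {2: 10}}, 3)</code> describes a
planet of player 1 that is hit by ten ships of player 2 on turn 2.
</p>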


<p>
By watching some games it became apparent that multi-planet,
synchronized attacks are the way to go. The implementation operates on
step targets, steps and moves.
</p>


<p>
A <i>step</i> is a set of orders from the same player targeting the same
planet. The constituent orders need not be for the same turn, neither
do they need to arrive on the same turn.
</p>


<p>
A <i>move</i> is a set of orders from the same player without any
restriction. That includes future orders too.
</p>


<p>
Move generation first computes so called step targets. A <i>step target</i>
is a ship count vector over turns representing the desired arrivals.
The desired arrivals are simply minimal reinforcements for defense and
invasion forces for attack.
</p>


<p>
For each step target a number of steps can be found that produce the
desired arrivals. In the current implementation there is a single step
generated for a step target.
</p>


<p>
For a while my bot could only make moves that consisted of a single
step, but it quickly became the limiting factor and strength testing
of modifications was impossible.
</p>


<p>
Combining steps into moves turned out to be easy. Not all combinations
are valid, and the number of combinations can be huge. To limit the
number of moves generated we first evaluate steps one by one, sort
them in descending order of evaluation score and try to combine them
starting from the first.
</p>
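
<p>
A rough Python sketch of such a greedy combination follows. It is a
simplification under stated assumptions, not the bot's code: orders are
reduced to (source planet, ships) pairs without turns, a combination is
taken to be valid when no planet sends more ships than it can spare, and
only a single move is built. The name <code>combine_steps</code> is made up.
</p>

<p>
<pre>
```python
from collections import Counter


def combine_steps(steps, available, evaluate):
    """Greedily merge steps into one move, best-scoring step first.

    steps     -- list of steps; a step is a list of (source, ships) orders
    available -- dict mapping a source planet to the ships it can spare
    evaluate  -- function scoring a single step (higher is better)
    """
    committed = Counter()
    move = []
    for step in sorted(steps, key=evaluate, reverse=True):
        demand = Counter()
        for source, ships in step:
            demand[source] += ships
        # A step is compatible if no planet ends up over-committed.
        if all(available.get(source, 0) >= committed[source] + ships
               for source, ships in demand.items()):
            committed.update(demand)
            move.extend(step)
    return move
```
</pre>
</p>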

<h3>Full attack</h3>


<p>
Normally futures are calculated taking into account fleets already in
flight in the observable game state that the engine sends. Back when I
was still walking up and down instead of typing away furiously it
occurred to me that if for all planets of player 1 player 2 cannot
take that planet if both players sent all ships to it then player 2
cannot take any planet of player 1 even if he's allowed to attack
multiple planets in any pattern. Clearly, this breaks down at the
edges (simultaneous moves), but it was a useful idea that gave birth
to the <code>FULL-ATTACK-FUTURE</code> class. The intention was to base position
evaluation on the sum of scores of individual full attack futures (one
per planet).
</p>


<p>
Now the problem with full attack future is that sending all available
ships away from a planet can invalidate some orders scheduled from
that planet for the future. Enter the concept (and class) of
<code>SURPLUS</code>.
</p>


<p>
The surplus of player P at planet A at time t is the number of ships
that can be sent away on that turn from the defending army without:
</p>

<ul>
<li>making any scheduled order from planet A invalid

</li>
<li>causing the planet to be lost anytime after that (observing only the fleets already in space)

</li>
<li>bringing an imminent loss closer in time</li>
</ul>
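
<p>
The last two conditions can be illustrated with a brute-force Python
sketch: remove ships on the given turn and check that the planet's
ownership history does not change. This is an assumption-laden
illustration, not the <code>SURPLUS</code> class: scheduled orders (the first
condition) are ignored, the battle rules are the ones assumed by
<code>resolve_battle</code> below, and all names are made up.
</p>

<p>
<pre>
```python
NEUTRAL = 0


def resolve_battle(owner, ships, arrivals):
    """Largest force wins with its margin; a tie keeps the owner at 0."""
    forces = dict(arrivals)
    forces[owner] = forces.get(owner, 0) + ships
    ranked = sorted(forces.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0]
    (winner, top), (_, second) = ranked[0], ranked[1]
    if top == second:
        return owner, 0
    return winner, top - second


def owners_over_time(owner, ships, growth, arrivals_by_turn, horizon,
                     depart_turn=None, depart_ships=0):
    """Owner of the planet on each turn, or None if the departure is
    impossible because the garrison is too small."""
    owners = []
    for turn in range(horizon + 1):
        if turn != 0:
            if owner != NEUTRAL:
                ships += growth
            owner, ships = resolve_battle(
                owner, ships, arrivals_by_turn.get(turn, {}))
        if turn == depart_turn:
            if depart_ships > ships:
                return None
            ships -= depart_ships
        owners.append(owner)
    return owners


def surplus(player, owner, ships, growth, arrivals_by_turn, horizon, turn):
    """Ships the player can send away on the given turn without changing
    who owns the planet on any turn up to the horizon."""
    baseline = owners_over_time(owner, ships, growth, arrivals_by_turn,
                                horizon)
    if baseline[turn] != player:
        return 0
    count = 0
    # Scan upwards and stop at the first count that changes the
    # ownership history or exceeds the garrison.
    while owners_over_time(owner, ships, growth, arrivals_by_turn, horizon,
                           turn, count + 1) == baseline:
        count += 1
    return count
```
</pre>
</p>

<p>
Comparing whole ownership histories covers both losing the planet and
losing it earlier than it would otherwise be lost.
</p>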


<p>
As soon as the full attack based position evaluation function was
operational, results started to come. But there was a crucial
off-by-one bug.
</p>

<h3>Constraining futures</h3>


<p>
That bug was in the scoring of futures. For player 1 it used the
possible arrivals (number of ships) one turn before those of player 2.
I made several attempts at fixing it but each time playing strength
dropped like a stone.
</p>


<p>
Finally, a principled solution emerged: when computing the full attack
future from the surpluses constrain the turn of departures. That is,
to roughly duplicate the effect of the off-by-one bug, one could say
that surpluses of player 1 may not leave the planet before turn 1
(turn 0 is current game state) (see <code>MIN-TURN-TO-DEPART-1</code> in the
code). This provided a knob to play with. Using 1 for
<code>MIN-TURN-TO-DEPART-1</code> made the bot actually prefer moves to just
sitting idly, using 2 made it prefer moves that needed no
reinforcement on the next turn.
</p>


<p>
I believe this is the most important one-character change I made, so
it gets its own paragraph. Using 2 as <code>MIN-TURN-TO-DEPART-1</code> makes
the bot tend towards situations in which the rock-paper-scissors
nature of the game is suppressed. The same bot with 1 beats the one
with 2, but, as was often the case, on TCP the results were just the
opposite, by a big margin. TCP is dhartmei's unofficial server,
where most of the useful testing took place.
</p>


<p>
Constraints were added for arrivals too (see <code>MIN-TURN-TO-ARRIVE</code>)
that eased scoring planets that started out neutral but were
non-neutral at the horizon by making the evaluation function sniping
aware.
</p>


<p>
Sniping is when one player takes a neutral, losing ships in the process,
and the opponent comes - typically on the next turn - and takes it
away. <a href="http://www.ai-contest.com/visualizer.php?game_id=9347535">This game</a> is a nice illustration of the concept.
</p>

<h3>Redistribution</h3>


<p>
As pointed out by <a href="http://iouri-khramtsov.blogspot.com/2010/11/google-ai-challenge-planet-wars-entry.html">iouri</a> in his post-mortem, redistribution of ships is
a major factor. The machinery described so far lends itself to easy
implementation of redistribution.
</p>


<p>
When scoring a full attack future the scoring function gives a very
slight positional penalty every simulated turn for every enemy ship.
This has the effect of preferring positions where the friendly ships
are near the enemy, and positions of influence with multiple enemy
planets being threatened.
</p>


<p>
The move generator was modified to generate steps to each friendly
planet from each friendly planet on turn 0, sending all the surplus at
that turn. This scheme is rather restrictive, but the more flexible
solutions had mixed results.
</p>


<p>
There is a knob, of course, to control how aggressively ships are
redistributed. It's called <code>POSITIONAL-MIN-TURN-TO-DEPART-1</code>. As its
name implies it's like <code>MIN-TURN-TO-DEPART-1</code> but used only when
computing the positional penalty.
</p>

<h3>Dynamic horizon</h3>


<p>
How far ahead the bot looks has a very strong effect on its play: too
far and it will be blind to tactics, too close and it will miss
capturing higher cost neutrals.
</p>


<p>
The horizon was a constant 30 turns for quite some time. I wanted to raise it but
couldn't without seriously hurting close range fighting ability. After
much experimentation with a slightly complicated mechanism, the horizon
was set so that the three earliest breakeven turns of safe-to-take
neutrals are included. A neutral is deemed safe to take if, from the
initial investment until the breakeven point, no friendly planet can
possibly be lost in a full attack future.
</p>
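
<p>
As a hedged illustration, the rule can be sketched in Python as follows.
The breakeven formula (travel time plus the turns of growth needed to
repay the ships lost to the neutral garrison) and the
<code>minimum</code> parameter are assumptions, the safety check is left to
the caller, and the names are made up.
</p>

<p>
<pre>
```python
import math


def breakeven_turn(distance, garrison, growth):
    """Turn on which a captured neutral has repaid the ships lost taking
    it, or None if it never does (zero growth)."""
    if growth == 0:
        return None
    return distance + int(math.ceil(garrison / float(growth)))


def dynamic_horizon(safe_neutrals, n_included=3, minimum=0):
    """Horizon covering the n_included earliest breakeven turns.

    safe_neutrals -- (distance, garrison, growth) triples for neutrals
                     already judged safe to take
    """
    turns = sorted(turn for turn in
                   (breakeven_turn(*neutral) for neutral in safe_neutrals)
                   if turn is not None)
    if not turns:
        return minimum
    included = turns[:n_included]
    return max(minimum, included[-1])
```
</pre>
</p>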

<h3>Nash equilibrium</h3>


<p>
There are - especially at the very beginning of games - situations
where there is no best move: it all depends on what the opponent plays
on the same turn.
</p>


<p>
If one has a number of candidate moves for each player and the score
for every pair of them, then the optimal mixed strategy, which is
just a probability assigned to each move, can be computed.
</p>


<p>
I tried and tried to make it work but it kept making mistakes that
looked easy to exploit and although it did beat 1 ply minimax about 2
to 1 it was too slow to experiment with.
</p>

<h3>Alpha-beta</h3>


<p>
Yes, for the longest time it was a 1 ply search. Opponent moves were
never considered and position evaluation was good enough to pick up
the slack.
</p>


<p>
However, there was a problem. The evaluation function did not score
planets that were neutral at the end of the normal future, because
doing so made the bot just sit there doing nothing: it got high
scores for all planets that could be conquered, but when it tried to
make a move it realized that it could conquer only one. Such is the
nature of the full attack based evaluation function: it was designed with
complete disregard for neutrals.
</p>


<p>
The late change to the map generator increased the number of planets
at an equal distance from the players and emphasized the
rock-paper-scissors nature further. Some bots didn't like it, some
took this turn of events better. Before this point my bot had a very
comfortable lead on the official leaderboard which was greatly
reduced.
</p>


<p>
With the failure of the Nash experiment I resurrected previously
unsuccessful alpha-beta code in the hope that considering opponent
moves would show the bot the error of its ways and force it not to leave
valuable central planets uncovered.
</p>


<p>
It's tricky to make alpha-beta work with moves that consist of orders
at arbitrary times in the future. I had all kinds of funky, correct
and less correct ways to execute orders at different depths of the
search. In the end what prevailed was the most simple-minded,
incorrect variant that simply scheduled all orders that made up the
move (yes, even the future ones) and fixed things up when computing
the future so that ship counts stayed non-negative and sending enemy
ships did not occur.
</p>


<p>
In local tests against older versions of my bot, a two ply alpha-beta
bot showed very promising results, but when it was tested on TCP it
fell way short of expectations and performed worse than the one
ply bot. It seemed particularly vulnerable to a number of bots. In
retrospect, I think this was because their move generator was
sufficiently different that my bot was just blind to a good range of
real possibilities.
</p>


<p>
In the end, I settled for using four ply alpha-beta for the opening
phase (until the third planet was captured). This allowed the bot to
outwait opponents when needed and win most openings. After the final
submission I realized that maybe I was trying to push things the wrong
way and even three planets is too many. With six hours left until the
deadline in a test against binaries of a few fellow competitors the
two planet limit seemed to perform markedly better, but it was too
late to properly test it against a bigger population.
</p>

<h3>The End</h3>


<p>
Like many fellow contestants I am very happy that the contest is over
and I got my life back. I'm sure that many families breathed a
collective sigh of relief. But if I were to continue I'd try
rethinking the move generator, because that may just be the thing that
holds alpha-beta back and maybe nash too.
</p>


<p>
Disappointingly, there was no learning, adapting to opponent
behaviour, etc. All that made it to the todo list, but had to take
a back seat to more pressing concerns.
</p>


<p>
Ah, yes. One more thing. Bocsimackó (pronounced roughly as
bo-chee-mats-ko), after whom the bot was named, is the handsome hero
of a children's book, pictured on the left:
</p>

<img alt="malacka-es-bocsimacko.jpg" src="images/malacka-es-bocsimacko.jpg"/>
    </div>
  </content>
</entry>


<entry>
  <title>Important Update to the Planet Wars Starter Package</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Important-Update-to-the-Planet-Wars-Starter-Package.html"/>
  <id>http://quotenil.com/Important-Update-to-the-Planet-Wars-Starter-Package.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-10-25T00:00:00+02:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
First, is it possible to get something as simple as <code>RESOLVE-BATTLE</code>
wrong? Apparently, yes. That's what one gets for trying to port Python
code that's pretty foreign in the sense of being far from the way I'd
write it.
</p>


<p>
More importantly, I found out the hard way that sbcl 1.0.11 that's
<a href="http://code.google.com/p/ai-contest/issues/detail?id=183">still</a> on the official servers has a number of bugs in its timer
implementation making <code>WITH-TIMEOUT</code> unreliable. Also, it can trigger
timeouts recursively eventually exceeding the maximum interrupt
nesting depth. Well, "found out" is not the right way to put it as we
did fix most of these bugs ages ago.
</p>


<p>
In the new starter package (v0.8 in <a href="http://quotenil.com/git/?p=planet-wars.git;a=summary">git</a>, <a href="http://quotenil.com/binary/planet-wars/planet-wars-latest.tar.gz">latest tarball</a>), you'll find
timer.lisp that's simply backported almost verbatim from sbcl 1.0.41
to sbcl 1.0.11. Seems to work for me, but I also had to lower the
timeout to 0.8 from 0.98 because the main server is extremely slow.
</p>


<p>
The rate at which games are played on the servers is so low that it
takes several days to ascend through the leaderboard. Nevertheless, an
old buggy version is sitting at the <a href="http://ai-contest.com/rankings.php">top</a> right now. Mind you,
introducing bugs is a great way to explore the solution space and it's
quite worrisome just how adept I am at this poor man's evolutionary
programming. Most of them have since been fixed while the ideas they
brought to light remain, making the current version much stronger.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>Planet Wars Common Lisp Starter Package Actually Works</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Planet-Wars-Common-Lisp-Starter-Package-Actually-Works.html"/>
  <id>http://quotenil.com/Planet-Wars-Common-Lisp-Starter-Package-Actually-Works.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-09-21T00:00:00+02:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
Released v0.6 (<a href="http://quotenil.com/git/?p=planet-wars.git;a=summary">git</a>, <a href="http://quotenil.com/binary/planet-wars/planet-wars-latest.tar.gz">latest tarball</a>). The way the server compiles lisp
submissions was fixed and this revealed a problem where MyBot.lisp
redirected <code>*STANDARD-OUTPUT*</code> to <code>*ERROR-OUTPUT*</code> causing the server
to think compilation failed.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>Planet Wars Common Lisp Starter Package</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Planet-Wars-Common-Lisp-Starter-Package.html"/>
  <id>http://quotenil.com/Planet-Wars-Common-Lisp-Starter-Package.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-09-19T00:00:00+02:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
The <a href="http://ai-contest.com">Google AI Challenge</a> is back with a new game that's supposed to be
much harder than Tron was this spring. The branching factor of the
game tree is enormous, which means that straight minimax is out of the
question this time around. Whether some cleverness can bring the game
within reach of conventional algorithms remains to be seen.
</p>


<p>
Anyway, I'm adding <a href="http://quotenil.com/git/?p=planet-wars.git;a=summary">yet another starter package</a> (<a href="http://quotenil.com/binary/planet-wars/planet-wars-latest.tar.gz">latest tarball</a>) to the
<a href="http://aerique.blogspot.com/2010/09/planet-wars-common-lisp-start-package.html">lot</a>. It is based heavily on aerique's. Highlights compared to his
version:
</p>

<ul>
<li>no excessive use of specials (<code>*INPUT*</code>, <code>*FLEETS*</code>, etc)
</li>
<li>player class to support different types of players
</li>
<li><code>MyBot.lisp</code> split into several files
</li>
<li>it uses asdf (more convenient development)
</li>
<li>made it easier to run tests with executables (<code>./MyBot</code>) or when
  starting a fresh sbcl (<code>./bin/run-bot.sh</code>)</li>
</ul>


<p>
Proxy bot server:
</p>

<ul>
<li>can run compiled (<code>./ProxyBot</code>) or <code>./bin/run-proxy-bot.sh</code>
</li>
<li>started explicitly (no <code>:PWBOT-LOCAL</code> reader magic)
</li>
<li>can serve any number of proxy bots
</li>
<li>closes sockets properly</li>
</ul>


<p>
There is still a <a href="http://ai-contest.com/forum/viewtopic.php?f=18&amp;t=421&amp;start=40">problem</a> causing all lisp submissions to die on the
first turn, no matter which starter package one uses, which will
hopefully be resolved. Until then there is dhartmei's excellent
<a href="http://ai-contest.com/forum/viewtopic.php?f=18&amp;t=424">unofficial tcp server</a>.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>UCT</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/UCT.html"/>
  <id>http://quotenil.com/UCT.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-03-19T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
As promised my <a href="http://senseis.xmp.net/?UCT">UCT</a> implementation is released, albeit somewhat
belatedly. It's in <a href="http://cliki.net/Micmac">Micmac</a> v0.0.1, see test/test-uct.lisp for an
example. Now I only owe you Alpha-beta.
</p>
    </div>
  </content>
</entry>


<entry>
  <title>Google AI Challenge 2010 Results</title>
  <link rel="alternate" type="text/html" href="http://quotenil.com/Google-AI-Challange-2010-Results.html"/>
  <id>http://quotenil.com/Google-AI-Challange-2010-Results.html</id>
  <updated>2011-02-27T15:13:58+01:00</updated>
  <published>2010-03-01T00:00:00+01:00</published>
  <author>
    <name>Gábor Melis</name>
    <uri>http://quotenil.com</uri>
    <email>mega@retes.hu</email>
  </author>
  <content type="xhtml" xml:lang="en" xml:base="http://quotenil.com/">
    <div xmlns="http://www.w3.org/1999/xhtml">

<p>
For what has been a fun ride the official results are now <a href="http://csclub.uwaterloo.ca/contest/rankings.php">available</a>.
In the end, 11th out of 700 is not too bad and it's the highest
ranking non-C++ entry by some margin.
</p>


<p>
I entered the contest a bit late with a rather specific approach in
mind: <a href="http://senseis.xmp.net/?UCT">UCT</a>, an algorithm from the Monte Carlo tree search family. It
has been rather successful in Go (and in Hex too, taking the crown
from <a href="http://six.retes.hu/">Six</a>). So with UCT in mind, to serve as a baseline I implemented a
quick <a href="http://en.wikipedia.org/wiki/Minimax">minimax</a> with a simple territory based evaluation function ...
that everyone else in the competition seems to have invented
independently. Trouble was looming because it was doing too well: even when
looking ahead only one move (not even considering moves of the
opponent) it played a very nice positional game. That was the first
sign that constructing a good evaluation function may not be as hard
for Tron as it is for Go.
</p>
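
<p>
A territory based evaluation of this kind is commonly written as two
breadth-first searches, one from each player, counting the squares each
reaches first. The Python sketch below shows that general idea under
assumptions of its own (four-way movement, ties count for nobody, made-up
names); it is not the contest entry's code.
</p>

<p>
<pre>
```python
from collections import deque


def distances(start, free):
    """Breadth-first distances from start to every reachable free cell."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        row, col = queue.popleft()
        for cell in ((row + 1, col), (row - 1, col),
                     (row, col + 1), (row, col - 1)):
            if cell in free and cell not in dist:
                dist[cell] = dist[(row, col)] + 1
                queue.append(cell)
    return dist


def territory_score(me, opponent, free):
    """Cells I reach first minus cells the opponent reaches first.

    free is the set of empty (row, col) cells, excluding both players.
    """
    mine = distances(me, free)
    theirs = distances(opponent, free)
    score = 0
    for cell in free:
        a, b = mine.get(cell), theirs.get(cell)
        if a is not None and (b is None or b > a):
            score += 1
        elif b is not None and (a is None or a > b):
            score -= 1
    return score
```
</pre>
</p>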


<p>
But with everyone else doing minimax, the plan was to keep silent and
Monte Carlo to victory. As with most plans, it didn't quite work out.
First, to my dismay, some contestants were attempting to do the same
and kept advertising it on #googleai, second it turned out that UCT is
not suited to the game of Tron. A totally random default policy kept
cornering itself in a big area faster than another player could hit
the wall at the end of a long dead end. That was worrisome, but
fixable. After days of experimentation I finally gave up on it
deciding that Tron is simply too tactical - or not fuzzy enough, if
you prefer - for MC to work really well.
</p>


<p>
Of course, it can be that the kind of default policies I tried were
biased (a sure thing), misguided and suboptimal. But then again, I was
not alone and watched the UCT based players struggle badly. In the
final standings the highest ranking one is jmcarthur in position 105.
One of them even implemented a number of different default policies
and switched between them randomly with little apparent success. Which
makes me think that including a virtual strategy selection move at
some points in the UCT search tree should be interesting, but I
digress.
</p>


<p>
So I went back to minimax, implemented <a href="http://en.wikipedia.org/wiki/Alpha-beta_pruning">Alpha-beta&#160;pruning</a> with
principal variation, and <a href="http://en.wikipedia.org/wiki/Iterative_deepening">iterative&#160;deepening</a>. It seemed to do
really well on the then current maps whose size was severely reduced
to 15x15 to control the load on the servers. Then I had an idea to
explore how the parities of squares in an area affect the longest path
possible which was quickly pointed out to me over lunch by a friend.
And those pesky competitors have also found and advertised it in the
contest forum. Bah.
</p>


<p>
There were only two days left at this point and I had to pull an
all-nighter to finally implement a graph partitioning idea of mine that,
unsurprisingly, someone had described pretty closely in the forum. At
that point, I finally had the tool to improve the evaluation function,
but neither much time nor energy remained and I settled for using it
only in the end game where the players are separated.
</p>


<p>
The code itself is as ugly as exploratory code can be, but in the
coming days I'll factor the UCT and the Alpha-beta code out.
</p>
    </div>
  </content>
</entry>

</feed>
