Tuesday, September 23 2014 @ 00:00 +0200
Actually, I'll only link to the post-mortem I wrote in the forum.
There is also a model description included in the git repo. A
stand-alone distribution with all library dependencies and an x86-64
linux precompiled binary is also available.
This has been the Kaggle competition that attracted the most
contestants, so it feels really good to come out on top, even though
there was an element of luck involved due to the choice of evaluation
metric and the amount of data available. The organizers did a great
job explaining the physics and why there is no more data, motivating
the choice of evaluation metric, and communicating promptly.
I hope that the HEP guys will find this useful in their search for
more evidence of tau tau decay of the Higgs boson. Note that I didn't
go for the 'HEP meets ML Award', so training time is unnecessarily high
(one day with a GTX Titan GPU). By switching to single precision
floating point and a single neural network, training time could be
reduced to about 15 minutes with an expected drop in accuracy from
3.805 to about 3.750. Even with the bagging approach, the code logs
out-of-bag estimates of the evaluation metric after training each
constituent model, so the training process can be interrupted (C-c'ed)
early.
Furthermore, the model can be run on a CPU with BLAS, about 10
times more slowly than on a Titan.
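The out-of-bag bookkeeping that makes early interruption safe can be
sketched roughly as follows. This is a minimal Python illustration of
the general bagging-with-OOB-estimates idea, not the competition code:
scikit-learn decision trees stand in for the neural networks, and plain
accuracy stands in for the competition's AMS metric.

```python
# Hypothetical sketch: bagging with a running out-of-bag (OOB) estimate.
# After each constituent model is trained, we already have an honest
# estimate of ensemble performance, so training can be stopped early.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
n = len(y)

# Accumulated OOB votes: sum of predicted probabilities and vote counts.
oob_votes = np.zeros(n)
oob_counts = np.zeros(n)

for i in range(10):
    # Bootstrap sample; rows not drawn form the OOB set for this model.
    idx = rng.integers(0, n, size=n)
    oob = np.setdiff1d(np.arange(n), idx)
    model = DecisionTreeClassifier(random_state=i).fit(X[idx], y[idx])
    oob_votes[oob] += model.predict_proba(X[oob])[:, 1]
    oob_counts[oob] += 1
    # OOB estimate over all rows that have received at least one vote.
    seen = oob_counts > 0
    pred = (oob_votes[seen] / oob_counts[seen]) > 0.5
    acc = (pred == y[seen]).mean()
    print(f"model {i + 1}: OOB accuracy ~ {acc:.3f}")
```

Because each row is scored only by models whose bootstrap sample
excluded it, the logged estimate is unbiased at every step, which is
what makes killing the run after any constituent model a safe trade-off.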