I finally got around to installing CNTK. Having run the demos, I started thinking about what I could play around with as a project. A couple of years ago I gathered some cricket data for a different project. That data is on a backup drive somewhere. I didn’t really feel like looking for that today.
A quick search on Bing unearthed this gem: http://cricsheet.org. It has 11 years’ worth of ball-by-ball game data. That seemed like a good starting point. You might think that there’s not enough data there, but at a ball-by-ball level it comes to a couple of million data points, enough to get started with. The data itself comes as YAML files, and the site is modeled on Retrosheet, which does the same thing for baseball.
Not many results yet: most of the time was spent writing the code to funnel data from the YAML format into something more useful, setting up a VSO site for the code, and watching the South Africa vs Sri Lanka Test match.
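To give a rough idea of what that funnelling step looks like, here is a minimal Python sketch, not the actual code from this project. It assumes a match file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`) and that the layout is roughly an `innings` list whose entries hold a `deliveries` list keyed by ball label. The field names (`batsman`, `bowler`, `runs`, `wicket`), the helper names `flatten_match` and `to_csv`, and the sample match are all illustrative assumptions.

```python
import csv
import io


def flatten_match(match):
    """Yield one flat dict per delivery from an already-parsed match document.

    Assumed layout (illustrative): match["innings"] is a list of single-key
    mappings, each holding a "team" and a list of single-key "deliveries".
    """
    for innings in match.get("innings", []):
        for innings_name, body in innings.items():
            for delivery in body.get("deliveries", []):
                for ball, details in delivery.items():
                    yield {
                        "innings": innings_name,
                        "team": body.get("team"),
                        "ball": str(ball),
                        "batsman": details.get("batsman"),
                        "bowler": details.get("bowler"),
                        "runs_total": details.get("runs", {}).get("total", 0),
                        # 1 if the delivery recorded a wicket, else 0
                        "wicket": int("wicket" in details),
                    }


def to_csv(rows):
    """Render the flat rows as CSV text, one line per delivery."""
    rows = list(rows)
    buffer = io.StringIO()
    if rows:
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical two-ball match, made up purely to exercise the functions.
sample_match = {
    "innings": [
        {
            "1st innings": {
                "team": "Team A",
                "deliveries": [
                    {0.1: {"batsman": "X", "bowler": "Y", "runs": {"total": 4}}},
                    {0.2: {"batsman": "X", "bowler": "Y", "runs": {"total": 0},
                           "wicket": {"kind": "bowled"}}},
                ],
            }
        }
    ]
}

sample_rows = list(flatten_match(sample_match))
sample_csv = to_csv(sample_rows)
```

One flat row per ball is a convenient shape because it can be loaded straight into whatever tabular or tensor format the model expects later.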
As the code gets cleaned up and the results hopefully start rolling in, I’ll share them, probably on GitHub or somewhere on this site.