As the number of messages on twitter.mailana.com has approached 200 million, the response speed has been dropping. In fact, as @travisspencer observed, it's slow as sin. The back end is all MySQL, and I've spent a lot of time denormalizing my data, indexing and trying other optimizations to speed it up, but even a simple SELECT * FROM sometable WHERE primarykey=something; can now take up to a minute. I'm sure a real guru could wave a dead chicken over my table structure whilst murmuring incantations and get the performance I need, but I've reached the point where it's easier to move to a simpler system with fewer moving parts.
That's where Tokyo Cabinet comes in. It's a much more primitive system than MySQL, just a key->value store rather than a relational database. That means I'll have to implement things like sorting and grouping in the client code, but my key requirement is that it fetches modest numbers of rows from massive datasets very quickly. The performance numbers quoted are very impressive, with millions of fetches and inserts possible per second, so I'm evaluating it now.
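To make that trade-off concrete, here's a minimal sketch of what "sorting and grouping in the client" can look like once the rows have been fetched. The sample data and the key layout (message id as key, sender as value) are invented for illustration and are not my real schema; the function name group_and_sort is just a placeholder.

```shell
#!/bin/sh
# Hypothetical rows as they might come back from a key->value store:
# key (message id) <TAB> value (sender). The data is made up.
rows=$(printf 'msg1\talice\nmsg2\tbob\nmsg3\talice\nmsg4\tcarol\nmsg5\talice\nmsg6\tbob\n')

# Rough client-side equivalent of:
#   SELECT sender, COUNT(*) ... GROUP BY sender ORDER BY COUNT(*) DESC
group_and_sort() {
  # Count rows per sender (field 2), then sort by count descending,
  # breaking ties alphabetically by sender.
  awk -F '\t' '{ count[$2]++ } END { for (s in count) print count[s] "\t" s }' |
    sort -t "$(printf '\t')" -k1,1nr -k2,2
}

printf '%s\n' "$rows" | group_and_sort
```

With the sample rows above this prints alice with 3 messages, then bob with 2, then carol with 1. The point is only that the work MySQL did with GROUP BY and ORDER BY now happens after the fetch, on a small set of rows.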
If you're interested in trying it yourself, first read this article on the background behind Tokyo Cabinet and Tyrant, and then go through the 30-slide introduction by the author. The main documentation is at http://tokyocabinet.sourceforge.net/spex-en.html and http://tokyocabinet.sourceforge.net/tyrantdoc/, and it's pretty good, though at first the number of functions and their naming is overwhelming (e.g. tcadbputkeep2!). They do talk you through the installation process, but it's a bit scattered, so here are the steps I went through to get it running on my Red Hat Fedora 8 system:
To set up the underlying Tokyo Cabinet database engine:
curl "http://tokyocabinet.sourceforge.net/tokyocabinet-1.4.9.tar.gz" > tokyocabinet-1.4.9.tar.gz
tar -xzf tokyocabinet-1.4.9.tar.gz
cd tokyocabinet-1.4.9
yum install zlib-devel (as root)
yum install bzip2-devel (as root)
./configure
make
make check (awesomely geeky text scrolls past for 10 mins)
make install (as root)
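Once the install finishes, a quick smoke test can confirm the command-line tools are usable. This is only a sketch: it assumes the tchmgr utility that ships with Tokyo Cabinet ended up on your PATH (typically under /usr/local/bin), and the file name casket.tch, the key and the value are arbitrary.

```shell
#!/bin/sh
# Smoke test for a fresh Tokyo Cabinet install, assuming tchmgr is on the PATH.
smoke_test() {
  if ! command -v tchmgr >/dev/null 2>&1; then
    echo "tchmgr not found; check that /usr/local/bin is on your PATH"
    return 1
  fi
  db="$(mktemp -d)/casket.tch"     # throwaway hash database file
  tchmgr create "$db" &&           # create an empty hash database
    tchmgr put "$db" hello world &&  # store the value "world" under key "hello"
    tchmgr get "$db" hello           # read it back; should print the stored value
}

smoke_test || echo "smoke test did not pass"
```

If the last command echoes back the value you stored, the engine is working and you can move on to the server.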
To install the Tokyo Tyrant server that provides remote access to the database:
curl "http://tokyocabinet.sourceforge.net/tyrantpkg/tokyotyrant-1.1.16.tar.gz" > tokyotyrant-1.1.16.tar.gz
tar -xzf tokyotyrant-1.1.16.tar.gz
cd tokyotyrant-1.1.16
./configure
make
make install (as root)
ttserver (to check it installed ok)
Add '/usr/local/sbin/ttservctl start' to the end of the file
You should now have a server running. To test it out, push a value into the store and then retrieve it using the HTTP interface: