I spent the first half of my career chained to massive desktop machines, and I was so happy when I was finally able to completely switch to developing on my laptop. Once Amazon EC2 came along, and I could tap crazily-big servers on demand over a network, I never imagined I’d need to have a big box under my desk again. I’ve been surprised to discover there’s at least one niche that the cloud/laptop combination can’t fill though – heavy GPU computation.
I’ve found convolutional neural networks to be incredibly effective for image recognition, but training them can take days or weeks. It has become a major bottleneck in our development, and unfortunately Amazon’s GPU offerings are pretty sparse and expensive. The CG1 instances are based on 2010-era Nvidia chips, so they aren’t as speedy as they could be, and the G2 instances have newer GPUs but are optimized for gaming and video applications rather than numeric processing. Since the CG1s are $1,500 a month and slow, I was surprised to find it made sense to get an actual physical machine.
I spent some time researching what I needed and discovered that the official compute-focused Nvidia cards are painfully expensive. Happily, some high-end consumer cards are known for giving excellent numerical performance for a lot less. The main difference is a lack of error correction on their memory, and fortunately my calculations are robust to intermittent glitches.
I settled on the Nvidia Titan graphics card, and set out to find someone to build a machine around one. It’s been well over a decade since I built my last PC, so I knew I needed professional help. I chose Xidax, since they were well-reviewed and had a friendly process for setting up all the custom components I needed. The strangest part was that I found myself in an unfamiliar world of high-end gaming, with some impressive options for strange case lights and other gizmos I wasn’t expecting. I didn’t pick any of the custom effects, but you can see in the picture I still ended up with quite a pretty beast! During the build process they even emailed me progress photos, which was a neat touch.
They did a good job letting me know how the setup was going, and seemed to do a fair amount of pre-shipment testing, which was reassuring. The machine came within a week, and then the OS setup fun began. My first job was installing Ubuntu, since all of my software is Unix-based. I was hoping to keep Windows on there as an alternative, but the combination of partitions and a new-fangled EFI BIOS thwarted me. I currently have a machine that boots into the GRUB rescue prompt. If I type exit, then choose Ubuntu recovery mode, and then pick ‘Continue as normal’ on the recovery screen, I can finally get into X windows. It’s not the most stable setup, and I’m hoping the new hard drive in the front of the picture will give me another drive to boot from, but it works! I’m able to run my training over twice as fast as I could on EC2, which makes a massive difference for development. I did have to do some wacky coolbits craziness to get the GPU fan to run above 55% speed, but with that in place, processing cycles that took 35 seconds on the Amazon instances are down to only 14 seconds.
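To put a number on "over twice as fast", here is a minimal sketch of the arithmetic using the two cycle times above. The function and variable names are purely illustrative, and it assumes the 35 second and 14 second figures cover the same unit of work on both machines.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the new timing is than the old one."""
    return before_seconds / after_seconds


# Cycle times quoted in the post (assumed to measure the same workload).
ec2_cycle_seconds = 35.0
local_cycle_seconds = 14.0

ratio = speedup(ec2_cycle_seconds, local_cycle_seconds)

# Fraction of each cycle's wall-clock time that is no longer spent waiting.
time_saved_fraction = 1 - local_cycle_seconds / ec2_cycle_seconds

print(f"Speedup: {ratio:.1f}x")
print(f"Time saved per cycle: {time_saved_fraction:.0%}")
```

In other words, a training run that would have taken a full day on the EC2 instance should finish in well under half that time on the local box, provided the per-cycle numbers hold across the whole run.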