“Lockdown in Sector 4 (Failure)” – Why relying on Google’s IMAP API is scary

Alertlight
Photo by Illetires

I’ve been downloading email from my Google account onto a local machine to do data analysis, using their IMAP interface. About an hour ago, I suddenly started getting failed connections with the message ‘Lockdown in sector 4 (Failure)’. After a bit of googling, I discovered that’s their whimsical way to indicate that they’ve detected something in your access patterns they dislike, and IMAP access to that account is dead for up to a day.

I’m assuming the algorithm that triggers this is related to the number of downloads you attempt in a given period of time, and they’re trying to prevent overloading of their servers or hacking attempts. It’s unfortunate that the criteria they use are unpublished: my usage today seemed reasonable, and no different from other times I’ve done something similar.
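With the triggers unpublished, about the best a client can do is stay conservative and stop hammering the server once it objects. Here is a minimal Python sketch of that idea, not a tested Gmail client. The delay values are guesses rather than documented limits, matching on the message text is an assumption, and `connect` is a hypothetical callable standing in for whatever code logs in and downloads mail. It backs off on ordinary IMAP errors and gives up straight away on the lockdown message, since retrying a locked account won’t help.

```python
import imaplib
import time

# Message text reported in this post; matching on it is an assumption,
# since Google does not document the wording.
LOCKDOWN_MARKER = "lockdown in sector 4"


def is_lockdown(error):
    """Return True if an IMAP error looks like the lockdown response."""
    return LOCKDOWN_MARKER in str(error).lower()


def backoff_delays(base=60.0, cap=3600.0, attempts=5):
    """Capped exponential delays in seconds (values are illustrative guesses)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def connect_with_backoff(connect, delays=None, sleep=time.sleep):
    """Call the hypothetical connect() until it succeeds or delays run out."""
    if delays is None:
        delays = backoff_delays()
    for delay in delays:
        try:
            return connect()
        except imaplib.IMAP4.error as err:
            if is_lockdown(err):
                raise  # locked out for up to a day; retrying sooner is pointless
            sleep(delay)
    raise RuntimeError("giving up after %d failed attempts" % len(delays))
```

Passing `sleep` in as a parameter keeps the waiting logic easy to exercise without actually pausing for an hour.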

I’d expect Google to have mechanisms like this in place to protect themselves, but it highlights what a scary prospect basing a business on IMAP access is. Any service like Xoopit could trip over Google’s hidden rules, and even if they’ve reverse-engineered their current state, there’s no guarantee that the triggers won’t change. And that’s just assuming that Google don’t want to be evil. If they decide that they don’t like their users handing over their passwords to third parties, it would be easy enough to block those services in just the same way.

The real answer would be a proper Gmail interface for third-party applications, rather than reusing the IMAP transport protocol with its reliance on sharing passwords, but this doesn’t seem like a high priority for Google. Until then, I’m going to keep my main focus on the Exchange world. The APIs may be scary, but at least Microsoft offers unfettered access to the data once you’re installed as an add-on.

Are you building a Leatherman or a Samurai sword?

Samurai
Photo by Simonella_Virus

One of the classic engineer mistakes is adding features to improve the product. Under the strict eye of the designers at Apple, I had to learn that every increase in complexity decreases usability. Greg Niles, the designer on most of the products I worked on, became my test. I’d imagine "What would Greg say?" as I was working out the technical interface, and usually my first cut would produce "That looks like a space shuttle control panel", or "Um, what does ‘epsilon’ mean?" The next step would be ruthlessly removing everything but one or two vital UI elements.

This goes against the engineering grain, because you’re setting limits on what the user can do, when you know the underlying code has all sorts of potential knobs they might like to tweak. It’s also extremely hard to know what you can remove, because you have to bring exactly what the user wants into sharp focus. You can’t just throw a bunch of building blocks at them and expect them to put something together that solves their problem. Instead you have to get inside their head, and deal with their frustratingly messy and contradictory requirements.

I was thinking about this today as I read through the incredibly useful Monitor101 post-mortem (via just about everyone), as well as the GameClay retrospective, and remembering Nick’s look at why Disruptor Monkey failed. Every technical founder wants to build a platform; they want to solve lots of problems all at once in a very general way. I like Roger’s term ‘Science Project’ because it’s the urge to build something that all your fellow hackers will stand around and gaze at in awe. You end up with a million features, which makes it very time-consuming to build, and even when it’s done, the number of different gizmos on your Leatherman scares off potential users. You need to have a strong connection to your actual customers, and be hearing about exactly what they need to do. Then you need to design around that, ruthlessly jettisoning anything that distracts from them achieving their goals.

Samurais may not be able to get stones out of horses’ hooves with the sword you come up with, but they should be very effective at chopping people’s heads off.

Kurt’s Catechism

Questions
Photo by Chotda

I had dinner with Kurt Loheit last night, and as always I left with a lot of mental cogs whirring. I met him years ago through trailwork; his pioneering efforts building bike trails earnt him a place in the Mountain Bike Hall of Fame. He’s also an actual rocket scientist, and has been an internal entrepreneur at a large aeronautics firm for the last 30 years. When we sat down to discuss my new venture I knew he’d have a lot of insights.

He spends a lot of time trying to evaluate engineers’ ideas for funding, working almost as an internal VC, and he shared one of his essential tools with me. It’s a very simple idea: a series of basic questions based on Heilmeier’s Catechism. Answering them forces someone to break their technical focus and think about the bigger picture. The real value is that the answers form a pretty strong overview of the strengths and weaknesses of your idea, so you can evaluate your own proposals before you try to pitch them to a gatekeeper like a VC or someone internal who controls resources. Their simplicity and structure make it a lot harder to get lost in the details of your idea, which is always my temptation as an engineer.

  • What are you trying to do?
  • How does this get done at present and by whom?
  • What are the limitations of the present approaches and what are the metrics?
  • What is new about our approach?
  • What is the key technology or concept that now makes this idea work?
  • Why do we think we can be successful and what are your key assumptions about the external environment?
  • Who is the customer and what is the market for the idea?
  • If we succeed, what difference do we think it will make?
  • How long do we think it will take to develop proof of principle?
  • What are our midterm and final exams to see how we are doing?
  • How much do we think it will cost to develop the concept through proof of principle?
  • (For new ideas before a research project has begun: How much do we think it will cost and how long to develop a seedling (i.e. understand idea, metrics and develop concept briefing)?)

How to convert Microsoft Word, Excel and PDF files to HTML or text in PHP

Metamorphosis
Photo by Liyu15

I need to analyze and display documents attached to emails, and that means converting from common formats like .doc, .xls and .pdf to either plain text or HTML. Thankfully there are several different command-line tools on Linux that do a pretty good job, and then all you need is a bit of PHP duct tape to build your own online document converter, a poor man’s version of Zamzar. Here’s an example running on my test server, with the source code available here. To use it, select an example Word, Excel or PDF document, choose whether you want pretty HTML or processable text, and click Convert File.

If you want to get it running on your own system, here are the directions for Red Hat Fedora Linux, though with some tweaking of the installation steps it should work on most Unices.

First install the tools by running each of these commands:

yum install w3m            # the text-based w3m web browser, useful for converting HTML to text
yum install wv             # the wvWare package that can convert MS Word .doc files
yum install xlhtml         # xlhtml converts Excel files to HTML
yum install poppler-utils  # handles PDF files
yum install ghostscript    # needed for high-quality rendering of PDF files

Once they’re in place, you should just be able to copy over the two PHP files to a folder on your server and get the example running. The rendering isn’t perfect. In particular, the PDF handling has been very problematic: I had to disable all image rendering, and it defaults to a horrid grey background. This might be an issue with using poppler rather than xpdf, so if pretty PDFs are important you might want to experiment with that instead. I’ve also seen some glitches with the spreadsheet rendering, but overall I’ve been very impressed with the results from wvWare and xlhtml. I was also hoping to handle PowerPoint .ppt files, but xlhtml fails with a ‘xlhtml: cole – OLE2 object not found’ error which I haven’t had a chance to debug yet.
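To give a feel for how little glue is involved, here is a rough sketch of the dispatch step. It is written in Python rather than PHP and is not the source code linked above, just an illustration of picking a tool by file extension and optionally piping the HTML through w3m to get text. The function names are made up for this example, and the command-line arguments are the ones I understand these tools to accept, so check them against the man pages for your installed versions.

```python
import subprocess
from pathlib import Path

# Filter that turns HTML on stdin into plain text on stdout.
HTML_TO_TEXT = ["w3m", "-dump", "-T", "text/html"]


def html_command(path):
    """Pick the command that renders a document as HTML on stdout.

    The arguments are assumptions to verify against each tool's man page.
    """
    commands = {
        ".doc": ["wvHtml", path, "-"],  # "-" asks wvHtml to write to stdout
        ".xls": ["xlhtml", path],       # xlhtml prints HTML to stdout
        # -i skips images, -noframes produces a single page
        ".pdf": ["pdftohtml", "-stdout", "-noframes", "-i", path],
    }
    ext = Path(path).suffix.lower()
    if ext not in commands:
        raise ValueError("unsupported file type: %r" % ext)
    return commands[ext]


def convert(path, want_html=True):
    """Return the converted document as bytes (HTML, or text via w3m)."""
    html = subprocess.run(html_command(path), check=True,
                          capture_output=True).stdout
    if want_html:
        return html
    return subprocess.run(HTML_TO_TEXT, input=html, check=True,
                          capture_output=True).stdout
```

Calling `convert("report.doc", want_html=False)` would run wvHtml and then w3m, assuming both are installed and on the path. Passing the arguments as lists rather than building a shell string avoids quoting problems with uploaded filenames.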

Thanks to Phillip Hollenback for his original article on using some of these tools within a mail program; he had some great tips on how to wrestle them into a pipeline.

Park rangers hate you

Goatwarning
Photo by GordMcKenna

I know a lot of park rangers, and they have a really tough job. The pay’s rotten, they spend most of their time picking trash out of pit toilets and acting as cops when people bring their problems with them to the park (most commonly domestic violence and DUI at the campgrounds), and they are at the mercy of a giant bureaucracy. Things would be a lot easier if there weren’t all these damn park visitors getting in the way.

Anyone who’s dedicated to preserving the outdoors would find it a lot easier if it wasn’t for the pesky general public. They pick flowers, drop litter, make noise, cut trails and generally damage the environment. It would be so much simpler to take care of the wilderness if people would stay out.

This means that increasing the number of park visitors is pushed way down the priority list. In fact, while the idea might be paid lip service, any measures that might help usually conflict with other things considered more important. Charging an entrance fee has to be a big psychological barrier, but it’s pretty popular with rangers because it means that they have an excuse to ask for a receipt from anyone causing trouble, and either search or eject them if they didn’t keep it. Publicizing or improving back-country campgrounds to encourage visitors means a lot more maintenance and enforcement work.

Our parks systems have ended up working like a monopoly, where customers are a hindrance, not a priority. Individual rangers are dedicated to encouraging everyone to share their love of the outdoors, but all of their incentives push them away from acting to pull in more people. Environmental organizations are so focused on preservation, they fight against even low-impact recreation. The 1997 Merced flood halved the number of camping spots in Yosemite, but there’s still a battle to build any replacements at all.

This is all part of a slow crisis, where park attendance across the country is dropping overall, and particularly in California, despite yearly fluctuations. This matters because parks all require government money, which means they need popular support. Why should people pay for something they’re never likely to use? During the California budget crisis, the governor planned to close many state parks like Topanga. That was hardly mentioned on the news, and though it was prevented for now, you can bet the lack of a public outcry will affect politicians’ calculations in the future.

What can we do about all this? I think there has to be a grass-roots effort to let people know what’s available, reignite their interest and boost attendance. I’m trying to do my bit by documenting local camping, most of which is not covered by the agencies’ websites. I’m also trying to be a voice for more low-impact recreation at organizations I’m involved in like the Santa Monica Mountains Trails Council.

What does the cloud mean for email?

Nimbus
Photo by David AG Wilson

There are two big reasons email hasn’t been evolving like the web: the data’s a lot harder to get hold of, and it’s really hard to crunch it once you have it. The web relies on the cloud to solve the second part, and I’m convinced that email will need something similar to move forward.

Almost all the exciting tools in the email world are client plugins, because that’s the easiest and most secure place to grab the data. The big drawback is that client CPU cycles and disk space are scarce resources. You can get a 1 terabyte disk for less than $200, but any client application that used more than a few hundred megabytes would be considered ill-behaved, even though the monetary cost of that space is currently just a few cents. This is because you can’t rely on that space being available: it may be an older machine or a laptop, it may be full of other data files, or there may be any of a million other reasons that make relying on large client disk usage unpopular with users. The same holds true for CPU cycles: anything that slows down Outlook or increases the risk of a crash will be shunned.

Cloud computing makes it possible to take advantage of cheap storage to improve the user experience. For example, take a heavy email user and assume that an average message contains 10,000 characters, she gets 1,000 messages a day, and there are about 2,000 days of email in her account. That’s around 20 GB of storage, or $4.00 worth. Imagine creating a Google-style index of every word in that email, so she can instantly search it all. Even if that quadrupled the storage size to 80 GB, that’s still only $16 of storage, with a massive user benefit.
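The back-of-envelope numbers are easy to check. This short Python sketch just redoes the arithmetic, assuming one byte per character, decimal gigabytes, and the $200 per terabyte disk price mentioned earlier as an upper bound. At that price the raw mail costs $4 to store and a four-times-larger index costs $16.

```python
CHARS_PER_MESSAGE = 10_000
MESSAGES_PER_DAY = 1_000
DAYS_OF_EMAIL = 2_000
DOLLARS_PER_TERABYTE = 200  # the "less than $200" disk price, used as a ceiling
GB_PER_TERABYTE = 1_000     # decimal units, as disk vendors count them


def mailbox_gigabytes():
    """Raw mailbox size in GB, assuming one byte per character."""
    total_bytes = CHARS_PER_MESSAGE * MESSAGES_PER_DAY * DAYS_OF_EMAIL
    return total_bytes / 1_000_000_000


def storage_cost(gigabytes):
    """Dollar cost of storing the given number of gigabytes."""
    return gigabytes * DOLLARS_PER_TERABYTE / GB_PER_TERABYTE


raw_gb = mailbox_gigabytes()
indexed_gb = raw_gb * 4  # assumed worst case for a full-text index

print(f"raw: {raw_gb:.0f} GB, ${storage_cost(raw_gb):.2f}")              # raw: 20 GB, $4.00
print(f"indexed: {indexed_gb:.0f} GB, ${storage_cost(indexed_gb):.2f}")  # indexed: 80 GB, $16.00
```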

So, if that’s all true, what’s stopping a flood of startups taking advantage of this? In the consumer world Xoopit is doing great work, but they’re having to ask people for their Gmail passwords since there’s no other way of grabbing the data. Without a more official API, it’s a pretty scary proposition to build a business around. On the enterprise side, there’s an almost complete lack of overlap between the people who know how to interface with Exchange, and those who want to do crazy new startups.

What’s the answer then? That’s what I’m working on, so watch this space.

Camping on Anacapa Island

Lizladder

Liz, two friends and I spent Saturday night camping on Anacapa Island, just off the LA coast. It’s a small place, and I was pretty nervous we’d be bored out of our minds. When our friends phoned to book their boat trip with Island Packers, the lady taking their order was incredulous that they wanted to camp there: "Have you been before? People don’t go back there twice." There’s less than a mile of trails over the whole island, no trees or shrubs, just thousands of seagulls nesting, with all the noise and mess you’d expect.

Seagulls

The boat ride across took less than an hour, with the local dolphins putting on a great show. As we got closer, we could see the sheer cliffs that completely surround the island, and the famous rock arch on the tip.

Archrock

The trails may be short, but getting onto the island from the boat is a workout, with 150 stairs from the dock straight up the cliff. There’s no water at the campground, so our rucksacks were heavy as we packed all we needed up the iron stairwell.

Stairs

After a quick talk from the ranger, we headed the half-mile to the campground. There were no assigned spots, though you do need to make reservations ahead of time. There are only 7 spots in the center of the island. Be careful: there are a couple of places with numbered posts on the way that look like camping spots, but we saw rangers move two different groups on from them while we were there. Here are the maps and descriptions of the different spots. We took number 7, without realizing it was designed for a single tent. We fit both of our 2-man tents in, but it was very cosy.

Camptext
Campmap

Campground

There isn’t much privacy at the campground, since there’s no shelter, but spots 6 and 7 were a little bit tucked away. They were also close to the cliff-top overlooking a view down the whole island, and into the sea below.

Clifftop

Once we’d got our tents set up and settled in, the moment of truth arrived and we had to find something to do. The loop around the clifftops kept leading to some amazing overlooks, so we spent several hours staring down through the crystal-clear ocean, watching the sea lions doing loops around the scuba divers exploring the reefs and kelp forests. We were all jealous of the divers, and I could see why Jacques Cousteau considered the Channel Islands the best temperate diving in the world. We wanted to take a swim too, but with sheer cliffs all around, it seemed impossible. Luckily Liz came to the rescue, and figured out we could take a dip off the dock. You can see her sliding into the water at the top of the post. You’re allowed to swim here, but you’ll need to be very careful since there’s not much margin for error. The conditions were perfect for the four of us, with 64 degree water, flat as a lake with amazing visibility.

Anacapaswim

After the swim, we were ready for bed. The ranger had warned us that the gulls and foghorn would keep us awake, but we slept well until the dawn chorus kicked in. Despite all my misgivings, it turned out to be a great trip for all of us. The best word for the place is ‘wild’: you know you’re on the edge of the world, but there’s so much life all around you. I wouldn’t want to live on Anacapa, but it’s a great place for an adventure.