PostPath’s drop-in replacement for Exchange

Photo by Rogilde

Everyone knows there has to be a better hub for your mail than the current Exchange, but you can’t adopt a system like Zimbra or Open-Xchange without making changes to all your desktop and mobile clients. (Correction: Gray pointed out that ActiveSync means no mobile headache!) So I was excited to see a server solution that aims to leave the rest of your communications world untouched: PostPath. They emulate a whole series of Microsoft’s proprietary communication APIs, and they’ve done it the hard way, sniffing network packets, since they were tackling this before the latest Open Specification releases.

They seem to have done an impressive job emulating Exchange, with a good crop of real-world deployments showing that it’s at production quality. There are some things I’d like to figure out, like how much of Exchange you need to keep around for Active Directory and the management tools, but I’m looking forward to downloading the evaluation version and seeing for myself.

Should startups care about security?

Photo by MSH*

Is worrying about security early in your startup just like worrying about scaling: a distraction that will eat up valuable time and increase the chances you’ll fail? That’s something that’s on my mind as I watch the looming issue of Facebook app security. Once there’s a richer set of targets like the PayPal app, there will be a lot more malicious people trying to exploit any holes, and it’s practically impossible to prevent cross-site scripting. It feels like the period when every engineer knew that Windows was horribly insecure, but there hadn’t been enough of a user impact for anyone to care.

That analogy is interesting because Microsoft crushed the competition for over a decade, thanks in part to their fast development process, enabled by reusing old, insecure components as a foundation. It’s a classic worse-is-better scenario, where the unobserved lack of security meant less to customers than improved features. The very long-term outcome wasn’t so good: the lack of security mauled their reputation and opened the door to a lot more competitors. But their strategy still created an immense amount of value.

If you could go back in time to the early ’90s, I think it would have been possible to avoid a lot of the security holes with some comparatively simple changes to the code that was written then. From the 386 onwards, there was enough processor support to start partitioning user-level code from the OS, but there was never a strictly enforced security model.

I’ve tried to learn from that in my own work. Security planning can easily turn into a tar-pit of architecture astronautics, but it is possible to have some simple principles that don’t get in the way. Most of the exploits that The Harmony Guy and others uncover with Facebook could be fixed if every operation required an authentication token, like a session ID. Make sure you escape all your user input before including it in an SQL query. Drop a feature or technology if it carries a high security risk. There’s no such thing as absolute security, but a little bit of paranoia at the outset will go a long way toward safeguarding your customers’ information. Know what the vulnerable areas outside your control are, and make sure they’re on a list somewhere, for when you’re rich and famous enough to get something done about them.
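The SQL principle is easiest to follow if you never build queries through string concatenation at all, and let a prepared statement do the escaping for you. Here’s a minimal PHP sketch using PDO against an in-memory SQLite database; the table, columns, and attack string are all just illustrative:

```php
<?php
// In-memory SQLite for illustration; swap the DSN for your real
// MySQL/Postgres connection string in production.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE users (name TEXT, email TEXT)');
$db->exec("INSERT INTO users VALUES ('alice', 'alice@example.com')");

// Untrusted input straight from a request. The embedded quote would
// break (or exploit) a query built by string concatenation.
$name = "alice'; DROP TABLE users; --";

// The prepared statement treats the input purely as data, so the
// injection attempt simply matches no rows.
$stmt = $db->prepare('SELECT email FROM users WHERE name = ?');
$stmt->execute([$name]);
var_dump($stmt->fetchColumn()); // bool(false) - no such user

// A legitimate lookup works as expected.
$stmt->execute(['alice']);
echo $stmt->fetchColumn() . "\n"; // alice@example.com
```

The same pattern applies whatever database you’re on: if the only queries in your codebase are prepared statements with bound parameters, a whole class of injection holes disappears without any per-query vigilance.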

Now that Facebook’s in that position, I really hope they’re lobbying hard for a secure foundation for browser-based apps. For example, an expanded and standardized version of the IE-only security="restricted" attribute could prevent a script in one element from touching anything outside itself in the document. They’re trying to build a sandbox through script-scrubbing, but the only sure-fire way to do that is within the browser. They have a window now before they start suffering from bad publicity; I hope they’re able to use it.

LA’s secret nuclear meltdown

Photo by Michael Helleman

The world’s first nuclear meltdown happened 30 miles from downtown Los Angeles, and released hundreds of times as much radiation as Three Mile Island. And I’m betting you’ve never heard of it.

I was at the SMMTC board meeting on Thursday night, and two of the parks representatives were arguing about whether Runkle Canyon was owned by the National Park Service or another agency. I pulled out my iPhone to check it out on Google, but was surprised to see that most of the links mentioned a nuclear disaster. I’ve lived in Simi Valley for 5 years, Runkle Canyon is only a few miles from my house, and that was news to me.

Digging in deeper, I discovered that the world’s first commercial nuclear reactor was opened at Rocketdyne’s Santa Susana Laboratory in 1957, powering 1100 homes in nearby Moorpark. As an experimental facility, it had no concrete containment shell, and it was using the highly reactive element sodium as a cooling agent, rather than water. In 1959, the cooling system failed, 13 out of 43 fuel rods melted, and a large amount of radioactive gas was leaked into the air. No measurements were taken at the time, but the Santa Susana Field Laboratory Advisory Panel report estimates that the total radiation released could have been up to 500 times that of Three Mile Island.

For 20 years the accident was kept secret, with a small report stating that only one fuel rod had melted and no radiation was released. In 1979 a UCLA professor uncovered documents showing the true extent of the accident, and since then there’s been a struggle to reconstruct exactly how much contamination there was, and how to clean it up. Home developers were recently pushing to buy the site from Boeing and build a residential housing development! Luckily there was a recent agreement to keep the area as open space, as part of a new state park.

I’m still happy here in Simi Valley, but now I’ll be keeping a careful count to catch any newly sprouted fingers or toes. For more information on the accident itself, check out this History Channel excerpt:

How to improve Gmail

Photo by Michael Bonnet Jr

I’m a big fan of Google Mail; I’ve moved most of my family over to it, and use it for my own accounts. They offer a lot of tools for searching and filtering your email, their browser interface is top-notch with advanced support for things like hot-keys, and they support open protocols like IMAP so you can easily connect non-web clients. I’m eagerly anticipating the day they apply the same smarts they use to process web data to all of that information in my inbox, but the service seems to have stood still since it launched.

My hopes were raised when I saw the launch of Gmail Labs, but I was disappointed when I looked through the experimental tools available. They were all fairly minor UI tweaks, things like removing the chat sidebar or the unread item counts. I was looking forward to seeing some funky analytical magic, things like Xoopit’s innovative attachment display, or extracting your social network from your mail history.

I’m not sure why Google is being so slow to innovate with Gmail. Part of it may be technical: that sort of analytics requires a lot of database work, which may be too resource-intensive and scary for the spare-time Labs model to produce results. They may also be worried about the damage to their reputation if they’re seen to be data-mining people’s emails. It makes sense for them to focus on attracting users to their service; if anything like Xoopit does become popular, they can imitate it and rely on their existing customer base as a big barrier to any smaller competitors. Hotmail and Yahoo could pose a threat thanks to their larger user counts, but they seem even less likely to do something radical and new.

To move forward, I agree with Marshall Kirkpatrick that Gmail should offer an API for email content, one that doesn’t require users to hand over their passwords like IMAP. Imagine all the fun stuff that a Facebook-style plugin API could offer to mail users, operating securely within a Google sandbox to limit the malicious possibilities. If the reputation risks of that are too scary, they could make progress with an internal push to do something similar, encouraging their mail developers to move beyond incremental improvements and really sink their teeth into some red-meat innovation.

I think the biggest barrier is the perception of email as boring, which leads to few resources being devoted to it, which leads to few innovations, which makes it appear boring. Hopefully services like Xoopit and experiments like Mail Trends will break that cycle by opening people’s eyes to the possibilities.

Who owns implicit data?

Photo by Kurtz

Barney Moran recently posted a comment expressing his concerns on Lijit’s use of the statistical data it could gather from blogs that installed its widget. I checked out Lijit’s privacy policy and it’s pretty much what I expected; they’ll guard any personal information carefully, but reserve the right to do what they want with anonymous information on usage patterns. They’ve also pledged to stick to the Attention Trust’s standards for that usage data.

Barney is organizing a publishers union of bloggers, and he seems not so much concerned about privacy as about who’s profiting from the data. I’m biased, since I’m a developer building tools to do interesting things with implicit data, but I assumed the trade I’m making when I install a widget is that the developers will get some value from the statistics, and I’ll get some extra functionality for my blog. Since they’re not putting ads in their content, the only other revenue stream I can see is aggregating my data with lots of other people’s, generating useful information and selling it on. This doesn’t hurt me, since I’m not losing anything, so it feels like a win-win situation.

Josh Kopelman captured this idea in Converting data exhaust into data value. There’s a lot of ‘wasted’ data flying around the web that nobody’s doing anything with; it seems like progress to capture some of it and turn it into insights. The trouble is, I would say that. I’m a wide-eyed optimist who’s excited about the positive possibilities. Barney represents a strand of thinking that we’re not really running into yet, because no implicit data gathering service has a high enough profile to register with the public at large. I’d expect there will be more people asking questions as we get more successful and better known.

So far most of the bad publicity in this area has come when people feel their privacy has been violated, as with Facebook’s Beacon program, but the ownership question hasn’t really come up. After all, we expect big companies like Google and Amazon to make use of their usage statistics to do things like offer recommendations, so why should an external service be more controversial? I’m not sure it will turn out to be an issue, but the Beacon controversy shows we need to have an answer ready on who we think owns that information, and explain the bargain that people are making when they use our services.

Animate your social graph with SoNIA


I was lucky enough to get some time with Professor Peter Mucha of UNC this week. He’s a goldmine of information on the academic side of network visualization and analysis, and one of the projects he clued me into was SoNIA.

One of the most exciting areas for visualization is animating over time. It’s an incredibly powerful way to demonstrate changes in an easy-to-understand way, but it’s also very hard to do. Building the tools is tough because dealing with time massively multiplies the amount of computation you need to do, and it’s a very tricky user-interface challenge too. SoNIA is an ambitious attempt to provide an open-source professional tool for animating network data.

It’s sponsored by Stanford, and developed by Skye Bender-deMoll and Dan McFarland. It’s designed to take data files that describe graphs at different states in time, and give you the control to lay out and animate those networks. It’s already been used for an impressive series of studies; check out the included movies if you want to get an idea of what it’s capable of. One of the best known, illustrated above, used the software to demonstrate how your social network and obesity were correlated.

It’s freely downloadable if you want to give it a try yourself, and I’d recommend starting with Dan’s quick-start guide to understand how to use it. It offers a lot of control over the underlying algorithms, but don’t be daunted by the space-shuttle-control-panel style of the UI; it’s possible to create some interesting results using many of the default settings. I’m looking forward to applying this to some of the communication data I’m generating from email; animation is a great way to present the time data inherent in that information.

What does the Anglo-Saxon Chronicle mean for email?

Photo by Vlasta2

Towards the end of the Dark Ages in England, monks collected the known history of the world into a manuscript, and then updated it with short notes every year. It’s hard to truly know what the motivations of the people who wrote the Anglo-Saxon Chronicle were, but it’s fascinating to read through and think about what drove them to write it.

There’s a strong human urge to write about your world for an audience, to make a connection with other people, to understand it better by organizing your thoughts on paper, to grab a little bit of permanence in posterity, to influence events and to spread the word about good or bad news. It’s always been a minority activity, but as literacy and free time spread, more and more people kept first diaries, then blogs and other time-based chronicles like Twitter or Tumblr.

Only a tiny fraction of people online keep a blog or tweet, but almost everyone creates content that would attract some audience if it was shared; it’s just locked in emails. The writers of the Chronicle had to overcome massive obstacles to see their work distributed; now we’ve got a massive selection of free and easy services to do the job, so why is there comparatively little take-up? Life-streaming services like Friendfeed and Facebook’s own feed are the closest to mass-market self-publishing we’ve got, but even those don’t have much of what we write every day.

Part of the reason is the comfort of privacy. Emails can go to a small trusted set of people, and you can have confidence that your frankness won’t come back to haunt you. Blogs are the other end of the spectrum, absolutely anybody can see what you’re saying. Social networks and services like Twitter are somewhere in-between, with the idea of a limited set of friends who can see your content, but without the fine-grained control of email.

I have a vision of being able to right-click in your inbox, and publish the content of a message, either to one of several groups you’ve set up (close work colleagues, the whole company, friends), or to the original recipients, or to the whole world. A lot of my email could be shared; there are technical explanations, status updates and interesting discussion threads that would be safe and useful to make available. Imagine a company where that sort of internal publication was routine: you’d have a valuable resource to search for solutions to so many problems. The really appealing part for me is that it doesn’t require anyone to change their routine; they’ve already got that content, they just need to unlock it.

The results sure wouldn’t be as polished or organized as most blog posts, but getting a lot more people publishing by lowering the barrier to entry would unlock so much juicy information that’s currently gathering dust. People have shown that they’re a lot more willing to post on web forums and comment on blogs than they are to create their own formal posts. I think the future has to be in gathering together all those fragments into a centralized identity (something folks like Intense Debate and Lijit have recognized) but what’s missing is any way to make email content be a part of that conglomeration.

Easy user authentication for Windows with PHP

Photo by Richard Parmiter

The internet is slowly groping towards a single user identity system through the OpenID initiative, but one of the nice things about working inside a corporate firewall is that there’s already a directory of user names and passwords. In the dominant Microsoft world, you rely on Active Directory to keep track of all that information. The ‘Active’ prefix usually strikes fear into anyone integrating non-MS technology into a Windows world, since it often translates to ‘proprietary’, but they’ve actually done a really good job of making the directory information available through the LDAP open standard.

If you want to try converting your PHP-based internet app to intranet authentication, check out this tutorial on using LDAP from PHP with an Exchange server. If you’re interested in the details of using LDAP with PHP in general, things like how to install the LDAP module if it isn’t there by default on your PHP installation, check out this two-part guide.
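As a rough sketch of what that looks like in code, here’s a minimal PHP authentication check using the LDAP extension. The server name and domain are hypothetical placeholders; one handy detail is that Active Directory accepts the userPrincipalName (user@domain) as a bind identity, so you don’t need to build a full distinguished name:

```php
<?php
// Build the Active Directory bind identity from a username and domain.
function ad_bind_dn($username, $domain) {
    return $username . '@' . $domain;
}

// Returns true if the directory accepts the credentials, false otherwise.
function authenticate($username, $password, $ldapHost, $domain) {
    // Never bind with an empty password: many LDAP servers treat that
    // as an anonymous bind and report success.
    if ($password === '') {
        return false;
    }

    $conn = ldap_connect($ldapHost);
    if ($conn === false) {
        return false;
    }

    // Active Directory requires LDAP protocol version 3, and chasing
    // referrals tends to cause spurious failures against AD.
    ldap_set_option($conn, LDAP_OPT_PROTOCOL_VERSION, 3);
    ldap_set_option($conn, LDAP_OPT_REFERRALS, 0);

    // A successful bind means the user's password checked out.
    $ok = @ldap_bind($conn, ad_bind_dn($username, $domain), $password);
    ldap_unbind($conn);
    return $ok;
}

// Hypothetical domain controller - replace with your own.
// $loggedIn = authenticate($_POST['user'], $_POST['pass'],
//                          'dc01.example.local', 'example.local');
```

The tutorials linked above cover the extras you’d want in practice, like searching the directory for group memberships after the bind succeeds.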

Get a real-time view of the Apache error log

Photo by CR

I spend a lot of time developing on a remote server through an SSH connection, and I’ve found it tough to keep an eye on the error log file. Typically I’ve been running the Unix tail command to look at the last 10 lines, but this only gives you a snapshot of the errors at that instant. If I wanted to see more, I had to run tail again. I knew there had to be a better way, something I was missing, since I couldn’t imagine Unix wizards putting up with this. Luckily I was right: if you pass the -f or --follow option to tail, it will continuously update, so you can see errors in real time as the lines are written to the log file. This is perfect; I can see at a glance what’s going on.

To do the same, just open an SSH session to your remote server, and then type in the following command:

tail -f /var/log/httpd/error_log

The log file location varies on different flavors of Linux, and if you have access problems, make sure the logged-in user has high enough permissions to see it.

How to find new ways to visualize information


It’s hard to imagine a more codified visualization than a calendar. You have a grid of cells, with each row representing a week, and that’s pretty much it. That’s what my good friend Kent Oberheu set out to change. Over the past 5 years he’s produced 60 monthly calendars that show time contorting in strange paths, all embedded in his art. The regularity of a traditional calendar is lost, instead you really see the flow of time. It helps me remember that every day is unique, just like his visualizations.

I don’t think these are going to replace standard calendars, but it’s a demonstration that looking at even the most mundane information in unusual ways can start you thinking in new directions. Kent’s better known as Semafore in the design world, and after years as a designer with Apple he’s just moved to a new position at Industrial Light and Magic. All the time I’ve known him he’s been pushing into new visual territory, finding magic combinations that tickle your aesthetic nerve and manage to get something across at the same time. He’s the first person I turn to when I need to talk about designing visualizations, and the combination of my engineering skills and his design direction has worked very well.

As I’ve gone deeper into the subterranean world of information that’s held on companies’ Exchange servers, it’s obvious that part of the toolkit for navigating it has to be new ways of looking at that data. Some of that can be borrowed from the recent explosion of web visualizations, such as animated tag clouds for your email content, but some of the challenges are unique. How can you do a decent view of different versions of your attachments over time, for example?

That’s where the Information Aesthetics blog comes in. It was the first place Kent sent me when I started to ask him for inspiration, and it’s full of the latest and most beautiful visualizations. Some of my recent finds from there include wordle, a Java applet that lets you create very good-looking word clouds, the Information Design Patterns Cookbook and the Mount Fear physical sculptures representing London crime statistics. I don’t know if I’m going to cut up cardboard to build an art piece from the Enron emails, but all of this starts trains of thought on how to adapt these innovations to my problem space. If you’re looking for inspiration too, I’d highly recommend subscribing.