How to view the MAPI/RPC documentation online

No connection to the post for once, I just can't resist an angry lemur

Photo by Law Keven

Microsoft recently released the documentation for the secret protocol Outlook uses to communicate with Exchange. Yay! Unfortunately they released it as a large number of PDFs in a zip file. Boo!

I've been using them for my work on Mailana, but having to use local file searching or manual browsing through all these documents rather than my usual web search has slowed me down. Today I finally bit the bullet, ran them through a PDF batch converter to get HTML, and put them online at http://web.mailana.com/exchangedocs

In a few days all that lovely information should show up in Google searches, and the site search should work too. Thanks to Darren Hoyt for the simple site-search HTML and PDF Bean for the conversion tools.

It's a shame that Microsoft's own documentation is so unfriendly to the web. They often change links without implementing forwarding, so old blog or forum posts frequently lead nowhere, and some documentation like this is only available as unsearchable downloads. Of course Apple can be worse, requiring logins before you can get at a lot of the resources, and its mailing list search tools are worthy of a Geocities page ten years ago. It's funny how lost I feel when I'm researching an area that's invisible to Google; it really has become half of my brain.

What makes a great salesman?

Spiralslicer

Photo by The Life of Bryan

One of the things I really suck at is selling. Part of it is growing up in Britain with the belief that it's about tricking people into buying things they don't need. Being a professional engineer didn't help either. Most programmers are baffled that customers don't simply get that their product is better. Why do they need some highly-paid guy in a suit to get involved?

As I got older, I realized that every job is a sales job. To get anything done, you need to persuade a whole bunch of internal and external people to help. Now I'm running a startup, and that's all about selling the idea to everyone I need to deal with: investors, business partners, employees and customers.

I've looked around for role models. My favorite so far is an infomercial host called Ron Popeil. I can imagine my British friends cringing because he's almost a caricature, but this profile by Malcolm Gladwell opened my mind to both how much dedication he has, and how effective he's been. So, what are his secrets?

Feedback and measurement

He's from a family with a long tradition of selling on street corners. At the end of an afternoon, they'd know exactly how much they'd brought in. That gave them a guide they could use to figure out what worked and what didn't. The infomercials followed the same principles, with real-time graphs showing how many people were calling and ordering.

This might sound obvious, but as Dim Bulb repeatedly demonstrates, most TV advertising is driven by the intangible idea of brand, with no idea what's actually working or failing. It's like the difference between the Greek philosophers building elaborate theories on how the universe works, and experimental science that's able to test ideas.

I've become a fanatic on trying to measure everything I can about my communications, keeping track of who I've talked to and when, measuring which pages and posts get visitors, using online ad experiments to gather survey data. Without that foundation, I'll never be able to improve.

Involvement in design

Ron actually built and designed his products in his kitchen. The goal was to create something that would sell itself. Too often, there's a distance between sales teams and the people who build the products. They might have some voice in the planning stages, but they're shut out during the implementation and expected to take whatever the result is and sell it.

Since I keep swapping hats between selling and building, you'd think I wouldn't have this problem. It's funny though, I often get caught up in the geeky coolness of the technology, and lose sight of what people are willing to pay for. The lesson I took from Ron's example here was to keep asking myself what problem every feature I'm working on is actually solving.

Product focus

In the infomercials, the camera quickly focuses on the gadget, and stays there. It's not about the personality of the salesman, it's all about what the device can do. There's an anecdote in Gladwell's story about a showdown between a few salesmen at a trade show. Frosty Wilson was charming and persuasive, everything you'd imagine a great salesman should be, but Ron and his partner both sold twice as much by making the product the star.

In my job, I've learnt it's best to cut to the demo as quickly as possible, and then let people try the prototype themselves. Nobody wants to sit and listen to a lecture, it's much more compelling to see what it can do rather than be told.

Fervent belief

Ron really, truly believed that the products he was selling would make his customers' lives better. He sounds like Steve Jobs when he's hammering away at the smallest details of every design, making sure that everyone gets an experience they'll be delighted with. It's not just about getting their money, it's his purpose.

Luckily I am insanely convinced that what I'm building will change the way we work. I've found I'm most effective when I can just informally rant about all the amazing possibilities rather than sticking to a script.

Your real social network

Foxwhisperer

Photo by Law Keven

Stowe Boyd covered an interesting paper on social networks and concluded "the apparent, superficial social network based on following and followers conceals a deeper, sparser social network". Every current service has an incredibly primitive representation of your relationships. You're either friends with somebody, or you're not. Here's what's missing:

Strength. There's no way to specify how close you are to somebody else.
Time. Is the friendship long-lasting? Have you talked recently?
Context. What other friends is this friend close to? Which circles do they move in?

It's well known that you can use communication data to answer these questions. This implicit approach is better than trying to get people to enter this information manually because:

Convenience. Nobody wants to spend time doing data entry and house-keeping on their network. Doing it automatically solves that problem.
Reliability. You can objectively measure how many emails somebody has sent you, and how many you've returned to them. This removes the subjective element that creeps in if you're asked to rate the strength of a relationship on an arbitrary scale. It also removes the temptation to exaggerate your closeness to someone influential.

So why hasn't anyone done this? There are massive technical barriers to overcome before you can access large stores of email, and big privacy issues. I'm convinced they can be overcome, and that's what I'm doing with Mailana. If you want to see the sort of detailed social graph I'm talking about, Boulder Twits is using the same backend as my email analysis system.

Why I love long pointless books

Eschercrossing

Photo by Regolare

I recently finished Infinite Jest. It's over 1,000 pages, and has no real arc or resolution, but I enjoyed it immensely. Before that I completed the 12 volumes of A Dance to the Music of Time, another sprawling epic without a conventional plot. Apart from literary masochism, or value for money (I picked up Jest second-hand for 50 cents), why read these monsters?

I realized I'm drawn to them because they feel a lot more real than most other fiction. The characters aren't driven to make decisions that the plot requires. Instead they're set loose on the stage, free to behave randomly, like people. That means you never reach a satisfying conclusion, but then my own life has never had clear-cut resolutions either.

I've also been on a Dickens streak recently, but that's mostly been motivated by the wonderful background characters that weave in and out of the stories. His protagonists and villains are clearly being maneuvered according to the author's plan, making them stilted and artificial. He's not constrained when he's sketching the unimportant people, so they can act like human beings. Little Nell is an alien, but I believed in The Marchioness and Dick Swiveller, despite their lack of purpose.

These books help remind me life's about the journey, not the destination. If you want to make part of your trip more pleasant, I'd recommend picking a long pointless book as a companion.

Get a personal map of your social network

I've upgraded Boulder Twits so that everyone listed has their own personal map, in addition to the graphs showing the whole community. The maps show who you talk to most on Twitter, organized into groups based on who they talk to. As an example, here's my personal graph. There are some clusters that represent different networks I'm in contact with:

Personalmapboulder

The Boulder folks are mostly in a small, tightly-connected pack on one side. It's almost a mini-version of the full community.

Personalmapapple

Only a few of my Apple colleagues are on Twitter, but they're all pretty interconnected too.

Personalmaplegal

Walter Olson is the founder of the Overlawyered legal blog, and as you can see both Jeff Nolan and I are big fans.

Stay tuned, I'll be using this data to answer some questions like "Who are my friends talking to that I should be following?".
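One possible way to approach that question, sketched in Python purely as an illustration (it is not the Boulder Twits implementation): score each person you don't already talk to by how strongly your existing contacts talk to them. The function name `suggest_follows`, the `strengths` dictionary shape and the scoring rule are all assumptions made for this example.

```python
from collections import Counter

def suggest_follows(strengths, me, limit=5):
    """Rank friends-of-friends for `me`.

    `strengths` maps an unordered pair (a, b) to a connection strength.
    This input shape is an assumption for the sketch.
    """
    # Build a symmetric adjacency map: person -> {neighbour: strength}.
    neighbours = {}
    for (a, b), s in strengths.items():
        neighbours.setdefault(a, {})[b] = s
        neighbours.setdefault(b, {})[a] = s

    mine = neighbours.get(me, {})
    scores = Counter()
    for friend, my_strength in mine.items():
        for other, their_strength in neighbours[friend].items():
            if other != me and other not in mine:
                # A path through a friend is only as strong as its weakest link.
                scores[other] += min(my_strength, their_strength)

    # Highest-scoring strangers first.
    return [name for name, _ in scores.most_common(limit)]
```

Summing over every mutual friend means someone who is close to several of your contacts outranks someone who is close to only one.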

True love and statistics

Mathematicallove

Photo by Keng

I ran an analysis of the most frequent correspondents in the Boulder Twits group, and was very happy to see Gwen Bell and Joel Longtine top of the charts. If you don't know their story, they met through Twitter, and will be getting married soon! It's a wonderful romance, and I was so pleased to see solid mathematical proof of their devotion to each other. My own dear Liz is a statistics major, so I know she'll appreciate it too!

Here's the full top 10, ordered by how many tweets were sent or received by each pair. There's a nice mix of friends and colleagues as well as couples:

  1. gwenbell and jlongtine: 222/291
  2. jennyjenjen and pugofwar: 148/210
  3. abatchelor and bfeld: 113/115
  4. neogia and wittytwit: 91/137
  5. micah and technosailor: 82/59
  6. brianlburns and kohlmannj: 65/54
  7. heathercapri and wittytwit: 63/53
  8. micah and w1redone: 49/74
  9. briandewitt and jlongtine: 50/49
  10. ewu and jenn: 65/48

Congratulations to Gwen and Joel, long may their tweeting continue.

The D part of R&D

Mousetrap

Photo by Unloveable

Build a better mousetrap, and the world will beat a path to your door

That's completely wrong. If there's one thing I've learnt over my career, it's that technical excellence is just a small part of a product's success. Distribution is probably the most underrated ingredient, followed by a revenue model, marketing, financing and just plain good timing.

I started off in the UK, working in companies that were packed with insanely smart and resourceful engineers. There's a wonderful tradition over there of celebrating scientists and inventors, everything from the Faraday Christmas Lectures to Dambusters. That creates a big pool of people who can build widgets.

What was missing was the ability to turn a widget into a product. Selling things is a lot less prestigious than inventing them, with all sorts of class overtones of gentlemen scientists and grubby tradesmen mixed in. As a result, most of my companies produced wonderful code, but meager revenues.

Here in the US, I've been able to learn from people versed in the dark arts of actually building a company, not just a piece of software. To be honest it's a lot harder, computers are far more predictable than a gang of primates, but it's also amazing when you step back and see it starting to work. Taking an idea and turning it into something that sustains itself, a living breathing business, that's rewarding as hell.

I may not be there yet, but I'm having a blast as I shoot for it.

What analyzing digital communications misses

Closenessdiagram

Greg Berry just posted a very interesting comment, touching on a question I've wrestled with.

"…lots of business and life happens off the internet (hard to believe, I know), but even within the digital confines, there are so many different planes of communications to track."

Probably the best example of this is your significant other or business partner. If you're often in the same room as them, you probably won't send them as many emails as a direct report who's in another office. If you rely on communication frequency for measuring closeness, you'll underrate those relationships. So how do you work around this problem?

Design your algorithms around the blindspot. Google's search results are nowhere near as good as a dedicated human researcher could produce, but that doesn't matter. They narrow it down to a couple of dozen sites you can manually check. A few bogus results or dubious rankings don't matter because they can easily be spotted and ignored. The equivalent for tools based on automated relationship analysis is giving users the option to edit the strength of relationships to correct the occasional mistake, and always giving people a chance to eyeball any decision before any action is taken by the system.

Pick the right problem domain. I'm fascinated by applications in the business world because the relationships I need the most help with are right in the sweet spot for email. I've sketched the graph above to show roughly the communication frequencies I've experienced. For different industries and generations the lines will shift and scale, but between Bob in accounting and your boss there are probably a lot of people you exchange a lot of mails with. Stick to problems related to those folks, and email frequency will be a good approximation to closeness.

Be realistic about the results. I think the Boulder Twits communication map is the best guide to the relationships in the local tech scene, but that's mostly because it's the only one. As Gregg says, different styles of communications heavily affect the results, even if you forget about the channels it's missing. Heavy Twitter users are far more likely to end up in the center of the graph than less prolific twits. Chris Wand is entirely missing because he's not on Twitter, even though he's heavily involved in the community. As we pull in more and more channels we'll be able to produce far better analysis, and do a lot of useful things, but we'll never capture all the fractal richness of relationships within our primate packs.

Javascript, the ginger-haired stepchild of the language family

Redhead

Photo by Gold Sardine

Liz asked me yesterday what language Mailana is written in. It took me a while to think about it, but the list is C (low-level Exchange interfacing), C++ (speed-critical string processing), C# (Outlook plugin), PHP (most of the server architecture), SQL (database querying), Actionscript (Flash components) and Javascript (rich Ajaxesque browser functionality). It got me wondering why the last of these gets so little credit. Out of all of them it's probably my favorite to use.

I found that Douglas Crockford's explanations of why it's the world's most popular and most misunderstood language rang very true, but what really caught my eye were some demos written as pure scripts:

http://www.monstropolis.org/intro8.html
http://www.monstropolis.org/intro1.html
http://www.monstropolis.org/intro3.html
http://www.monstropolis.org/intro7.html
http://www.uselesspickles.com/triangles/demo.html (There's something deeply twisted about rendering 3D triangles using CSS style tricks, but I just can't look away)

I don't know when Javascript will be welcome in polite society, but dismiss it at your peril. It's now everywhere and there's a whole generation of self-taught programmers headed your way who know nothing else.

How I built the Boulder Twits graphs

Clockmechanism

Photo by Pierre J.

I knew I wanted to build a map of how people were connected in the Boulder tech scene. The first step was accessing the raw data, in this case all the Twitter messages from the first 60 local people I'd identified. I already had a system set up to rapidly analyze large numbers of email messages for my Mailana startup. It's modular, with different import components that access mail APIs like Exchange's MAPI/RPC, Gmail's IMAP and Outlook's Object Model, all outputting a stream of messages in standard XML form. Using Twitter's API it was pretty easy to build an importer. The only wrinkle was that I had to search for @someone in the message body, and add that to the recipients field in the XML.
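The @someone wrinkle can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Mailana importer: the function names and the dictionary field names are hypothetical, and the pattern simply assumes usernames are made of letters, digits and underscores.

```python
import re

# Assumed username alphabet: letters, digits and underscores.
# The lookbehind skips the "@" inside email addresses like bob@example.com.
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]+)")

def extract_recipients(body):
    """Return the unique @mentioned usernames, lowercased, in first-seen order."""
    seen = []
    for name in MENTION_RE.findall(body):
        name = name.lower()
        if name not in seen:
            seen.append(name)
    return seen

def to_message(sender, body):
    """Build a message record (hypothetical field names) for the import stream."""
    return {
        "sender": sender.lower(),
        "recipients": extract_recipients(body),
        "body": body,
    }
```

Lowercasing the names means "@Bob" and "@bob" count as the same recipient when the messages are indexed later.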

That whirred away for a while pulling the complete message histories into my database, with indices keyed on the recipients, as well as lots of other values. Sitting on top of that database I've got a Facebook App-style REST API that lets me run queries like "Tell me who sent messages to who within this group of people". Running that on the Twitter messages gave me a list that conceptually looked like this:

Alice to Bob: 10 messages sent, 3 messages received
Alice to Charles: 4 messages sent, 7 messages received
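A list like that can be produced by a simple tally. The Python below is a rough sketch of the counting step only, assuming each message is a dictionary with hypothetical `sender` and `recipients` fields; the real system runs this as a database query behind the REST API.

```python
from collections import Counter

def count_directed(messages, group):
    """Count messages from each sender to each recipient, both inside `group`."""
    counts = Counter()
    for msg in messages:
        sender = msg["sender"]
        if sender not in group:
            continue
        for recipient in msg["recipients"]:
            # Ignore people outside the group and messages to yourself.
            if recipient in group and recipient != sender:
                counts[(sender, recipient)] += 1
    return counts

def sent_received(counts, a, b):
    """Return (messages a sent to b, messages a received from b)."""
    return counts[(a, b)], counts[(b, a)]
```

Because `Counter` returns zero for missing keys, `sent_received` also works for pairs who have never exchanged a message.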

What I actually wanted was a single number for any relationship, a measure of how strongly Alice and Bob are connected. My choice was the lower of the sent or received counts, so in the above case

Alice to Bob: Strength 3
Alice to Charles: Strength 4

I like this method for mail because it excludes bots like Facebook notification addresses that you never reply to, and penalises other sorts of unequal relationships, e.g. famous people you might have emailed who ignore you. Not that that ever happens to me of course.
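Here is the lower-of-the-two rule as a minimal Python sketch. The input shape, a dictionary of directed counts keyed by (sender, recipient), and the function name are assumptions for the example; dropping pairs with a strength of zero is also my choice here, made so that one-way senders such as notification bots vanish from the result.

```python
def relationship_strengths(counts):
    """Collapse directed counts {(sender, recipient): n} into one number per pair."""
    strengths = {}
    for (a, b) in counts:
        pair = tuple(sorted((a, b)))
        # Strength is the lower of the two directions, so one-sided
        # relationships score as weakly as their quieter half.
        s = min(counts.get((a, b), 0), counts.get((b, a), 0))
        if s > 0:
            strengths[pair] = s
    return strengths

example = relationship_strengths({
    ("alice", "bob"): 10, ("bob", "alice"): 3,
    ("alice", "charles"): 4, ("charles", "alice"): 7,
    ("notifications", "alice"): 50,  # a bot Alice never replies to
})
```

With the Alice, Bob and Charles numbers from above, `example` holds a strength of 3 for Alice and Bob and 4 for Alice and Charles, and the notification address does not appear at all.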

Now that I had a list of all the relationships in the community, I needed to display them. I wanted something that could be interacted with inside the browser, so I built a Flash component. I'd never written any Actionscript before, but Mark Shepherd's Springgraph example was a great starting point. After a few days of wrestling with the wonders of Flex I had something working.

I then wrote a PHP script that accessed the Mailana API to produce the link information, and then output it in an XML form my component could read in. I based it on the format Daniel Mclaren used for his handy Constellation Roamer plugin, since I'd used that before.
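To give a feel for that step, here is a small sketch in Python rather than PHP. The element and attribute names (`graph`, `node`, `edge`, `source`, `target`, `strength`) are invented for illustration and are not the Constellation Roamer format, and the input is assumed to be a dictionary of pair strengths.

```python
import xml.etree.ElementTree as ET

def links_to_xml(strengths):
    """Serialise {(a, b): strength} as a node-and-edge XML document.

    The schema here is hypothetical, chosen only to show the idea.
    """
    root = ET.Element("graph")
    # One <node> per person who appears in any relationship.
    names = sorted({name for pair in strengths for name in pair})
    for name in names:
        ET.SubElement(root, "node", id=name)
    # One <edge> per relationship, carrying its strength as an attribute.
    for (a, b), s in sorted(strengths.items()):
        ET.SubElement(root, "edge", source=a, target=b, strength=str(s))
    return ET.tostring(root, encoding="unicode")
```

Sorting the nodes and edges keeps the output stable between runs, which makes cached copies easy to compare.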

For the Boulder Twits site I didn't want to re-run the query every time to generate the XML. Though it only takes a fraction of a second to create, the system's still pre-alpha so I didn't want a production site depending on it. Instead I saved off several versions and pointed the component directly at the cached XML files. I also didn't want to require every viewer to rerun the force-directed layout, so I let each version arrange itself on my machine, saved the positions and paused the simulation by default. If you want to see the simulation running, try clicking the small play icon in the top left and drag a few people around to see the graph compensate.
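The run-once-and-save idea looks roughly like this in Python. It is a toy spring layout with made-up constants, not the Springgraph code the Flash component uses, and the JSON file format and function names are assumptions for the example.

```python
import json
import math
import random

def layout(nodes, edges, steps=200, seed=0):
    """Toy force-directed layout: all pairs repel, linked pairs attract."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-6
                push = 0.01 / (dist * dist)  # repulsion fades with distance
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += 0.05 * dx  # spring pulls linked nodes together
            force[a][1] += 0.05 * dy
            force[b][0] -= 0.05 * dx
            force[b][1] -= 0.05 * dy
        for n in nodes:
            # Cap each move so one huge force cannot fling a node away.
            pos[n][0] += max(-0.1, min(0.1, force[n][0]))
            pos[n][1] += max(-0.1, min(0.1, force[n][1]))
    return pos

def save_positions(pos, path):
    """Write the settled positions so viewers can skip the simulation."""
    with open(path, "w") as f:
        json.dump(pos, f)

def load_positions(path):
    """Read positions written by save_positions."""
    with open(path) as f:
        return json.load(f)
```

A viewer that loads the saved file can draw the graph immediately and only restart the simulation when someone presses play.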

I had a lot of fun putting this together. To be honest I was looking for a nice cozy code-womb to crawl into for a couple of weeks after draining my extrovert batteries through Defrag and lots of followup travel and meetings. This was just the ticket, now I'm recharged and looking forward to meeting all the people I've discovered through compiling the list!