The thing I suck at most…

Sos

Photo by viernullvier

…is asking for help. Or to be more specific, exposing my ignorance and uncertainty when I ask for help. I've no problem asking for programming advice because I'm confident in my engineering knowledge. The business world is something else. I started my own company because I wanted to stretch myself in a completely new area, and I'm learning as I go. The trouble is, I have to project confidence to persuade people I'm on a mission worth supporting. How can I do that and ask for help when I need it?

So far I'm solving that by opening up to a few confidants and staying on-message with the rest of the world. My dilemma is that there are a lot of people who could offer help but who also need to see me showing confidence in my approach.

So what's the solution? I'm going to be more open about the questions I'm struggling with, and focus on learning from other people's experience rather than just trying everything for myself. Luckily there's a golden opportunity for me to do just that on the horizon. Stay tuned for more details once I can share them…

Why statistics are both powerful and dangerous

Tapemeasure

Photo by Lite

Having dinner with a friend last night, one of the topics that came up was our shared obsession with statistics. It reminded me how looking at the wrong numbers has tripped me up. As an old teacher told me, "You start off measuring what you value, and end up valuing what you measure."

As a concrete example, I have no idea how much I weigh. I stopped looking at scales years ago because watching that number fluctuate day to day made me stressed out and demotivated. What I do watch is whether I can fit into my pants! That skips the stress of worrying about a few pounds, but warns me if I'm drifting from my normal range.

Another topical example is the stock market. I don't believe in stock-picking, so I'm invested purely in a low-fee S&P 500 index fund. I desperately try to avoid seeing the day-to-day level, because my primate brain will kick into flight mode when it drops and I'll be tempted to sell. As Behavior Gap explains very well, that's why individual investors sell low and buy high, getting dramatically worse returns than institutions. Again, that rapidly fluctuating number isn't really what I care about – it's how much I'll get years from now when I sell.

Once you start tracking a number, it becomes a priority. I'm fanatical about measuring user engagement with Mailana, because improving people's experience with the service is the only way I can make progress. I have a daily list of statistics for visits, bugs reported and Twitter mentions, and seeing that every morning really helps me focus on what's important. These are good metrics because I actually care about exactly what they're measuring.

Another quote I remember is "Give me your daily routine and I'll tell you what your priorities are": what you spend most time on is what you value most. It's the same for measurements; decide what's important to you and then pick the statistics. Don't just pick what's convenient or you'll find yourself making decisions that improve the numbers but destroy your business.

How to include real-time Twitter comments on your site

I'm a heavy user of Shannon Whitley's SPIURL public service that offers easy links to Twitter portraits, so I was very interested to hear about a new project, Real-Time Chatterbox. It's a website widget that lists the latest Twitter comments that mention a keyword you specify. This is a great way of showing social proof to potential users if you search on your company name. To help weed out offensive comments, Shannon lets you apply a blacklist of foul language, and so far I've been very happy with the results.

I've added this to the front page of twitter.mailana.com, and if you're interested in a free and easy way to engage your users, I highly recommend you check it out too.

Mailana demo at Denver NewTech Meetup

I’m really looking forward to the NewTech meetup on Monday. I’ve got a 5 minute presentation slot, and as part of my rehearsal routine I like to create a video version. Here’s my latest draft. As you can see, I’m a bit over on time and need to tighten up the second half, and for some reason QuickTime dropped a few of my slides, but it should give you a rough idea of what I’m up to.

I’m planning a Monday morning road ride in Boulder before the event. Definitely not Super Walker this time; the presentation I gave the day after that had the lowest energy ever! Let me know if you’re interested and we’ll try and sort out a time. I’m free any time between University Cycle’s opening, when I’ll pick up a bike, and 12:00.

How to find great public domain photos

8-cell-simple

One of my favorite parts of blogging is picking the images, and traditionally I've used Flickr's Creative Commons search to find some beautiful photos. For my Denver New Tech presentation I needed some historic photos, and that's one area where Flickr falls short. Most of the content comes from people's personal collections, and I really wanted an illustration of the Titanic sinking!

After doing some sniffing around, I discovered the Wikimedia Commons collection. It's an amazing resource, a collection of over 4 million public domain pictures, animations and sound files. Sure enough I found just the image I was looking for:

Titanic

It's dangerous though: there's so much beautiful content I could have lost days just browsing. If you want to see some wonderful work, start with their quality and featured collections, or just search on a topic. Even better, they're all either licensed under Creative Commons or in the public domain, so you can reuse them for your own projects. You really have no excuse for boring slides any more!

I am a spammer

Spamwithbacon

Photo by Cobalt123

I'm just back from my four days off the grid, and I was surprised to find quite a few DMs from friends, some saying thanks, others a little puzzled. Looking through my own DM history I was mortified to discover twitter.mailana.com's auto-DM robot had misfired and spammed about 20 of my friends with links to their profiles. I hate, hate, hate getting robot emails or DMs from people I know, and never intended to do the same thing myself. If you expect someone to spend a few minutes of their time reading (and hopefully acting on) your message, the least you can do is take the time to type something unique yourself!

I immediately updated the code so that it won't happen again, and a big apology to everyone who got those messages. Twitter works surprisingly well with few rules, and I don't want to be part of the Green-Card-Lotterization of the service.

If you're interested, here's what went wrong:

– If you search for your name on Mailana, and I haven't imported your messages yet, I bump you to the top of the queue. I also offer an option to follow me on Twitter, so that I can send a DM when that import's done.

– When I originally implemented this, import was a one-time thing, and I'd already imported all my real-life friends, and their friends, so I'd only be sending DMs to new followers.

– One of my top feature requests was continuous updating. A couple of weeks ago I added that, triggering an import whenever a profile was viewed that hadn't been updated in a few days. To avoid repeated DMs I track who I've sent messages to, and never send more than one. What I hadn't thought through was that the robot had never sent DMs to my original friends, so when somebody viewed their profile and triggered an import a message would get sent.

If I were starting over I'd create a separate account for twitter.mailana.com and only send DMs and announcements through that.
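To make the failure easier to see, here's a toy Python model of the logic described above. It isn't the real Mailana code: the class, the method names and the idea of tracking who explicitly asked for a notification are all my own assumptions for illustration, and the guarded version is just one plausible way to close the gap.

```python
class ImportNotifier:
    """Toy model of the import-complete DM logic; every name here is hypothetical."""

    def __init__(self):
        self.dm_sent = set()       # users who have already received a DM
        self.dm_requested = set()  # users who explicitly asked to be notified
        self.outbox = []           # stands in for the real Twitter DM call

    def request_notification(self, user):
        self.dm_requested.add(user)

    def on_import_finished_buggy(self, user):
        # Rule as described: never send more than one DM per user.
        # Friends imported before the robot existed are not in dm_sent,
        # so their first re-import triggers an unsolicited message.
        if user not in self.dm_sent:
            self._send(user)

    def on_import_finished_guarded(self, user):
        # Assumed fix: only message people who asked for it, and only once.
        if user in self.dm_requested and user not in self.dm_sent:
            self._send(user)

    def _send(self, user):
        self.dm_sent.add(user)
        self.outbox.append(user)
```

With the buggy rule, a profile view that re-imports an old friend puts them straight into the outbox. With the guard, nothing is sent unless that person asked for it.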

Dropping off the grid

Santacruzisland

Photos by Liz

For the next 4 days, Liz and I are going to be cut off from the modern world. Even though it's only 20 miles off the LA coast, Santa Cruz Island feels like another century. With no cell reception, roads, or permanent inhabitants, it's 100 square miles of wilderness and beauty.

We'll be staying with the park rangers, and spending our time digging out and cutting back overgrown trails. I know that's not everyone's idea of a fun vacation, but for me there's nothing more therapeutic than smashing through rocks with a pick-ax, and boy does that beer at the end of the day taste good!

An example of Tokyo Tyrant in PHP

Kenvsgodzilla

Photo by TCM Hitchhiker

To make some progress in my attempts to use Tokyo Tyrant from PHP, I finally bit the bullet and created a standalone script with no dependencies that stress-tests the interface. It also works as an expanded example of how to use Tokyo from within PHP. You can download tokyotest.php here.

It includes both the Net_TokyoTyrant raw socket interface and my work-alike clone Net_HttpTyrant that uses HTTP instead. The main tokyo_test() function stores a large number of values, retrieves them and checks they're correct, and then deletes them, timing performance. Here are my findings based on my own experiments.

TURN OFF ULOG! I wish the <blink> tag still worked, it's that important. I only just discovered this as the root cause of log files eating my hard drive. The stock ttservctl script has the innocuous-sounding ulog option turned on by default. This update log holds details of every transaction you make with the server. This is great if you're replicating or need to do restores, but these files grow rapidly and can easily fill up your hard drive! Disabling it also helped performance quite a bit in my tests. Obviously if you need this kind of backup you can't turn it off, but you'll also need some strategy to avoid running out of space.

Keep connections open. The initial problem I hit was caused by opening and closing a lot of sockets rapidly. After a few thousand, either PHP or the system throws an error. Try these URLs on the script to test for the problem on your system:
tokyotest.php?interface=http&numberofkeys=40000&valuelength=8
tokyotest.php?interface=raw&numberofkeys=40000&valuelength=8&closeeveryop=true

Don't use the HTTP interface. HTTP is great for quick hacks, but it is tough to avoid opening and closing a lot of connections in PHP. I notice one of the Perl example scripts uses keep-alive to hold a persistent connection, and doing the same in PHP might help a lot. HTTP is a lot more verbose than the raw sockets though, so if you're going to that much trouble it's probably simpler to use Net_TokyoTyrant instead.

Net_TokyoTyrant will truncate long values. An issue I haven't solved yet is that after a certain point, the current raw socket interface code will fail to store the end of long values. For example, if you store a 20,000 character string you'll only get back about 16,000 when you retrieve it. This isn't a Tokyo Cabinet problem, since doing the same operation through HTTP works as expected. Here's a way to reproduce the problem:
tokyotest.php?interface=raw&numberofkeys=10&valuelength=100000

The same operation using the HTTP interface gives the expected results:
tokyotest.php?interface=http&numberofkeys=10&valuelength=100000

[Update: I've now tracked down what's going wrong, and have a fix for the PHP wrapper: http://petewarden.typepad.com/searchbrowser/2009/06/how-to-get-tokyo-tyrant-working-in-php.html ]
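If you'd like to see the shape of the store, retrieve, verify and delete cycle without reading the PHP, here's a rough Python sketch of the same idea. It isn't a port of tokyotest.php: DictStore is an in-memory stand-in I made up so the example runs anywhere, and the function and parameter names are my own.

```python
import time


class DictStore:
    """In-memory stand-in for a key-value server (hypothetical, not a Tokyo Tyrant client)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def make_value(index, value_length):
    """Build a deterministic string of exactly value_length characters."""
    seed = "%d-" % index
    repeats = value_length // len(seed) + 1
    return (seed * repeats)[:value_length]


def stress_test(store, number_of_keys, value_length):
    """Store, read back, verify and delete values.

    Returns (bad_keys, seconds), where bad_keys lists every key whose
    retrieved value differed from what was stored.
    """
    keys = ["key_%d" % i for i in range(number_of_keys)]
    start = time.perf_counter()

    for i, key in enumerate(keys):
        store.put(key, make_value(i, value_length))

    bad_keys = []
    for i, key in enumerate(keys):
        # A truncated value fails this comparison because its length differs
        if store.get(key) != make_value(i, value_length):
            bad_keys.append(key)

    for key in keys:
        store.delete(key)

    return bad_keys, time.perf_counter() - start
```

Pointing this at a real server would mean swapping DictStore for a client object exposing the same three methods. A store that silently drops the end of long values would show up as a non-empty bad_keys list.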

What is Mailana?

Questionmark

Photo by Charles Chan

I was putting together an email this morning introducing what I'm up to with Mailana, and I realized it would make a good blog post too. I've talked about a lot of this stuff in different articles, but never gathered it all together in one place. I'm also preparing for my talk at the Boulder/Denver NewTech meetup on March 23rd, so it's good practice for that as well.

For motivation, my unofficial tagline is "You guys should talk". I'm driven by the last 5 years I spent at Apple, which was chock full of smart people but had no system for connecting them to solve problems. I want to be able to do things like find internal experts based on email analysis and locate people with contacts at external companies by building an opt-in company directory of skills. Since it can be a tough sell to persuade companies to hand over their email data to a startup, I've created http://twitter.mailana.com/ as a shop window for the technology. It's still early days but it lets you visualize the actual patterns of conversations in Twitter in some different ways.

Here's the technical background on Mailana: it's a system for grabbing emails from all sorts of sources and doing server-side analysis on the large data-sets you end up with. To grab the data I have IMAP, Exchange, Outlook PST and Twitter import components. To serve up the results I have a Facebook-style mini-app API with HTML-based presentation components running through the browser, as native Outlook tools, and in SharePoint.

What do I need? I'm looking for progressive organizations interested in solving the sort of problems I'm tackling. I want to expand beyond my initial proof-of-concept pilots pulling data from Exchange and start tailoring the technology to address people's pressing needs.

Adding authentication to the SPIURL permanent Twitter portrait project

Contrastportrait

Photo by S~Revenge

Whenever a user changes their picture on Twitter, the URL changes. This is a massive pain for applications like twitter.mailana.com that show users' images, since it requires a lot of code to handle checking and updating the links. In an ideal world Twitter would offer a permanent URL for every user's portrait. That's on their roadmap, but until they update their API, Shannon Whitley's SPIURL project offers the next best thing.

You can either download the Python code and host it on your own free AppSpot account, or use Shannon's public http://purl.org/net/spiurl/ link. Josh Fraser extended the code to support large portraits and added some other useful tweaks like a content-type for browser viewing.

I've been happily using my own copy of SPIURL for the last couple of weeks, but a few days ago I started noticing broken image links again. After a bit of investigating, I found I was hitting a limit of 100 requests per hour. This never used to happen, so I assume something changed on the Twitter side. To fix this I've added authentication to the API call (along with some more error reporting). Here's the main change:


import base64

        # Enter your own account details here. b64encode avoids the trailing
        # newline that encodestring appends, which would corrupt the header.
        authString = "Basic " + base64.b64encode("yourusername:yourtwitterpassword")
        response = urlfetch.fetch(
            "http://twitter.com/users/show/" + _screen_name + ".xml",
            payload=None, method=urlfetch.GET, headers={"AUTHORIZATION": authString},
            allow_truncated=False, follow_redirects=False)

You can download the full code here. You'll need to change the authorization details to your own account, and ensure the account is white-listed. I'm still waiting for my rate limit to be bumped, so I'm not totally certain it's working. I'll update this when I am.
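If you're adapting this outside App Engine on Python 3, where strings have to be encoded to bytes before base64 will accept them, building the Basic authentication header might look something like the sketch below. The helper name basic_auth_header is my own invention for illustration and isn't part of SPIURL.

```python
import base64


def basic_auth_header(username, password):
    """Return the value for an HTTP Basic Authorization header (hypothetical helper)."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    # b64encode does not append a newline, so the result is safe to use
    # directly as a header value
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

The returned string goes in the Authorization header of the request, in place of authString above.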