The core of Google’s success is the order in which it displays search results. Back in the pre-Google days you’d get a seemingly unordered list of all the pages that contained a term. Using PageRank to figure out which pages were most authoritative, and putting those at the top, made finding a useful result much quicker.
Searching emails needs something similar: a way of sorting the important emails from the trivial. PageRank works by analyzing links between pages, but emails don’t have links like that. Instead, you need to use other connections between emails, such as how often a message was replied to or forwarded. Just as a link to another web page can be seen as a vote for it, an action such as forwarding or replying is a hard-to-fake signal that the recipient considers the message worth spending time on.
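To make the voting idea concrete, here is a minimal sketch in Python. The message format (dicts with hypothetical `id`, `in_reply_to` and `forward_of` fields) and the equal weights are assumptions for illustration, not how any particular mail client stores or scores things.

```python
from collections import Counter

def action_scores(messages, reply_weight=1.0, forward_weight=1.0):
    """Give each message one 'vote' per reply or forward it received.

    The field names and the weights are illustrative assumptions.
    """
    scores = Counter()
    for msg in messages:
        replied_to = msg.get("in_reply_to")
        if replied_to is not None:
            scores[replied_to] += reply_weight
        forwarded = msg.get("forward_of")
        if forwarded is not None:
            scores[forwarded] += forward_weight
    return scores

messages = [
    {"id": "a"},
    {"id": "b", "in_reply_to": "a"},
    {"id": "c", "in_reply_to": "a"},
    {"id": "d", "forward_of": "a"},
    {"id": "e", "in_reply_to": "b"},
]

scores = action_scores(messages)
# Messages that drew the most replies and forwards come first.
ranked = sorted((m["id"] for m in messages), key=lambda i: scores[i], reverse=True)
```

Unlike PageRank this is a single pass with no propagation, so a reply from a busy person counts the same as a reply from anyone else.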
I’m already using this principle to set the strength of connections between people in Outlook Graph: the thickness and pull of a line is determined by the minimum of the emails sent and received between them. Using the minimum helps to weed out unbalanced relationships, such as automated mailers that send out a lot of bacn but never get sent any email in return.
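The minimum rule is easy to sketch. The snippet below is not Outlook Graph’s actual code; it assumes the mail has already been reduced to hypothetical (sender, recipient) pairs, and the names are made up for the example.

```python
from collections import Counter

def connection_strengths(emails):
    """Return {(person_a, person_b): strength} for each pair of people.

    Strength is the minimum of the messages sent in each direction,
    so a one-way relationship scores zero.
    """
    sent = Counter()
    for sender, recipient in emails:
        if sender != recipient:
            sent[(sender, recipient)] += 1

    strengths = {}
    for (a, b), count in sent.items():
        pair = tuple(sorted((a, b)))  # treat A->B and B->A as the same link
        strengths[pair] = min(count, sent.get((b, a), 0))
    return strengths

emails = (
    [("alice", "bob")] * 3
    + [("bob", "alice")] * 2
    + [("mailer", "alice")] * 5  # automated sender that never gets a reply
)

strengths = connection_strengths(emails)
```

Here alice and bob end up with a strength of 2, while the mailer’s five messages count for nothing because alice never wrote back.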
It’s not a new idea; Clearwell has been using something similar for a while:
"To sort messages by relevance, Clearwell’s program weighs the
background data and content of each email for several factors,
including the name of the sender, names of recipients, how many replies
the message generated, who replied, how quickly replies came, how many
times it was forwarded, attachments and, of course, keywords."
It’s obvious enough that I don’t doubt other people are doing something like this too, though I’ll be interested to discover what patent landmines were laid by the first people to file. Where it gets really interesting is when you also do social graph analysis, because then it’s possible to throw the social distance of the people involved into the mix. The effect is to give more prominence to messages from people you know, or from friends of friends, since they’re more likely than strangers to be talking about things relevant to you.
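As a rough sketch of how social distance could be folded in, the Python below finds the number of hops from you to each sender with a breadth-first search, then discounts a message’s reply and forward score by that distance. The graph format, the `stranger_distance` default and the discount formula are all assumptions chosen for illustration, not a description of Clearwell’s or anyone else’s ranking.

```python
from collections import deque

def social_distances(graph, me):
    """Breadth-first search: hops from `me` to everyone reachable."""
    distances = {me: 0}
    queue = deque([me])
    while queue:
        person = queue.popleft()
        for contact in graph.get(person, ()):
            if contact not in distances:
                distances[contact] = distances[person] + 1
                queue.append(contact)
    return distances

def relevance(action_score, sender, distances, stranger_distance=4):
    """Discount a message's action score by the sender's social distance.

    Senders missing from the graph are treated as `stranger_distance`
    hops away. Both that default and the formula are assumptions.
    """
    hops = distances.get(sender, stranger_distance)
    return (1.0 + action_score) / (1 + hops)

# Hypothetical contact graph: who exchanges email with whom.
graph = {
    "me": ["ann"],
    "ann": ["me", "bo"],
    "bo": ["ann"],
}

distances = social_distances(graph, "me")
```

With the same action score, a message from ann (one hop away) outranks one from bo (a friend of a friend), which in turn outranks one from a sender who isn’t in the graph at all.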
As management consultants we have been doing this type of social network analysis for many years. Here is an email analysis from the late 1990s to help a large company get a major project unstuck.
Thanks for the heads-up Valdis, I was already planning on profiling your work in one of my upcoming posts. I’m especially interested since you’re using this sort of network analysis to solve real-world problems.
Great. Apart from this, users can use tags to mark which mails are important, and the software can use those tags to rank by importance.
That is true. I’ve mostly been focused on techniques that require no user input, but getting that sort of additional information would be a big help. I’ll have to see if I can get some figures on the popularity of tagging for email. Anecdotally, I see a lot more people manually organizing emails into folders rather than using tags, but that’s from a small sample.