After getting the Enron emails into the mbox format, the next step was to convert them into something that the Outlook/Exchange world can understand. Thankfully I already had a great conversion program in mind, Aid4Mail. At its core its a translator between a large number of mail formats, including Outlook, Outlook Express, Windows Mail, Eudora, Thunderbird, Netscape Messenger, Pegasus Mail and a whole bunch of generic formats including several mbox variants. It can read and write to all of these formats, and has a large number of options to transform the mail as you do so. For example you can choose to only convert mails sent between certain dates, or to ignore attachments. If you're working with mail, I highly recommend giving this program a try, it's the swiss army knife of email tools.
To do the Enron conversion, I selected generic unix mbox as the input format. On the next screen I navigated to the root folder that contained all my files, and then chose Outlook pst as the destination type. I left all the other options at their defaults, so no filtering was done and the folder hierarchy was preserved. It took around 16 hours to process all 500,000 messages, and the pst file came out at around 5 GB.
I'm able to open it in Outlook and browse through the messages, and can also add them to my Exchange server. There are some issues, it doesn't preserve the original user structure, since they're all in one pst, attachments aren't included, and some of the addresses are obsfucated. It's good enough to give me the testbed I need to put some of my tools through some real-world stress tests.
Once the upload has finished, you should be able to access the pst yourself at
It's 5 GB, so it won't be all there for a good few hours, and be prepared for a long download time.