Try out OpenCalais’s semantic analysis for yourself

[Photo: Calais ferry, by graphistolage.com]

I’ve been intrigued by the promise of automatically extracting information from raw text using semantic analysis, but I’ve never found a publicly available component, good enough to get excited about, that I could integrate into my own work. When OpenCalais was released I wanted to give it a spin, but there wasn’t a demo page available for running tests. I’ve taken some of the PHP demo code they’ve released, added a CAPTCHA to deter robots, and put it online at http://funhousepicture.com/calaisdemo/

To use it, copy and paste some text, answer the CAPTCHA test, and click on Show Results. You should see some of the places, people and technical terms highlighted. Mouse over a highlighted term to see what kind of object it is. You can download the source to my version of the demo here, though you’ll need to grab your own reCAPTCHA keys before it will run.
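If you’re curious how the highlighting step might work, here’s a rough sketch in Python (the demo itself is PHP, and this is not its code). It assumes the analysis has already been boiled down to a list of (offset, length, type) tuples; that format and the `highlight` function are made up for illustration and are not the OpenCalais response format.

```python
from html import escape


def highlight(text, entities):
    """Wrap each entity in a <span> whose title appears on mouse-over.

    entities is a hypothetical list of (offset, length, type) tuples,
    assumed not to overlap. It is not the OpenCalais response format.
    """
    parts = []
    cursor = 0
    for offset, length, kind in sorted(entities):
        # Plain text between the previous entity and this one.
        parts.append(escape(text[cursor:offset]))
        # The entity itself, with its type as the tooltip.
        parts.append(
            '<span class="entity" title="{}">{}</span>'.format(
                escape(kind), escape(text[offset:offset + length])
            )
        )
        cursor = offset + length
    # Whatever is left after the last entity.
    parts.append(escape(text[cursor:]))
    return "".join(parts)


if __name__ == "__main__":
    sample = "Ada met Bob in Paris."
    print(highlight(sample, [(0, 3, "Person"), (15, 5, "City")]))
```

The text is HTML-escaped piece by piece, so pasted input containing angle brackets can’t break the page.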

Give it a try for yourself and let me know what you think. I’m primarily interested in automatically tagging business emails, and from my tests it’s got some promise. It didn’t seem to mistakenly identify many items in my material, but there were a lot of nouns it’s not designed to handle. I’d love to see something that understood dates, addresses and locations, but it doesn’t do a great job with these yet.

I’ll be running some more bake-offs to figure out what off-the-shelf semantic technology can do these days, so stay tuned.
