In my previous post, I presented some regular expressions you can use to spot dates, times, prices, email addresses and web links, along with a test page to see them in practice. REs can be pretty daunting when you’re first working with them, so I wanted to recommend a few resources that have helped me in the past.
The best overall guide on the web is regular-expressions.info, and I used some of Jan’s suggestions for email address matching. He has also written a very clever regular expressions assistant that breaks down any cryptic RE into a human-readable description. I also liked this Python tutorial on REs; it’s focused on a good practical example and shows how you’d build up the expression step by step.
I’ve included some documentation in the library as comments, but the main entry point is the searchAndProcess() function. This takes three arguments: a regular expression to search for, a callback function you supply to create a new node that becomes the parent of the element containing the matching text, and a cookie value that’s passed to the callback function so you can customize its behavior easily.
The callback function itself receives three arguments: the current document so it can create a new element, the results of the RE match, and the client-supplied cookie. The RE results are the most interesting part of this, since they’re in the same format that’s returned from the JS RegExp.exec() function. They’re an array where the first entry is the full text that’s matched by the expression, and each subsequent entry contains the text that was matched by one of the sub-expressions enclosed in parentheses. This means I can use the second, third and fourth array entries in the phone number callback to create a number that excludes any spaces or separator characters. The cookie is used to pass in the protocol to use for phone number links, usually ‘callto:’. Here’s an example of that in practice from the test page; view the entire page’s source to see more examples of how to use it.
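function makePhoneElement(currentDoc, matchResults, cookie) {
    // Entries 1 to 3 hold the parenthesized groups, so separators are left out
    var anchor = currentDoc.createElement("a");
    anchor.href = cookie + matchResults[1] + matchResults[2] + matchResults[3];
    return anchor;
}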
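To make the shape of the match results a bit more concrete, here’s a small standalone sketch of what RegExp.exec() hands back for a phone number. The pattern and the sample text are my own assumptions for illustration, not the expression the library actually uses, so treat it as a rough guide rather than the real thing.

```javascript
// Hypothetical phone pattern, for illustration only; the library's own RE may differ.
// Three parenthesized groups capture the area code, the prefix and the line number.
var phonePattern = /\(?(\d{3})\)?[ .-]?(\d{3})[ .-]?(\d{4})/;

var sampleText = "Call (555) 123-4567 today";
var matchResults = phonePattern.exec(sampleText);

// Entry 0 is the full matched text, separators included
var fullMatch = matchResults[0];

// Entries 1, 2 and 3 are the captured groups, so joining them drops the separators
var digitsOnly = matchResults[1] + matchResults[2] + matchResults[3];

// The cookie carries the protocol prefix, as in the phone number callback
var cookie = "callto:";
var href = cookie + digitsOnly;

console.log(fullMatch);
console.log(href);
```

Because the parentheses, spaces and dashes sit outside the capturing groups, they show up in the first entry but never in the later ones, which is what lets the callback build a clean link.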