
I’ve checked in some changes to the SearchMash JavaScript, intended to remove all script content from displayed pages. Previously I was only removing <script> tags; now I also look for javascript: URLs, eval calls and event handlers (onload, etc). I’ve also added some ‘canonicalization’ (there should be a better word for that) steps to try to defeat some of the common workarounds for regex blocking, like inserting newlines or spaces.
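Roughly, the idea looks like the sketch below. This is not the actual SearchMash code: the function names canonicalize and stripScripts are made up for illustration, and the patterns are my guess at a minimal version of each step. It is deliberately crude (dropping newlines also joins words that were split across lines), and regex filtering like this is only a partial defence.

```javascript
// Hypothetical helpers, assumed for illustration; not the real SearchMash source.

// Undo the whitespace tricks people use to dodge regex filters,
// e.g. "java\nscript:" or "< script>".
function canonicalize(html) {
  return html
    .replace(/[\t\r\n]+/g, "")   // remove tabs and newlines that split keywords
    .replace(/<\s+/g, "<")       // "< script"  -> "<script"
    .replace(/<\/\s+/g, "</");   // "</ script" -> "</script"
}

// Remove script elements, javascript: URLs, inline event handlers and eval calls.
// The passes repeat until nothing changes, so input such as
// "javajavascript:script:" cannot reassemble itself after one removal.
function stripScripts(html) {
  let out = canonicalize(html);
  let prev;
  do {
    prev = out;
    out = out
      .replace(/<script\b[\s\S]*?<\/script\s*>/gi, "")  // whole script elements
      .replace(/<script\b[^>]*>/gi, "")                  // unclosed script tags
      .replace(/javascript\s*:/gi, "")                   // javascript: URLs
      .replace(/\son\w+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "") // onload= etc
      .replace(/\beval\s*\(/gi, "(");                    // eval( -> (
  } while (out !== prev);
  return out;
}
```

Calling stripScripts on a fetched page before displaying it is the intended use. Every replacement makes the string shorter, so the loop always terminates.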
I also changed the way I work out whether a link in a results page is a search link (and so should open in the left frame). This change should restrict the pages that get opened as search results to those on the google domain. Previously it would have been possible for someone to set up an http://www.notgoogle.com domain and I’d have opened it there.
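A check along these lines might look like the following sketch. The name isSearchLink and the exact pattern are assumptions for illustration, and I am assuming "the google domain" means google.com and its subdomains. The important parts are anchoring the pattern at the start of the URL and requiring the host to end right after google.com.

```javascript
// Hypothetical pattern, assumed for illustration: http(s) URLs whose host is
// google.com or a subdomain of it, optionally followed by a port.
const GOOGLE_HOST = /^https?:\/\/([a-z0-9-]+\.)*google\.com(:\d+)?(\/|\?|#|$)/i;

// True only when the link's host really is on google.com, so
// lookalikes such as www.notgoogle.com are rejected.
function isSearchLink(url) {
  return GOOGLE_HOST.test(url);
}
```

With this pattern, http://www.google.com/search?q=x passes, while http://www.notgoogle.com/ and http://www.google.com.evil.example/ do not, because each label before google.com must end in a dot and nothing but a port, path, query or fragment may follow it.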
For both of these changes, I switched over to using regular expressions, since that’s a lot easier to understand than my previous logic.
I feel a bit better about the security of running external HTML in my domain now, so I might try to get back to some feature work after this.
Now, I think I need to spend my remaining weekend with a margarita and a hot tub, and prepare for my real job tomorrow.