
I’ve checked in two big changes to the applet that prevent malicious scripts from causing harm, even if they make it through the script blocking: removing cookies from requests and blocking sites that don’t show up in a Google search. The ability to access cookies and intranet sites is the pot of gold at the end of the rainbow for malicious hackers. Now that the applet blocks that access, there hopefully shouldn’t be much point in trying to crack the other layers of security, such as the script blocking.
First, I’ve written my own HTTP handler based on TCP/IP sockets, rather than using Java’s high-level functions. This means there’s no chance of cookies being sent with page requests. The high-level functions would automatically add them, but my version doesn’t even have access to the cookie jar. So that removes the possibility of accessing personal or secure information through cookies.
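To make the idea concrete, here’s a minimal sketch of a hand-rolled GET request over a plain socket. The class and method names are just illustrative, not the applet’s actual code, but it shows the key point: only the headers you write yourself go over the wire, so there’s no Cookie header at all, unlike java.net.URLConnection, which pulls from the browser’s cookie jar automatically.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.Socket;

    public class RawHttpFetcher {

        // Fetch a page with a hand-built request; no cookie jar is ever
        // consulted, so nothing personal can leak into the headers.
        public static String fetch(String host, String path) throws Exception {
            try (Socket socket = new Socket(host, 80);
                 PrintWriter out = new PrintWriter(socket.getOutputStream());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {

                // Only the headers written here are sent -- no Cookie header.
                out.print("GET " + path + " HTTP/1.0\r\n");
                out.print("Host: " + host + "\r\n");
                out.print("Connection: close\r\n");
                out.print("\r\n");
                out.flush();

                StringBuilder response = new StringBuilder();
                String line;
                while ((line = in.readLine()) != null) {
                    response.append(line).append('\n');
                }
                return response.toString();
            }
        }

        // Usage example: fetch a Google search results page.
        public static void main(String[] args) throws Exception {
            System.out.println(fetch("www.google.com", "/search?q=test"));
        }
    }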
Second, I’ve limited the page requests to either those that start with http://www.google.com/search? or those that are contained in the last search results page returned from Google. Since local intranet servers should not show up in Google search results, this should ensure that only public servers are accessed.
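A rough sketch of that filtering logic might look like the class below. The names and the link-scraping regex are illustrative assumptions rather than the applet’s real implementation; the point is that a request is allowed only if it’s a Google search URL itself, or a link harvested from the last results page, so intranet hosts never make it into the allowed set.

    import java.util.HashSet;
    import java.util.Set;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RequestFilter {

        private static final String SEARCH_PREFIX = "http://www.google.com/search?";

        // Links harvested from the most recent Google results page.
        private final Set<String> allowedUrls = new HashSet<String>();

        // Pull the href targets out of a results page so later requests can
        // be checked against them. The regex is deliberately simple; a real
        // parser would be more robust.
        public void recordResultsPage(String html) {
            allowedUrls.clear();
            Matcher m = Pattern.compile("href=\"(http://[^\"]+)\"").matcher(html);
            while (m.find()) {
                allowedUrls.add(m.group(1));
            }
        }

        // A request passes only if it's a Google search, or it was linked
        // from the last set of results.
        public boolean isAllowed(String url) {
            return url.startsWith(SEARCH_PREFIX) || allowedUrls.contains(url);
        }
    }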
These changes close down the two main dangers attributed to cross-domain XMLHttpRequests: sending private information with the requests, and intranet access. Chris Holland has an article discussing an informal RFC for cross-domain requests, and these are the key points he brings up. There’s also a good mailing list discussion he references that seems to agree.
My main disagreement with Chris is over his assumption that the only way to ban intranet access is to rely on all accessible servers to implement a new protocol, explicitly allowing cross-domain access. I believe the technique I use, verifying the domain’s existence in a trusted directory of public servers such as Google’s search database, should be enough to exclude intranet sites.
I think there’s another reason for the general preference for an opt-in approach to cross-domain access: the belief that pulling pages from another site without explicit permission is a Bad Thing. This is something I’ve given a lot of thought to (see Why the hell are you sucking my bandwidth for your mashup?), and my conclusion was that it depends on the context. I’m sticking pretty closely to how the original publishers intended their pages to be shown in a browser, ads and all, just providing a different workflow for searching.
I think this ability to remix and mash up external web pages is something that can be abused, but it also has huge potential to enrich users’ lives with new creations. It’s disruptive, and a bit subversive, but I think the world will be a better place with more mashups.