The page it works on is made up of three frames: the left one for the search results, the large right one for previewing pages from those results, and the small far right one to hold the applet. The far right frame is needed because some browsers require, for security reasons, that an applet be visible before it can be run.
The script, searchgui.js, is included by the head of the main document, the one that defines the frames. There’s a fair amount of jumping through hoops to allow script communication between the different frames, since one of the universal browser security measures is preventing frames from getting access to others from different domains.
The script does some onLoad voodoo to get around the patent-related restrictions that MS implemented in Explorer, so that the user doesn’t have to click on the applet to activate it, but the real work doesn’t start until the applet has signalled to the script that it’s loaded. This is done by calling SB_NotifyAppletLoaded_Forward() in the applet’s frame, which then calls SB_NotifyAppletLoaded() in the main script.
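A minimal sketch of that forwarding step. Only the two function names come from searchgui.js; the function bodies, the `state` object, and passing the parent window in as a parameter (rather than reading `window.parent`) are assumptions made so the example runs outside a browser.

```javascript
// Hypothetical stand-in for the main document's script.
function makeMainScript() {
  const state = { appletLoaded: false };
  return {
    state,
    // Called once the applet reports that it is ready.
    SB_NotifyAppletLoaded() {
      state.appletLoaded = true;
    },
  };
}

// Hypothetical forwarder living in the applet's frame. `parentWindow`
// stands in for window.parent.
function makeAppletFrame(parentWindow) {
  return {
    SB_NotifyAppletLoaded_Forward() {
      parentWindow.SB_NotifyAppletLoaded();
    },
  };
}

const mainScript = makeMainScript();
const appletFrame = makeAppletFrame(mainScript);

// The applet would make this call when it has finished loading.
appletFrame.SB_NotifyAppletLoaded_Forward();
console.log(mainScript.state.appletLoaded); // true
```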
This triggers the fetch of the starting search page. One pattern I use a lot to invoke script actions inside functions called from the applet is setTimeout() with a short duration. This is mostly to get around problems I ran into when I was calling back into the applet from script functions invoked by the applet. In a couple of places, the timeouts are more of a fudge, used as a way of trying to make sure that parsing or loading is complete before continuing. This second usage is pretty fragile, and should be replaced with explicit status checking.
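The deferral pattern looks roughly like the sketch below. The names `deferAction` and `onAppletEvent`, the 10 ms delay, and the injectable `schedule` parameter are all hypothetical; the scheduler is a parameter only so the ordering can be shown without waiting on a real timer.

```javascript
// Defer `action` so it runs after the current applet-to-script call returns.
// `schedule` defaults to the real setTimeout.
function deferAction(action, delayMs = 10, schedule = setTimeout) {
  return schedule(action, delayMs);
}

// Hypothetical function that the applet calls into.
function onAppletEvent(log, schedule) {
  log.push("applet callback entered");
  // Anything that calls back into the applet goes in the deferred part.
  deferAction(() => log.push("deferred work ran"), 10, schedule);
  log.push("applet callback returned");
}

// Fake scheduler that queues callbacks instead of using a timer.
const pendingCalls = [];
const fakeSchedule = (fn) => pendingCalls.push(fn);

const eventLog = [];
onAppletEvent(eventLog, fakeSchedule);
pendingCalls.forEach((fn) => fn());
console.log(eventLog.join(" -> "));
```

The deferred work only runs once the function invoked by the applet has returned, which is the point of the pattern.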
The SB_StartSearch() function looks at the input URL, and if there are any arguments, constructs a Google search link that passes them on. If there are no arguments, the default Google start page is used. The URL argument checking is there so that external plugins, like the search bars in Firefox and Safari, can call SearchMash directly. I had to tighten up the argument passing though: previously I allowed the full URL to be specified there, but now I make sure it’s a Google search one.
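Here is a sketch of the tightened-up argument handling. The function name `buildStartUrl`, the two Google URLs, and the example page path are assumptions for illustration; the idea is that only the argument string is passed on, and the host is always fixed.

```javascript
// Assumed URLs for illustration only.
const GOOGLE_START_PAGE = "http://www.google.com/";
const GOOGLE_SEARCH_PREFIX = "http://www.google.com/search";

// Build the first URL to fetch from the URL the page was opened with.
function buildStartUrl(inputUrl) {
  const queryStart = inputUrl.indexOf("?");
  // No arguments (or a bare "?"): fall back to the start page.
  if (queryStart === -1 || queryStart === inputUrl.length - 1) {
    return GOOGLE_START_PAGE;
  }
  // Pass the arguments on, but never let the caller choose the host.
  return GOOGLE_SEARCH_PREFIX + inputUrl.slice(queryStart);
}

console.log(buildStartUrl("http://mashproxy.com/index.html?q=kittens"));
// http://www.google.com/search?q=kittens
console.log(buildStartUrl("http://mashproxy.com/index.html"));
// http://www.google.com/
```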
SB_GetSearch() sends a request for the search URL, and the script waits until it’s received.
The applet calls back SB_PageRequestDone_Forward() in its frame, which calls SB_PageRequestDone() in searchgui.js. The first thing the function does is some magic to convert the returned objects from Java strings to JS ones. This only seems necessary in Safari, where the objects otherwise only respond to Java string methods; those are close enough to the JS ones that debugging why some JS methods were failing got very confusing.
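The exact conversion isn’t spelled out here, but a common idiom for this kind of problem is to coerce the value explicitly, as in this sketch. The `toJsString` helper and the fake Java string object are hypothetical; the fake only mimics an object that is not a native JS string but can describe itself via toString().

```javascript
// Coerce whatever the applet handed back into a native JS string.
function toJsString(value) {
  if (value === null || value === undefined) {
    return "";
  }
  return String(value);
}

// Stand-in for a wrapped Java string: not a JS string, but it has toString().
const fakeJavaString = {
  toString() {
    return "http://www.google.com/search?q=kittens";
  },
};

const converted = toJsString(fakeJavaString);
console.log(typeof fakeJavaString); // object
console.log(typeof converted); // string
console.log(converted.indexOf("google")); // 11
```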
The URL is checked to see if it’s a Google search link, which should end up in the search frame, or an external page that was returned as part of the search results. The check is done in SB_IsSearchLink(): if the URL is from a Google domain, then it’s treated as a search link.
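A domain check along these lines might look like the following sketch. It is not the real SB_IsSearchLink(); the name `isGoogleSearchLink` and the decision to accept only google.com and its subdomains are assumptions. Comparing the parsed hostname, rather than searching the whole URL for "google", avoids being fooled by hosts like google.com.evil.example.

```javascript
// Return true only when the URL's host is google.com or a subdomain of it.
function isGoogleSearchLink(url) {
  let hostname;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch (err) {
    // Not a parseable absolute URL: treat it as not a search link.
    return false;
  }
  return hostname === "google.com" || hostname.endsWith(".google.com");
}

console.log(isGoogleSearchLink("http://www.google.com/search?q=kittens")); // true
console.log(isGoogleSearchLink("http://google.com.evil.example/search")); // false
console.log(isGoogleSearchLink("http://example.org/kittens.html")); // false
```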
If it is destined for the search frame, then a <base> tag pointing to the original page is added to the HTML. Since we’ll be dumping the contents of the page into our own frame with a mashproxy.com location, we need to let the browser know to resolve the relative paths back to the right location.
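As an illustration, the tag can be spliced in right after the opening head tag with a string replace. The helper name `addBaseTag` and the fallback of prepending the tag when no head is found are assumptions.

```javascript
// Insert a <base> tag so relative paths resolve against the original page.
function addBaseTag(html, originalUrl) {
  const safeUrl = originalUrl.replace(/"/g, "&quot;");
  const baseTag = '<base href="' + safeUrl + '">';
  // Match <head> or <head ...>, but not <header>.
  const headOpen = /<head(\s[^>]*)?>/i;
  if (headOpen.test(html)) {
    // A replacer function avoids "$" in the URL being treated specially.
    return html.replace(headOpen, (match) => match + baseTag);
  }
  // No head tag found: put the base tag first (assumed fallback).
  return baseTag + html;
}

const page = "<html><head><title>kittens</title></head><body></body></html>";
console.log(addBaseTag(page, "http://www.google.com/search?q=kittens"));
```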
The next step is a bit odd. As a security step, browsers don’t let you monkey with the DOM of a frame once you’ve loaded it, even if it’s the same domain, so I have to add a script into the HTML to do several things:
- Add some text after external links in the page, indicating if they exist
- Work out if a URL is present in the search page
- If the mouse is over an external link, invoke the preview frame
- If it’s an external link, fetch its contents to see if they exist
- If the user clicks on a search navigation link (next page, etc), open it through SearchMash
- If the user submits a new search through a form, route that through SearchMash too
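Adding the script to the page can be done with the same kind of string surgery, before the HTML is written into the frame. This is only a sketch: the `injectScript` helper, the choice of an external `src` rather than inline code, and the file name searchpage.js are all hypothetical.

```javascript
// Append a script tag just before </body>, or at the end if there is none.
function injectScript(html, scriptUrl) {
  const tag = '<script type="text/javascript" src="' + scriptUrl + '"></script>';
  const bodyClose = /<\/body\s*>/i;
  if (bodyClose.test(html)) {
    return html.replace(bodyClose, (match) => tag + match);
  }
  return html + tag;
}

const searchHtml = "<html><body><a href='http://example.org/'>result</a></body></html>";
// "searchpage.js" is a made-up name for the helper script.
console.log(injectScript(searchHtml, "searchpage.js"));
```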
After the script is added, document.write() is called to set up the frame, and the SB_StartDocumentParsing() function is called to start the customization of the search page and checking of external links. This is one of the more dubious uses of setTimeout(); using an onload handler would be better, but I didn’t want to interfere with the script in the search page.
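One way to make the timing less fragile without touching the page’s own onload handler is to poll an explicit readiness check instead of trusting a single fixed delay. The sketch below is hypothetical throughout: `waitUntilReady`, its options, and the fake readiness test are not part of searchgui.js.

```javascript
// Call onReady once isReady() returns true, re-checking at intervals
// and giving up after maxAttempts checks.
function waitUntilReady(isReady, onReady, options = {}) {
  const { intervalMs = 50, maxAttempts = 100, schedule = setTimeout } = options;
  let attempts = 0;
  function check() {
    attempts += 1;
    if (isReady()) {
      onReady();
      return;
    }
    if (attempts < maxAttempts) {
      schedule(check, intervalMs);
    }
  }
  check();
}

// Demo with a fake scheduler; the "document" becomes ready on the third check.
const timerQueue = [];
let readinessChecks = 0;
let parsingStarted = false;

waitUntilReady(
  () => ++readinessChecks >= 3,
  () => { parsingStarted = true; },
  { schedule: (fn) => timerQueue.push(fn) }
);
while (timerQueue.length > 0) {
  timerQueue.shift()();
}
console.log(readinessChecks, parsingStarted); // 3 true
```

In the browser, the readiness test could be something like checking that the frame’s document body exists before parsing starts.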
Once an external link is returned from the applet, it goes through the same SB_PageRequestDone() as search pages, but since it’s not from a Google domain, it gets handled in the second branch. The first thing it does is check that the external link is present in the current search page; if it isn’t, it doesn’t do any processing on it. This is both to improve performance, and to make things harder for an external script that finds a way to request pages.
If it is found, the status text next to the link in the search page is set, and the contents of the page are stored for later use in the preview frame.
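The external-page branch can be pictured as a small store keyed by URL, as in this sketch. The `makePageStore` helper, the "exists"/"missing" status strings, and treating empty contents as a missing page are assumptions made for illustration.

```javascript
// Track external pages, but only for links present in the current search page.
function makePageStore(linksInSearchPage) {
  const knownLinks = new Set(linksInSearchPage);
  const contents = new Map();
  const status = new Map();
  return {
    // Returns false when the page is ignored because the search page
    // never linked to it.
    handleExternalPage(url, html) {
      if (!knownLinks.has(url)) {
        return false;
      }
      status.set(url, html ? "exists" : "missing");
      if (html) {
        contents.set(url, html); // kept for the preview frame
      }
      return true;
    },
    getStatus: (url) => status.get(url),
    getContents: (url) => contents.get(url),
  };
}

const store = makePageStore(["http://example.org/kittens.html"]);
console.log(store.handleExternalPage("http://example.org/kittens.html", "<p>kittens</p>")); // true
console.log(store.handleExternalPage("http://evil.example/", "<p>boo</p>")); // false
console.log(store.getStatus("http://example.org/kittens.html")); // exists
```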
The last piece of functionality is invoked when the user moves her mouse over an external link. The event handler in the search page calls SB_SetPreviewFrame(), and this either sets the location of the frame to the external link, if it hasn’t been loaded by the status checking yet, or if the contents have been stored, writes them into the frame.
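The two cases can be sketched like this. It is not the real SB_SetPreviewFrame(): the function name, the return values, and the fake frame object (which stands in for a browser frame so the example runs anywhere) are hypothetical.

```javascript
// Show a page in the preview frame, from stored contents if we have them.
function setPreviewFrame(frame, url, storedPages) {
  if (storedPages.has(url)) {
    frame.document.open();
    frame.document.write(storedPages.get(url));
    frame.document.close();
    return "written";
  }
  // Not fetched yet: let the frame load the page itself.
  frame.location.href = url;
  return "navigated";
}

// Minimal fake of the parts of a frame that the sketch touches.
function makeFakeFrame() {
  const frame = { location: { href: "about:blank" }, written: "" };
  frame.document = {
    open() { frame.written = ""; },
    write(html) { frame.written += html; },
    close() {},
  };
  return frame;
}

const storedPages = new Map([["http://example.org/a.html", "<p>stored</p>"]]);
const previewFrame = makeFakeFrame();
console.log(setPreviewFrame(previewFrame, "http://example.org/a.html", storedPages)); // written
console.log(setPreviewFrame(previewFrame, "http://example.org/b.html", storedPages)); // navigated
```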
This step is the one most vulnerable to abuse, since I’m writing external HTML into a frame with my domain’s privileges. To protect against malice, the SB_RemoveScripts() function goes through the contents before they’re added, and removes any scripts, using a regex-based blacklist.
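To give a flavour of what a regex-based blacklist involves, here is a sketch. The patterns are illustrative guesses, not the ones SB_RemoveScripts() uses, and a blacklist like this is easy to get around, so it should be read as a description of the approach rather than a recommendation.

```javascript
// Strip the most obvious ways of running script from a chunk of HTML.
function removeScripts(html) {
  return html
    // Whole <script>...</script> blocks.
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, "")
    // Any leftover opening or closing script tags.
    .replace(/<\/?script\b[^>]*>/gi, "")
    // Inline event handlers such as onclick="...". Blunt: this can also
    // strip look-alike text outside of tags.
    .replace(/\son\w+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "")
    // javascript: URLs in quoted href/src attributes.
    .replace(/\b(href|src)\s*=\s*("\s*javascript:[^"]*"|'\s*javascript:[^']*')/gi, '$1="#"');
}

const dirty =
  '<p onclick="evil()">hi</p><script>alert(1)</script><a href="javascript:evil()">x</a>';
console.log(removeScripts(dirty));
// <p>hi</p><a href="#">x</a>
```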