Need a custom Internet Explorer or Outlook plugin?

Wisconsin photo by James Jordan


I recently came across Gigasoft Development, a small firm that specializes in writing IE and Outlook plugins. This is the first group I’ve come across that is solely focused on these, and whilst I’ve never used them myself, their work seems impressive.

If there’s any part of your software development you’d want to contract out, it’s writing extensions for Microsoft products. I know from my own explorations that it’s an incredibly deep field, with undocumented gotchas everywhere you turn. It’s a waste to devote months of your own engineering schedule to relearning all those lessons if it’s not part of your core technology. It’s pretty rare to have good web developers who can also handle hard-core Win32 hacking. Contracting out to a good team of people who already know where the booby-traps are means much quicker and cheaper development.

You can often follow a pattern where the plugin itself is just a thin shim that fetches and renders HTML from a URL you control. That gives you the flexibility and ease of web development for the UI aspects, and means you can update the application logic without touching all those installed plugins.

I also have a soft spot for Gigasoft after looking through their site and spotting that Tom’s a Packers fan from Wisconsin, and they’re based in Illinois. I always love visiting Chicago and Wisconsin when we fly back to see Liz’s family.

Why aren’t we using humans as robots?

Photo by Regolare

Yesterday I had lunch with Stan James of Lijit fame, and it was a blast. One of the topics that’s fascinated both of us is breaking down the walls that companies put up around your data. In the ’90s it was undocumented file formats, and this decade it’s EULAs on web services like Facebook. The intent is to keep your data locked in to a service, so that you’ll remain a customer, but what’s interesting is that they don’t have any legal way of enforcing exactly that. Instead they forbid processing the data with automated scripts and giving out your account information to third-party services. It’s pretty simple to detect when somebody’s using a robot to walk your site, and so this is easy to enforce.

The approach I took with Google Hot Keys was to rely on users themselves to visit sites and view pages. I was then able to analyze and extract semantic information on the client side, as a post-processing step using a browser extension. It would be pretty straightforward to do the same thing on Facebook, sucking down your friends’ information every time you visited their profiles. I Am Not A Lawyer, but this sort of approach is both impossible to detect from the server side and seems hard to EULA out of existence. You’re inherently running an automated script on the pages you receive just to display them, unless you only read the raw HTTP/HTML responses.

So why isn’t this approach more popular? One thing Stan and I both agreed on is that getting browser plugins distributed is really, really hard. Some days the majority of Google’s site ads seem to be for their very useful toolbar, but based on my experience only a tiny fraction of users have it installed. If Google’s marketing machine can’t persuade people to install client software, it’s obvious you need a very compelling proposition before you can get a lot of uptake.

An easy way to install your Firefox extension

Firefox’s biggest selling point is its security. Unfortunately for third-party developers, this means that users have to go through several awkward steps before they can install a Firefox extension from an internet site, to protect them against malicious code. The best way to avoid this is to get your extension on the main add-ons site, addons.mozilla.org, since that’s trusted by default and your users won’t have to navigate any tricky security dialogs. There are some issues with this though. Since it requires a vetting process, it can take weeks to months to get an extension added. It’s also possible that your extension doesn’t meet the criteria for inclusion if it’s specific to a particular product or niche market, rather than something that’s appropriate for the general public.

If you do need to install from your own site instead, you’ll need a way of guiding your users through the security process, and I’ll cover a technique I’ve found effective. Firefox extensions are packaged in .xpi files, which under the hood are just zip files with a special layout. To start installation, you just need to create a link to the .xpi file on your site, and Firefox will recognize the type when the user clicks on it. Because the site won’t have the right security privileges, the first thing the user sees will be this security warning at the top of their window, and installation will be blocked:

[Screenshot: the Firefox security warning bar]

To restart installation, the user has to click on ‘Edit options’, which brings up this dialog:

[Screenshot: the dialog opened by ‘Edit options’]

They then have to click on ‘Allow’, and click on the install link again once the dialog has closed. As you can imagine, it’s easy to lose users along the way with this multi-step process. I’ve found that providing a visual aid to guide them through it seems to help, using Javascript to draw an arrow pointing to the ‘Edit options’ button and providing brief instructions next to it:

[Screenshot: the arrow and instructions pointing at the ‘Edit options’ button]

I can’t claim credit for the idea; I first saw it with me.dium’s extension, but I ended up writing my own version for GoogleHotKeys before it was accepted onto the official Mozilla site. It works by intercepting the install link mouse-click, revealing the guide at the top of the page and then trying to install the extension through scripting, which brings up the security warning it points to. Here’s a link to an example page showing the code in action (you’ll need an image like this for it too), and I’ve included the code below. You’re free to reuse this for your own projects.

<html><head>

<meta http-equiv="content-type" content="text/html; charset=UTF-8"><title>PeteSearch</title></head><body bgcolor="#eeeeee">

<script type="application/x-javascript">
<!--
// Called by the main install link: show the guide, then try to install.
function installInitialTry(aEvent)
{
    showInstallEnable();

    return attemptInstall(aEvent);
}

// Asks Firefox to install the .xpi that the clicked link points to.
// On an untrusted site this is what brings up the security warning.
function attemptInstall(aEvent)
{
    var params = {
        "PeteSearch": {
            URL: aEvent.target.href,
            IconURL: aEvent.target.getAttribute("iconURL"),
            toString: function () { return this.URL; }
        }
    };
    InstallTrigger.install(params);

    // Returning false stops the browser from following the link itself.
    return false;
}

// Inserts the "click here" guide into the placeholder div, once.
function showInstallEnable()
{
    if ((document==null)||(document.getElementById==null))
        return;

    // The guide is already showing, so there is nothing to do.
    var content = document.getElementById("click_here_content");
    if (content!=null)
        return;

    var placeholder = document.getElementById("click_here_placeholder");
    if (placeholder==null)
        return;

    placeholder.innerHTML =
    "<table align=\"center\" bgcolor=\"#ffffff\" border=\"0\" width=\"100%\" id=\"click_here_content\"><tbody>"+
    "<tr>"+
    "<td align=\"right\"><p><font size=\"+2\"><b>Click here to enable installation<br>"+
    "and then click <a href=\"http://petesearch.com/petesearch.xpi\" iconurl=\"iconsmall.png\" onclick=\"return attemptInstall(event);\">here</a> to install"+
    "</b></font>"+
    "</p></td>"+
    "<td width=\"116\" align=\"right\">"+
    "<img width=\"116\" height=\"165\" src=\"clickhere.png\"></td>"+
    "</tr>"+
    "<tr>"+
    "<td align=\"right\"><p><font size=\"+2\"><b></b></font></p></td>"+
    "</tr>"+
    "</tbody></table>";

}
// -->
</script>

<div id="click_here_placeholder">

</div>

<div align="center">
<a href="http://petesearch.com/petesearch.xpi" iconurl="iconsmall.png" onclick="return installInitialTry(event);">
install
</a>
</div>

</body></html>

How to handle file dragging in a Firefox web app

One of the things I miss most when moving from a desktop app to the web is the ability to drag and drop documents between programs. The default file open dialog within a form is definitely not an adequate substitute. The best you can manage with a plain web app is dragging elements within the same page.

To add the full functionality to a web application, you need to install some client-side code. In Firefox, the easiest way to do this is with an extension, though a signed JAR file containing the script is also a possibility. I haven’t tried to do it in IE yet, so that will have to wait for another post.

Here’s an example extension, with full source code and a test page demonstrating how to use it. To try it out:

  • Install draganddrop.xpi in Firefox
  • Load testpage.html
  • Try dragging some files onto the different text areas on the page

You should see an alert pop up with the file path and the element’s text when you do this. The extension adds a new event type to Firefox, "PeteDragDropEvent". When a file is dragged onto a web page, the extension sets the ‘dragdropfilepath’ attribute on the element underneath the mouse, and then fires the event on that element. If the element has previously called addEventListener for that event, its handler function will be called, and the script can do what it needs to.
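On the page side, listening for that event might look something like the sketch below. The event and attribute names come from the description above; the helper name listenForFileDrops, its callback, and the drop_area id are made up for illustration, and the real test page may well be wired up differently.

```javascript
// Event and attribute names as described in the post.
const DROP_EVENT = "PeteDragDropEvent";
const PATH_ATTRIBUTE = "dragdropfilepath";

// Hypothetical helper: calls onFileDropped(path, element) whenever the
// extension fires the drop event on `element`.
function listenForFileDrops(element, onFileDropped) {
  element.addEventListener(DROP_EVENT, function (event) {
    // The extension sets the attribute on the element before firing the event.
    const target = event.target || element;
    const path = target.getAttribute(PATH_ATTRIBUTE);
    if (path) {
      onFileDropped(path, target);
    }
  }, false);
}

// Inside a page, attach it to a text area (the id is an assumption).
if (typeof document !== "undefined") {
  const area = document.getElementById("drop_area");
  if (area) {
    listenForFileDrops(area, function (path, el) {
      alert("Dropped " + path + " onto: " + el.value);
    });
  }
}
```

Reading the path from the attribute rather than from the event object matches how the extension is described as passing the data across.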

The main drawback is that you only get access to the local file path for the dragged object, and there’s not much an external web script can do with that. I’ll cover the options you have to do something interesting, like uploading a file to a server, in a future post.

This page was invaluable when I was developing the extension; it has a great discussion, with examples, of the mechanics of Firefox’s drag and drop events. One thing to watch out for if you reuse this extension for your own projects is that you don’t want to open up dragging-and-dropping for all pages. That would be a possible security problem if malicious sites lured users into dragging onto them. Instead you should do some URL white-listing to make sure only trusted locations are allowed, being careful to properly parse the address so that spoofing with @, etc, won’t fool the test.
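As a rough illustration of that white-listing check, here is a minimal sketch that parses the address before comparing hosts, instead of searching the raw string. It assumes a runtime with the standard URL class; an extension would more likely use Mozilla’s own URI parsing. The host list and the isTrustedUrl name are examples only.

```javascript
// Hosts allowed to receive drops. Example values, not a real configuration.
const TRUSTED_HOSTS = ["petesearch.com", "www.petesearch.com"];

// Returns true only when the parsed host is exactly one of the trusted hosts.
function isTrustedUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (e) {
    return false; // addresses that cannot be parsed are never trusted
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return false;
  }
  // hostname leaves out any "user:password@" prefix and the port, so
  // "http://petesearch.com@evil.example/" is seen as evil.example.
  return TRUSTED_HOSTS.includes(parsed.hostname.toLowerCase());
}
```

Matching the whole host name, not a substring or suffix, also rejects look-alikes such as petesearch.com.evil.example.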

Developing an IE plugin compared to a Firefox extension

Yvonnick Esnault mailed me to ask how long it takes to develop an extension for Internet Explorer compared to one for Firefox, and where he could learn more about the differences between the two.

I don’t know of many resources that will help, which is why I started my series on porting a Firefox plugin to Internet Explorer. That doesn’t include a summary of the differences though, just an examination of the details, so I’ll try to sum up what I know.

Compiled versus interpreted

IE plugins are Browser Helper Objects, which are effectively special DLLs. These require the use of a compiler, and creating a new plugin or modifying an existing one takes a lot longer than if you’re writing in Javascript for a Firefox extension. It does have one advantage though; you have the option of keeping the source code closed with IE, whereas it’s inherently open-source with Firefox’s Javascript.

Writing an installer

Firefox plugins have a dedicated installer system that you can interface with using some simple scripts, and you can get it working very quickly. You don’t get any help from Internet Explorer; instead you have to write a standard Windows installer executable, which can be pretty time-consuming.

Performance

One other benefit of being compiled rather than interpreted is that processing-heavy operations tend to run a lot faster. I notice this when I’m doing things like string searching within a document.

Firefox development is a well-trodden path

There are a lot of people creating Firefox extensions, but very few creating IE plugins. That has a lot of consequences:

  • There’s much more documentation for FF extensions, both from Mozilla and developers themselves, and it’s easier to get help.
  • There are fewer obscure bugs in Firefox than IE, because the interface is heavily tested in use.
  • Firefox has addons.mozilla.org to distribute extensions. The IE equivalent, Windows Marketplace, is not as well-known or promoted.
  • Since there are fewer IE extensions than Firefox ones, there’s less competition for users.
  • Unfortunately, many users who want plugins use Firefox rather than IE already because of the lack of IE extensions!

Overall, developing plugins for IE is a lot harder than developing for Firefox, and it took me a couple of months of weekends and evenings to convert mine over. If you do decide to do it, I’d recommend looking at and adding to the BHO documentation wiki.

Funhouse Photo User Count: 761 total, 68 active. Growth a little slower today, still within the rough range I’ve been seeing for the past couple of weeks.

GoogleHotKeys version 1.01 released

I’ve just uploaded the latest version of GoogleHotKeys for both IE and Firefox. The main site links to the addons.mozilla.org site for Firefox, and that may take a day or two to be updated. You can download it directly here until then. Changes include:

  • Pressing N takes you to the next page of search results
  • I’ve disabled the arrow keys from moving you between highlighted terms, since that was sometimes unhelpful
  • Fixed a few assorted bugs, such as the IE version forgetting which link was selected when you returned to a results page, and FF not correctly ignoring the Desktop search link in results pages.

It went very smoothly, apart from the final step of persuading WIX to create an upgrade installer for the IE addon. I assumed that this would just involve updating the version number, but it turned out to be a bit of a rabbit hole. I ended up cheating, and changing the installer GUID, which will result in some duplicate files on disk for upgraders, and a duplicate entry in add/remove programs, but seems to work.

Welcome hackszine readers!

Jason Striegel over at hackszine, the blog of Maker magazine, has been a big supporter of my hacking with Google, and has just published an update on my IE porting work. He mentions the wiki I’ve set up to shed light on the obscure world of IE plugins, and you can look forward to lots of other fun stuff on the Facebook API here as I learn more about it. Thanks for the mention Jason!

BHOs and threads

Vlad Simionescu asked me some questions about how BHOs behave with threads. This isn’t an area either of us has been able to find documentation on, so I’ll just have to give a description of what I’ve seen in practice. I’ll be posting a request to the MSDN forum to see if anybody in a position to give a definitive answer can correct anything I’ve got wrong.

Internet Explorer uses a single-threaded apartment model for threading, as I discovered in a post from Tony Schreiner. Exactly what this means, I have no idea, since I’m just used to plain Unix pthreads, but I’m sure some googling would resolve the differences between the various Windows thread models.

In practice, it appears that each window (pre-IE7) or each tab (IE7) has its own thread. Since there’s a BHO instance created for every browser instance, this means that there’s a one-to-one mapping between each instance of your BHO and a thread. It seems like every Invoke() call to an instance of a BHO is made on the same thread, and that thread is the one associated with the browser tab/window.

This is important, because as Tony’s post explains, you can’t use COM interfaces on different threads without jumping through some hoops. Because every call arrives on the same thread, you can store pointers to COM objects that will be used from descendants of your Invoke() method, without losing sleep about possible threading errors. It seems like you only have to worry about making your code thread-safe if you create your own threads. You do need to be able to cope with multiple BHO instances running simultaneously on different threads, but this should be trivial as long as you’re avoiding global variables.

This one-thread-per-BHO behavior is implicitly relied on in both Sven Groot’s examples and my work on hooking into the windows messaging procedure. We both use the current thread to work out which BHO we should pass events onto, since there’s no other way to map a window procedure call with a plugin.

More posts on porting Firefox add-ons to IE

Refresh and DISPID_DOCUMENTCOMPLETE

One of the great unsolved mysteries of BHOs is how to catch refresh events. When the user refreshes a page, DISPID_DOCUMENTCOMPLETE and DISPID_NAVIGATECOMPLETE are not sent, as they are with a normal load. Add-ons rely on that to know when they can start changing the DOM, so it’s a pretty serious problem.

There are some suggested solutions that use the DISPID_DOWNLOADCOMPLETE event, which is sent on a refresh, but only after all images have been downloaded. Since you get one of those for every document complete, you can do some jiggery-pokery to spot refreshes: a download complete that arrives without a corresponding navigate complete must have come from a refresh.
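To make the bookkeeping concrete, here is a deliberately simplified sketch of the event-counting idea. A real BHO would do this in C++ inside its Invoke() handler; JavaScript is used here only to show the logic. The class and method names are invented, and the sketch assumes exactly one download complete per page load, which pages with frames are likely to violate.

```javascript
// Tracks the events seen by one tab and flags likely refreshes.
// Event names mirror the DISPIDs in the text; everything else is hypothetical.
class RefreshDetector {
  constructor() {
    this.sawNavigateComplete = false;
  }

  // Call once per event. Returns true when a download complete arrives
  // with no navigate complete before it, which suggests a refresh.
  onEvent(name) {
    if (name === "DISPID_NAVIGATECOMPLETE") {
      this.sawNavigateComplete = true;
      return false;
    }
    if (name === "DISPID_DOWNLOADCOMPLETE") {
      const looksLikeRefresh = !this.sawNavigateComplete;
      this.sawNavigateComplete = false; // reset for the next load
      return looksLikeRefresh;
    }
    return false; // other events do not affect the count
  }
}
```

Even in this stripped-down form, the refresh is only noticed once the download complete arrives, which is the timing drawback of this approach.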

It seemed like there had to be a more reliable way than this, so I’ve cooked up a solution that detects refresh events directly. You can download example source code here.

It works by attaching a hook to IE’s main window procedure using
SetWindowsHookEx(WH_CALLWNDPROCRET, …);
This calls back to the function we specify, after every call to IE’s message loop. By looking at what happened during a refresh, I spotted that a WM_COMMAND message is always passed, with a LOWORD(wparam) of 0x1799 for a refresh caused by pressing F5, 0xa220 for one triggered from the main menu, and 0x179a for the context menu. These values are consistent across both IE 6 and 7.
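The check inside the hook boils down to a small lookup. The real hook is C++; the sketch below only illustrates the logic in JavaScript. The three command ids are the ones observed above, 0x0111 is the standard value of WM_COMMAND, and the function names are invented for this example.

```javascript
const WM_COMMAND = 0x0111; // standard Win32 message id

// Refresh command ids observed in IE 6 and 7, keyed to where they came from.
const REFRESH_COMMANDS = new Map([
  [0x1799, "F5 key"],
  [0xa220, "main menu"],
  [0x179a, "context menu"],
]);

// Equivalent of the Win32 LOWORD macro: keep the low 16 bits.
function loword(value) {
  return value & 0xffff;
}

// Returns a description of the refresh source, or null when the
// message is not one of the known refresh commands.
function refreshSource(message, wParam) {
  if (message !== WM_COMMAND) {
    return null;
  }
  return REFRESH_COMMANDS.get(loword(wParam)) || null;
}
```

Masking with loword matters because the high word of wParam carries other information for WM_COMMAND, so comparing the whole value could miss a refresh.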

I’ve set up my hook function to get called after the app’s message handler, so when I see a refresh command has just gone through, I override the default refresh behavior by explicitly calling IWebBrowser2::Navigate() to the current page’s URL. This causes IE to go through the normal loading events, so BHOs now receive the usual document complete call.

Here are the pros and cons of this new approach:

Pros:

Simplicity. Compared to the book-keeping needed for the event counting approach, it’s a lot easier to code.
Identical to normal loading. A refresh now triggers exactly the same page-loading events as loading a new page.
Right time. The event counting approach only detects a refresh when it sees a download complete event, which can be some time after the document is actually ready.

Cons:

Voodoo-esque. Relying on details of IE’s internal implementation is risky. The fact that it’s been consistent for several years makes it less scary.
Redundancy. If more BHOs use this approach, you could have the code called multiple times for a single refresh. In practice this doesn’t seem to cause any noticeable problems.
Misses script refreshes. If a script calls window.location.reload(), this won’t be detected.
Subverts IE’s refresh behavior. This is probably the most serious issue, since I know there are multiple meanings for refresh, depending on whether shift is held down, etc. In production code, I’ll probably limit the forced reload to pages where I need it (e.g. search results for PeteSearch).

Overall, this seems better than the alternatives for my purposes. I’d prefer to avoid the whole thing, but since the bug’s been a known issue for at least four years, I can’t rely on it being fixed soon.

I also looked into some techniques that relied on adding a handler to the document or window’s onload event. Unfortunately using attachEvent(), there was no call back to the handler on a refresh, even though script handlers within the page do get called back. It’s possible that setting the onload handler for the window explicitly would have worked, but this can be overwritten by scripts, and so isn’t reliable.

Wiki guide to writing BHOs

I’ve had a really good response to my series of articles covering the basics of writing an add-on for Internet Explorer. It led to some really interesting discussions with other people who are working in the same area and finding it hard to discover good documentation. It feels like we’ve all got different pieces of the puzzle, so to help gather that knowledge together, I’ve set up a wiki:

http://petesearch.com/wiki/

I’ve started it off with articles on the basics of creating a BHO, and some of the quirks and issues I’ve run into. It’s publicly editable, so I’m hoping that you folks will help add to, correct and improve it!