I just completed the installer, and so I’m now releasing the first public build of PeteSearch for IE:
It requires Windows 2000 or later, and Internet Explorer 7. Please give it a try, and let me know how you get on. I’ve updated the source repository, and I’ll be adding an article on how I built the installer soon.
Vlad also asked how a BHO can be added to the registry outside of the compiler. This is a question I’m wrestling with as I write the installer, so here’s a dump of my current understanding.
There are three purposes to registration: telling IE that there’s a BHO with a certain ID it should load, telling the system where the DLL that contains that ID lives, and specifying what code actually implements the interface we’ve declared.
The first part is covered by the .rgs file in the project. It handles setting up the registry so that our BHO is added to the list of ones to load. If you look at the file, it contains a description of what our DLL contains, including a UUID, and also adds that UUID to the list of BHOs for IE to load. The .rgs file is added to the DLL as a resource, and we reference it in our implementation class by using the macro DECLARE_REGISTRY_RESOURCEID(<resource ID>) in our class declaration. This adds an UpdateRegistry() member function to our class, which calls an ATL function that parses the .rgs script and adds the keys to the registry.
Telling the system where the DLL lives on disk is handled by calling code inside the DLL itself, in a process called ‘self-registration’. There’s a custom build step that calls regsvr32 /s /c <your dll>. This in turn calls the DllRegisterServer() function inside your DLL, which in our case makes a standard ATL _Module.RegisterServer() call. We’ve set up an ‘object map’ containing our UUID and the actual class that implements that interface, and the ATL call takes that and adds the right information to the registry. This call also adds the DLL location to the registry, and calls the UpdateRegistry() function we set up in our class, which adds the .rgs entries to the registry.
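To make the end result of all that more concrete, here’s a minimal sketch in portable C++ that just lists the core registry entries I’d expect registration to leave behind for an in-process BHO. It doesn’t touch the registry itself. The RegistryEntry struct and DescribeBhoRegistration() function are names I’ve made up for illustration, and the real .rgs script normally writes more than this (ProgID and type library entries, for example), so treat it as a partial picture rather than the full set.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One registry value that BHO registration is expected to write.
// (Hypothetical helper type, not part of ATL or the PeteSearch source.)
struct RegistryEntry {
    std::string root;   // "HKCR" or "HKLM"
    std::string key;    // key path under the root
    std::string name;   // value name; empty means the key's default value
    std::string value;  // string data; empty when only the key itself matters
};

// Lists a minimal set of entries for an in-process BHO.
// clsid is the braced UUID string, dllPath is where the DLL is installed.
std::vector<RegistryEntry> DescribeBhoRegistration(const std::string& clsid,
                                                   const std::string& dllPath) {
    assert(!clsid.empty() && clsid.front() == '{' && clsid.back() == '}');
    const std::string server = "CLSID\\" + clsid + "\\InprocServer32";
    return {
        // Tells the system which DLL on disk implements the class.
        {"HKCR", server, "", dllPath},
        // ATL-generated BHOs are typically registered as apartment-threaded.
        {"HKCR", server, "ThreadingModel", "Apartment"},
        // Adds the class to the list of BHOs that IE loads.
        {"HKLM",
         "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\"
         "Browser Helper Objects\\" + clsid,
         "", ""},
    };
}
```

Calling DescribeBhoRegistration() with your own UUID and install path gives you a checklist to compare against what regsvr32 actually wrote, which is handy when the same entries have to be expressed statically in an installer script.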
If you want to make this happen outside of Visual Studio, you can call regsvr32 yourself from the command line. Procedural installers like NSIS have script commands that do the same, like RegDLL.
As always, there’s a catch. WIX/MSI is a declarative installer, and MS strongly recommends against calling procedural DLL code as part of the process, since that’s a black box to the installation system, and so will be a lot harder to roll back. Instead, I’m going to have to capture a static description of all the registry changes that calling regsvr32 on my DLL causes, and add that to the installation script. Luckily, there’s a tool called tallow that looks like it may help. I shall let you know how I get on!
Vlad Simionescu was kind enough to share a version of the TinyBHO sample he adapted to compile on Visual Studio 6. There are a couple of changes he had to make to the threading model that I don’t understand well enough to comment on intelligently, but they seem harmless. It also pops up the URL of the document now, to demonstrate the use of BSTRs and some simple DOM access. Thanks for that, Vlad!
I’ve just signed up for Defrag, a conference focused on the implicit web. In their own words:
Defrag is the first conference focused solely on the internet-based tools that transform loads of information into layers of knowledge, and accelerate the “aha” moment.
People often talk about information overload, and trying to cut down the amount of data people have to deal with. That approach leads to solutions where a computer tries to do part of the user’s mental processing for them, which is a slippery slope towards talking paperclips.
I want to give people more information, but in a form they can digest. I want to present something that all our wonderful pattern-matching circuitry can sink its teeth into. We’ve had millions of years of adaptation to spotting pumas in the undergrowth, and we should take advantage of that.
It feels like a lot of the Defrag folks are thinking along similar lines, so I’m hoping to meet some interesting people who are working at the same coal-face, and get advice and inspiration. Plus I’ve never been to Denver, so maybe me and Liz can combine it with a vacation!
I’ve added the last remaining missing parts, and PeteSearch running on Internet Explorer now has exactly the same features as the Firefox version! The source code is in CVS, or you can download the source as a zip here. To complete the feature set and reach alpha, I implemented search term highlighting and the summary text popups.
I’m planning on a substantial beta testing period to nail down the bugs, but before I release it as a binary I need to write an installer. Initially I was looking at NSIS, since I’ve used that in the past, but it seems like .msi installers are the modern way to go, so I’m learning about WiX, and hope to get something sorted out in the next few days. The only sticking point at the moment is figuring out how to handle the DLL registration, since apparently the code-based self-registration that the MS BHO examples use is heavily deprecated. Instead it looks like I’ll have to try and capture what the executed code is doing using a tool like tallow, and put that into a WiX script. As always, I’ll let you know how that goes, and I’ll be adding some posts on other issues I’ve hit during the conversion.
To celebrate Independence Day, me and Liz are taking a few days off work, and heading into the desert. We’ll be camping out near Kanab, Utah, and we’re hoping to make it to the rock formation shown in the photo. It’s actually the reason we’re going: Liz heard about it through a magazine and was fascinated. It’s so delicate that the BLM only allows twenty hikers a day to visit. Since we didn’t book ahead, we’re hoping we’ll get lucky in the lottery for the ten permits they release on the day before.
We’re camping at Coral Pink Sand Dunes State Park, which looks like it has some amazing sights, before heading on to the main town. For the camp fires, I’ve picked up a flint and steel fire lighting kit. It’s not in the same league as starting fires by rubbing sticks together, but at least this will make me feel a bit more like Bear Grylls!
PeteSearch displays previews of search results in a split-screen mode, side-by-side with the results page. To do this in Internet Explorer, I needed a child window that would show and handle interaction with a web site. At first, it looked like it would be easy using AtlAxWin, but it turned out to rely on statically linking to the ATL lib, which isn’t present in the Express edition of Visual Studio. Having got this far into the port using the free version, I didn’t want to admit defeat, so I looked around for alternatives.
It turned out to involve a lot of COM boiler-plate, but with the help of Lucian Wischik’s example code, I was able to create a simple browser window class, and implement split-screen preview in PeteSearch.
You can download a zip of my latest source code here, or it’s available through CVS on Sourceforge. There’s a class in there called CPeteWebWindow that lets you create a child window that handles rendering and navigating an external web page. Using Lucian’s example, it was surprisingly painless to implement. The process of actually adding a window to Internet Explorer turned out to be very hard, but that’s a story for another post!
Getting C++ code called when there’s a document event is pretty complicated. You have to create a COM class that implements the IDispatch interface, package that into a VARIANT object, and then call the element or document interface to attach it.
I’ve put together a small sample project showing how to do it, a simple BHO that attaches a callback to the document onclick event. It contains a generic C++ helper class that implements IDispatch, and calls back to a user-defined function when the event occurs. You can reuse this class for handling any DOM events. Download it all here.
This is the same technique I’m using in the IE port of PeteSearch, and it’s working well. You do have to be careful of threading issues in your callback though, since you don’t know which thread will run it! My thanks go to Ian Hart from AppxWeb, who pointed me in the right direction with this MSDN forum post.
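On the threading point, one simple way to stay safe is to keep anything the callback touches behind a mutex. Here’s a minimal sketch of that idea in standard C++, with no COM involved. ClickLog is a hypothetical class I’ve made up for this example and isn’t part of the sample project, and the choice to cap the log at a fixed size is my own assumption to stop it growing forever.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

// Remembers the most recent clicked element names.
// Every access takes the mutex, so it can be called from any thread.
class ClickLog {
public:
    explicit ClickLog(std::size_t maxEntries) : maxEntries_(maxEntries) {
        assert(maxEntries_ > 0);
    }

    // Intended to be called from inside the event callback.
    void Record(const std::string& elementName) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (entries_.size() == maxEntries_) {
            entries_.pop_front();  // drop the oldest entry when full
        }
        entries_.push_back(elementName);
    }

    // Returns a copy, so the caller never reads the log while it changes.
    std::vector<std::string> Snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::vector<std::string>(entries_.begin(), entries_.end());
    }

private:
    const std::size_t maxEntries_;
    mutable std::mutex mutex_;
    std::deque<std::string> entries_;
};
```

The callback would only ever call Record(), and the rest of the extension would read through Snapshot(). Handing back a copy costs a little, but it means no code outside the class ever holds a reference into data another thread might be changing.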
When you’re writing a Firefox extension, you can reference images and other files you package with your installer using the chrome:// URL protocol. This is really useful if you want to inject images into a page, since you can put the images inside your extension’s folder, and then create an image tag with the src set to something like chrome://petesearch/skin/magnifier.png.
Internet Explorer doesn’t have anything like this unfortunately. No problem, I thought, I’ll just add the images to the directory where the DLL is installed, and reference them from there. After trying that, I realized that the images were never being loaded, and though I couldn’t find any documentation to back this up, decided it was probably blocked by a security policy. Remote pages accessing files from the local disk, even if they’re just images, could theoretically be used as part of an exploit, or at least to access some information about the user’s file system. IE doesn’t know that the local file reference has been inserted by our BHO, so it blocks it.
I compared notes on this with Georges-Etienne Legendre, since he was also hitting this problem. I’m developing on Vista. It appears that on XP you can still reference local image files on http pages, but not on ones that use the https protocol.
Here are the suggestions I’ve had on how to inject a local image into a remote page:
- Use the res: protocol to reference an image within the BHO’s dll. This was suggested on the MSDN extensions forum by Rob of IECustomizer.com. I haven’t tried this yet, but I’ve got a strong feeling that this protocol will be at least as restricted as file:, if not more, so I’m not holding out much hope.
- Write an Asynchronous Pluggable Protocol to implement something like data:. This was suggested by Georges-Etienne, and apparently IE7Pro does something similar to solve this problem. It seems like it would be quite a lot of work, and I’m not sure about the details of how you could use it to solve the problem.
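For anyone who wants to experiment with the first suggestion, here’s a minimal sketch of how the URL could be put together, going by the documented res://file/type/id form in which numeric types and IDs are prefixed with ‘#’. MakeResUrl() is a hypothetical helper of my own, and the sketch says nothing about whether IE will actually allow a remote page to load the result.

```cpp
#include <cassert>
#include <string>

// Builds a res: URL of the form res://<dllPath>/#<resourceType>/#<resourceId>.
// Both the resource type and the resource ID are passed as numbers here.
// (Hypothetical helper for illustration only.)
std::string MakeResUrl(const std::string& dllPath,
                       unsigned resourceType,
                       unsigned resourceId) {
    assert(!dllPath.empty());
    return "res://" + dllPath +
           "/#" + std::to_string(resourceType) +
           "/#" + std::to_string(resourceId);
}
```

For example, passing a type of 2 (RT_BITMAP in the Windows headers) and a made-up resource ID of 101 gives a URL ending in /#2/#101, which would then go into the src of the injected image tag.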
For now, I’ve decided to side-step the problem by hosting the images I need on my own server. This works fine, but it’s a bit wasteful of network resources, and I hoped to keep the extension from having any dependencies on a single server.
I’d love to hear any suggestions on other ways to tackle this, or more info on the security restrictions that cause the problem.