For string handling, watch out for the encoding! Most code examples use 8-bit ASCII strings, but Firefox supports Unicode strings, which allow many more languages to be represented. If we want a wide audience for our extension, we’ll need to support them too.
C++ inherits C’s built-in strings, as either char (for ASCII) or wchar_t (for Unicode) pointers. These are pretty old-fashioned and clunky to use: common operations like appending two strings involve explicit function calls, and you have to manually manage the memory allocated for them.
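To show what that clunkiness looks like in practice, here is a minimal sketch of appending two wide strings the C way. The helper name concatWide is made up for illustration, and the caller is assumed to release the result with free().

```cpp
#include <cstdlib>
#include <cwchar>

// Hypothetical helper: joins two wide C strings into a newly allocated buffer.
// The caller owns the result and must release it with std::free().
wchar_t* concatWide(const wchar_t* first, const wchar_t* second) {
    // Room for both strings plus the terminating null character.
    const std::size_t length = std::wcslen(first) + std::wcslen(second) + 1;
    wchar_t* result = static_cast<wchar_t*>(std::malloc(length * sizeof(wchar_t)));
    if (result == nullptr) {
        return nullptr; // allocation failed
    }
    std::wcscpy(result, first);  // copy the first string into the buffer
    std::wcscat(result, second); // append the second string after it
    return result;
}
```

Every call like this needs a matching free(), and it’s easy to get the buffer size wrong by one.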
We should use the STL’s string class, std::wstring, instead. This is the wide-character version of std::string, and supports all the same operations, including appending with "+". The equivalent of indexOf() is find(), which returns std::wstring::npos rather than -1 if the substring is not found. lastIndexOf() is similarly matched by rfind(); don’t confuse it with find_last_of(), which looks for the last occurrence of any single character from the set you pass in. The substring() method is closely matched by substr(), but beware: the second argument is the length of the substring you want, not the end index as in JS!
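Here is a rough sketch of how those JS string calls map onto std::wstring. The function names (extractQueryTerm, textAfterLast, jsSubstring) and the "q=" parameter are made up for illustration and are not taken from the PeteSearch source. The search for "q=" is deliberately naive, since it would also match a parameter such as "aq=".

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: pulls the value of a "q=" parameter out of a URL.
// Uses find() where the JS version would use indexOf().
std::wstring extractQueryTerm(const std::wstring& url) {
    const std::wstring marker = L"q=";
    const std::wstring::size_type markerPos = url.find(marker);
    if (markerPos == std::wstring::npos) {
        return std::wstring(); // not found: npos, rather than -1 as in JS
    }
    const std::wstring::size_type start = markerPos + marker.length();
    const std::wstring::size_type end = url.find(L'&', start);
    if (end == std::wstring::npos) {
        return url.substr(start); // no further parameters: take the rest
    }
    return url.substr(start, end - start); // second argument is a LENGTH
}

// Hypothetical helper: text after the last occurrence of marker.
// Uses rfind() where the JS version would use lastIndexOf().
std::wstring textAfterLast(const std::wstring& text, const std::wstring& marker) {
    const std::wstring::size_type position = text.rfind(marker);
    if (position == std::wstring::npos) {
        return std::wstring();
    }
    return text.substr(position + marker.length());
}

// Hypothetical helper: JS-style substring(start, end) built on substr().
// Indices are clamped to the string; unlike JS, swapped arguments are not reordered.
std::wstring jsSubstring(const std::wstring& text,
                         std::wstring::size_type start,
                         std::wstring::size_type end) {
    end = std::min(end, text.length());
    start = std::min(start, end);
    return text.substr(start, end - start); // convert end index to a length
}
```

With these, jsSubstring(L"PeteSearch", 4, 10) gives L"Search", the same as "PeteSearch".substring(4, 10) in JS, whereas calling substr(4, 10) directly would ask for ten characters starting at index 4.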
DOM manipulation is possible through the MSHTML collection of interfaces. IHTMLDocument3 is a good start; it supports a lot of familiar functions such as getElementsByTagName and getElementById. Working with the DOM does involve a lot of COM QueryInterface calls, so I’d recommend using ATL smart pointers (CComPtr and CComQIPtr) to handle some of the housekeeping with reference counts and casting.
PeteSearch is now detecting search page loads and extracting the search terms and links from the document. Next we’ll look at XMLHttpRequest-style loading from within a BHO.