Ethnography and software development


Ethnography, which literally means people-writing, is one way anthropologists study communities: they write very detailed, uninterpreted descriptions of what the people living in those communities do and say. These raw, unstructured accounts can read almost like diaries.

Most computer services are doomed to fail because they’re based on wishful thinking about the way their users should behave, not how they actually behave. This is where ethnography comes in. Like good user statistics, it forces you to stare the gloriously illogical humanity of people’s behavior square in the face. Only after you’ve got a feel for that can you create something that might fit into their lives.

Defrag was full of visionaries, academics and executives, people used to creating new realities and changing the way people work. One of the most useful open spaces I attended was a session on how to bring web 2.0 tools into big companies. What amazed me was their visceral loathing of the imperfect tools already being used. My suggestion that there might be some valid reasons to email a document as an attachment rather than collaborating on a wiki was met with a lot of resistance. One example is the fine-grained control and push model of distribution that email offers. Senders know exactly who they’re passing a document to, and though those recipients can send it on to others, there’s a clear chain of responsibility. With a wiki, it’s hard to know who has access, which is tricky in a political environment (i.e. any firm with more than two people).

Digging deeper, especially with Andrew McAfee, it felt like most of the participants had encountered these arguments as smokescreens deployed by stick-in-the-muds who disliked any change, which explained a lot of their hostility. It seemed to me that the reason these objections work so well as vetoes on change is that they contain some truth.

This smelt like an interesting opportunity: a chance to take useful technology currently packaged in a form only early adopters could love (where’s the wiki’s save-file menu and key command?) and turn it into something a lot more accessible to the masses, requiring few habit changes.

As a foundation for thinking about that, here are some pseudo-ethnographic observations on how I’ve collaborated on written documents. They’re written from memory, structured into a rough timeline, and conglomerate all my experiences into one general description. That makes them not quite as raw as real ethnography, but they’re still useful for organizing my thoughts.

  • Someone realizes we need a document. This can come down as a request from management, or it can be something that happens internally to the team. The first step is to anoint someone to lead the document’s creation. This often happens informally if it’s a technical document, and the role often ends up with the person who identified the need. If it’s a politically charged document, the leader is usually senior, and often someone with primarily management duties.
  • Then, they figure out if it’s something that can be handled by one person, or if it really needs several people’s inputs. There’s a difference between documents that are shared for genuine collaboration, and those which are passed around for politeness’s sake, without the expectation of changes being made. Assuming it’s the first kind, there will be a small number of people in the core group who need to work on it, very seldom more than four.
  • If it’s documenting something existing, somebody will usually prepare a first draft, and then comments are requested from that small, limited group.
  • If controversial technical decisions or discretion are involved, then the leader will often do informal, water-cooler chats to get a feel for what people are thinking, followed by a white-board meeting with the core group. An outline of the document is agreed, with notes taken on somebody’s laptop, and often emailed around afterwards.
  • If someone’s trying to sell the group on an idea, they may create a background document first. This is usually sent as an email, with either content in the message, or a wiki link.
  • In most cases, the leader’s document is emailed around to the core group for comments. No reply within a day or two is taken as assent, unless the leader has particular concerns, and follows up missing responses in person.
  • For very formal or technical documents, changes will be made in the document itself. More often the comments will be made in an email thread, and the leader will revise the document herself with the agreed changes, or argue against them by email or by talking directly to the person.
  • Documents being collaborated on are rarely on the wiki. Word with change tracking enabled is the usual format. Standing policy and status documents are two big exceptions. They’re almost always on the wiki, but may not appear there until they’re agreed on.
  • Final distribution is usually done by email. For upwards distribution to management, this will be as a document or email message. For important ones, this is often only a short time before or even after a personal presentation to the manager who’s the audience, to manage the interpretation.
  • For ‘sideways’ delivery to colleagues, a wiki link may be used, though the message is still sent by email, and might be backed up with an in-person meeting.

This is just a brief example, but looking through these raw notes, a few interesting things leap out at me. Who sees the document at each stage matters to the participants. It’s possible to argue that this is a bad thing, but it’s part of the culture. We aren’t using our wiki for most of our document collaboration; it’s still going through email and Word, which may be partly connected to this.

It’s a useful process: a way of looking at the world that helps me see past a lot of my own preconceived ideas about how things work, through to something closer to reality. Give it a shot with your own problems.

PNG loading made simple

I need to load some images from disk, and use them as OpenGL textures. This is something I’ve reimplemented many times on different systems, and it’s always seemed more complicated than it should be.

The first choice is whether to use the native OS’s image loading libraries, or a cross-platform framework. Native libraries are a lot easier to start using, but make the code harder to port to other systems. Using an open-source framework makes your code much more portable, but it’s often a pain to find or build against the right version of the code, and you also have to deploy it to the end-users’ systems. I lost several hours wrestling with libpng a few weeks ago, since it refused to build on OS X, and I ended up yak-shaving my way down a dependency chain of other libraries it needed.

This seems like overkill for most of the tasks I’m trying to accomplish, since I’m usually just loading a small, fixed set of images that I’ve created myself, so I can control their properties. That only exercises a tiny part of most image loading libraries, since they’re designed to cope with reading and writing dozens of different formats, and hundreds of combinations of bit-depth, compression and other flags within each of those, along with lots of platform-specific optimizations to improve performance.

Happily, I recently came across the LodePNG project, which offers a simple, dependency-free way of loading PNGs. The PicoPNG module is a standalone C++ file that will load 8-bit, 4-channel PNGs, and is only about 500 lines of code. This is perfect for my needs: there are no extra installation hassles or dependencies, and it’s cross-platform. I also appreciate the elegant way that Lode Vandevenne packed his solution into such a small amount of code. It’s compact but not obfuscated, very classy work.
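
To show how little glue code is needed, here’s a minimal sketch of loading a file with PicoPNG and turning it into an OpenGL texture. The decodePNG() signature is the one I recall from picopng.cpp, so check it against the version you download; the file-reading helper and the function name are my own.

#include <cstddef>
#include <vector>
#include <fstream>
#include <iterator>
#include <OpenGL/gl.h>   // use <GL/gl.h> on Windows and Linux

// Provided by picopng.cpp, which you drop into your project. Check this
// declaration against your copy of the file.
int decodePNG(std::vector<unsigned char>& out_image,
              unsigned long& image_width, unsigned long& image_height,
              const unsigned char* in_png, size_t in_size,
              bool convert_to_rgba32);

// Load a PNG from disk with PicoPNG and upload it as an RGBA OpenGL
// texture. Error handling is kept to a bare minimum.
GLuint LoadPNGTexture(const char* filename)
{
    // Slurp the raw file bytes into memory.
    std::ifstream file(filename, std::ios::binary);
    std::vector<unsigned char> fileData((std::istreambuf_iterator<char>(file)),
                                        std::istreambuf_iterator<char>());
    if (fileData.empty())
        return 0;

    // Decode into 8-bit RGBA pixels.
    std::vector<unsigned char> pixels;
    unsigned long width = 0, height = 0;
    if (decodePNG(pixels, width, height, &fileData[0], fileData.size(), true) != 0)
        return 0;   // decode failed

    // Create the texture and upload the pixel data.
    GLuint texture = 0;
    glGenTextures(1, &texture);
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, (GLsizei)width, (GLsizei)height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, &pixels[0]);
    return texture;
}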

Funhouse Photo User Count: 1,933 total, 78 active. Still getting quite a few users from the profile box add link, according to my statistics.

Event Connector User Count: 87 total, 4 active. Very quiet, I haven’t been actively chasing up any leads for a few days.

Facebook debugging in PHP

I’m using PHP to develop my Facebook apps. I’d never done server-side development before, and one of the more awkward aspects is debugging. Luckily, I have done plenty of other remote debugging, and a lot of the general techniques carry over. Here’s what I use at the moment:

Error logging to a file

PHP writes out a message to /etc/httpd/logs/error_log whenever it hits a syntax error. I generally run this command line to see the latest errors:
tail -n 500 /etc/httpd/logs/error_log

You can also insert your own logging messages into the file using the error_log() function.

Error logging to the HTML output

Sometimes it’s quicker to see the errors right in the browser as you try to load the page. I don’t use this much for syntax errors, but I do often add print() statements to show debugging information. One of the most useful tools is the print_r() function, which displays all the internal elements of arrays and objects.

You can enable inline error reporting by setting display_errors to On in your php.ini (or with a php_flag display_errors on line in your Apache configuration) if you want to see syntax errors too.

Ajax errors

Debugging Ajax-style server requests was very tricky. One way I found in Facebook was setting the response type to RAW, and then setting that response text directly into a div. Even better was using FireBug, a great extension for Firefox that gives you a lot of information about what the browser is up to.

Choose Tools->Firebug->Open Firebug from the main menu, and click the checkbox in the lower pane that appears to turn it on. Navigate to your ajax page, and click on the Console tab in the FireBug pane. You should now see a POST entry appear for every ajax call you make, with the time it took to get a response. Twirling that open shows a complete description of what you sent, and what the server returned, in Headers, Post and Response tabs.

Advanced debugging

My needs are simple enough that I haven’t moved beyond these basic logging techniques, but you can check out this page if you’re interested in stepping through code and inspecting variables using a traditional debugger with PHP.

Facebook App User Counts:
Funhouse Photo – 917 total, 57 active. Still very unimpressive growth.
Event Connector – 4 total. I’m giving Google AdWords a try, to see if I can reach event organizers with the keywords ‘facebook event promotion’, and a very low budget.

The Planning Fallacy and Software


The Overcoming Bias blog had a great post on the Planning Fallacy a few days ago. They’ve got some great psych experiments to back the term up, but it boils down to people being inherently bad at figuring out how long a task will take. We always underestimate!

This may sound familiar to anyone who’s worked in software. Guy Kawasaki’s rule of thumb is to add six months to the worst-case shipping date you’re given, and that sounds right to me. What’s interesting about the post is that it not only documents the problem, it also offers a solution.

They describe the normal way people create time estimates as inside planning. This is where you look at the tasks you need to do, create estimates for each of them, and total them to get the final estimate. There’s no time for the tasks you forget, or anything unexpected. What’s interesting is that even when asked to produce a worst-case estimate, people don’t allow enough time.

Their solution is to use outside planning. For this you ignore the unique aspects of the project, and instead try to find a similar completed project, and look at how long that actually took. This usually produces a far more realistic estimate.

This sounds right to me; one of the strengths of an experienced team is that they have a lot of previous projects to compare against. It’s a very strong argument to say ‘well, we’re barely at alpha, and it took us six months to get from here to shipping on project X, so we need to rethink releasing in two months’. If both the team and management were involved in project X, it’s hard to ignore that.

Of course, one of the strengths of a greenhorn team is that they aren’t as cautious! They’re likely to over-commit, but with smart management scaling back the tasks once reality kicks in, they might still produce a better project overall.

Occasionally too, the Captain Kirk/Scotty management/engineering dynamic actually works; "Captain, it will take two weeks to fix!" / "Scotty, get it done in the next thirty minutes." Sometimes that pressure will force a rethink on the engineering side about how to fix something. Maybe there’s a quick hack that will solve 80% of the problem?

Funhouse Photo User Count: 885 total, 72 active daily. I think growth’s actually slowed since my recent changes, so I’ll definitely need to instrument and analyze some usage statistics to try and work out why, and also take a fresh look at the user experience.

Google Hot Keys Download Count: I’ll be occasionally showing my GHK download numbers too, since they’re growing pretty well. I’m up to 4894 total so far on the main Mozilla site. I’m not seeing many IE downloads from my own site, it’s not been approved for CNET downloads yet, and isn’t in any other distribution channels.

How to keep adding features while you rebuild


Brad Feld had a recent post about a classic scenario in software: you have an existing product that needs a complete strip-down and rebuild to meet new needs, but you can’t afford to stop releasing new features while the work’s being done. His advice is that you just need to grin and bear the pain until the rework is done. As an engineer, that’s definitely my preferred option; it’s much more efficient to keep the whole team focused on the rewrite.

Often though, engineering efficiency is beaten out by market, management and investor demands. So then what do you do?

The first time I ran across this was in games. Sports games demand annual releases. There’s no way you can rebuild a whole rendering engine and art pipeline in a year, so what EA did was plan for an under-the-hood rewrite every two years. In the off year they focused on adding things like championship simulations, updated graphics and new statistics. What amazed me was that users were just as happy with the off-year releases!

A few years later I was working on an image-processing product. We had a very short time to ship a second release, and we were struggling because most of the features we looked at required big rewrites. For fun, I’d written a MIDI interface to the program, even though it wasn’t music-related. It took me a weekend to write, and everyone on the team considered it an amusing curiosity, but users loved it. In the end it turned out to be one of the most talked-about features, and really helped the release.

So, what’s my advice? I try to follow the EA strategy, and identify engineering-light features that we can develop with minimal disruption to the rewrite. Try to keep an open mind and don’t automatically dismiss these as cosmetic. As long as they improve the experience of using your product, your customers will thank you!

Funhouse Photo User Count: 634 users, 51 active daily. I’m definitely seeing growth and active use tail off, and the obvious suspect is the async loading changes that cause the initial screen to take longer. I’m definitely hitting scaling problems myself here! I have some new effects in mind that I’d like to add, but it seems like that’s less urgent than fixing the loading.

How to add new effects to ImageMagick


ImageMagick isn’t focused on artistic effects, so I’ve had to port over some of my work from PetesPlugins to give Funhouse Photo a bit more flair. So far I’ve added distortion maps, erode and dilate operators, and scaling an image to the size of another image. I will eventually try and merge what I can into the main branch, but for now my current source code is up at http://funhousepicture.com/ImageMagick-6.3.5.tar.

The first hurdle is compiling the stock ImageMagick code base. You can either download the source as a tar, or grab it from Subversion. I very strongly recommend you stick to Linux; I didn’t have much luck with either the Windows or OS X versions I attempted. The source didn’t contain the VisualMagick folder that I needed for Windows compilation, and once I’d found that in a special Windows version of the source tree, it turned out that they don’t ship any workspace files, but instead require you to compile a utility that produces the project files. Since this utility requires MFC, it won’t work with Visual Studio Express, so I had to abandon Windows.

My next shot was on OS X, and I was anticipating a smooth Unix ride. Unfortunately, getting the libjpeg, libtiff and libpng libraries set up quickly turned into yak-shaving. I got several levels of recursion deep (using fink to grab the dependent libraries failed to persuade configure to find their headers, building them myself on OS X turned out to need obscure hand-hacking of libpng’s make files, which required strange changes to the IM code, etc.) before I decided to give up on building locally and stick to building remotely on my server, despite the long turnaround time on compiler errors.

Building IM on my Red Hat Fedora Linux server was super smooth. I just grabbed the source, ran yum install libjpeg-devel, etc to install the libraries I needed, and I was building in no time.

The architecture is split up into several different layers. I’ll cover the changes I needed to make to put in the erode and dilate operators. The lowest level is the base magick directory, and effect.c holds most of the base effect methods. There are two versions of each method: one applies to all color channels, and is just a stub that passes an ‘all channels’ flag as the channel mask through to the second, workhorse function that does the actual processing.
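
As a rough sketch of that two-level pattern, here’s roughly what the entry point for an added erode operator might look like. The names follow existing pairs such as BlurImage()/BlurImageChannel() in effect.c, but this is a simplified fragment from memory rather than exact code, so treat the details as illustrative.

// Hypothetical sketch of the stub-plus-workhorse pattern in magick/effect.c.
// ErodeImageChannel() would be the channel-aware function that does the real
// pixel work; the channel-less version just forwards to it.
Image *ErodeImage(const Image *image, const double radius,
  ExceptionInfo *exception)
{
  // Pass an "all channels" mask through to the workhorse.
  return(ErodeImageChannel(image, DefaultChannels, radius, exception));
}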

Once you’ve got methods in there, you need to let the interface layers know about them. There are a lot of different interfaces to the base processing layer, including scripting, C++, Perl and a direct GUI, but I’m mostly interested in the command-line tools. Here are the places I found I needed to add references to any new function, to make it visible to convert:

  • magick/methods.h : Not sure about this one, but adding the declaration here didn’t hurt.
  • wand/convert.c : You need to add an if (LocaleCompare(<command name>)… clause into the massive ConvertImageCommand() function here, to make sure the argument is accepted by convert. You don’t actually do any work here, apart from validating the arguments.
  • wand/mogrify.c : The actual image processing for convert is handled by mogrify, which focuses on dealing with a single image. You’ll need to add another string compare here in MogrifyImage(), and actually call the image processing commands on a match.

There are other interfaces to IM that also need to be told about any new operators, but since I don’t use them, and couldn’t test any changes, I’ve not looked into how to do that. You also need to do things a bit differently if you’re adding a composite operation, or an image sequence command, but I’ll cover those in a future article. I’ll also be asking the IM folks to look over all of this, since I’m new to the code base.

Funhouse Photo User Count: 401 total, 116 active. Another steady day of growth, with 30% of the users active. I also had a sweet review from JasMine olMO, "i love this!! this is like the best mood app out there"!

ImageMagick review


I’ve known about ImageMagick for a while, and I’ve been looking for a chance to try it out. If you haven’t run across it, it’s a command-line image processing tool, and having finally played with it for a facebook app, I’m very impressed!

It’s installed everywhere

I was very pleasantly surprised to find that my dreamhost account already had version 6.2 pre-installed. Even more surprising, my old and usually-sparse WebHSP server does too! This meant no messing around trying to build it from source, or install the binaries and deal with dependencies. The only disappointment was finding that it wasn’t built in to OS X on my home machine, though there are binaries available of course.

It’s heavily used

The framework is in its sixth version, and it’s obviously being used by a lot of people. This gives it two big advantages; there’s been a lot of testing to catch bugs, and there’s some great documentation available. I particularly love the usage documentation. It’s bursting at the seams with examples solving practical problems, and that’s the way I learn best.

It’s both elegant and full of features

It must have been a tricky balancing act to get through six versions, with all the changes in people and requirements, and keep the conventions for all the commands consistent whilst constantly adding more. The way the image flow is specified is necessarily pretty hard to visualize, since it’s inherently a tree structure that’s been compressed into a line of text, but since all the operations work with the flow in the same way, it’s possible to figure out what’s going on without referring to the documentation on specific options.
The features include a wide and useful range of built-in filters, and a variety of expansion mechanisms. The only thing I missed was a filter plugin SDK; it would be nice to be able to extend the current set with third-party operations.

It’s fast

I haven’t measured performance quantitatively, but I’ve been experimenting with doing heavy operations on large arrays of images across the web, and I haven’t seen any lag yet.

As an example, here’s how to call ImageMagick to generate a thumbnail through PHP. I stole this from the excellent dreamhost support wiki, and you’ll need to make sure you can exec() from your version of PHP if you’re on another hosting provider.

<?php
       // Path to the ImageMagick convert binary.
       $location = '/usr/bin/convert';
       // Resize so the longest side is 150 pixels.
       $command = '-thumbnail 150';
       $name    = 'glass.';
       $extfrm  = 'jpg';
       $extto   = 'png';
       $output  = "{$name}{$extto}";
       // Builds: /usr/bin/convert -thumbnail 150 glass.jpg glass.png
       $convert = $location . ' ' . $command . ' ' . $name . $extfrm . ' ' . $name . $extto;
       exec($convert);
       // Display the generated thumbnail.
       print '<img src="' . $output . '">';
?>

BHO Examples

One of the problems I ran into when I started looking at converting PeteSearch from a Firefox add-on to an Internet Explorer extension was finding good example code. There are a couple of great articles on MSDN that cover how to write a BHO, but neither includes complete source code. Here’s the list of all the code samples I managed to find, and a couple I wrote myself:

A very simple skeleton BHO that I wrote:
http://petewarden.typepad.com/searchbrowser/2007/05/porting_firefox_1.html

A more complex example based on my Firefox add-on:
http://petewarden.typepad.com/searchbrowser/2007/06/porting-firefox.html

From Sven Groot, the source to his FindAsYouType extension:
http://www.ookii.org/software/findasyoutype/
http://www.ookii.org/download.ashx?id=FindAsYouTypeSrc

The SurfHelper pop-up blocker from Xiaolin Zhang:
http://www.codeproject.com/shell/surfhelper.asp

A simple example covering all the BHO events from René Nyffenegger:
http://www.adp-gmbh.ch/win/com/bho.html

More posts on porting Firefox add-ons to IE

XMLHttpRequest in C++ on Windows Example

In the first three parts I covered how to get a simple IE extension built. For PeteSearch I need to be able to fetch web pages, so the next step is to figure out how to do that in Internet Explorer.

Firefox lets extensions use XMLHttpRequest objects to fetch pages. It’s quite lovely: well documented, well tested since it’s the basis of most Ajax sites, and with an easy-to-use interface. The first thing I looked for was an IE equivalent.

There’s a COM interface called IXMLHTTPRequest that looked very promising, with almost the same interface, but implementing the asynchronous callback in C++ turned out to require some very gnarly code. It was also tough to find a simple example that didn’t drag in a lot of ATL and MFC cruft, it depended on a fairly recent copy of the MSXML DLL, and there were multiple different versions of that to deal with. All in all, I ruled it out because it was sucking up too much time, and I dreaded the maintenance involved in using something so complex.

There’s also the newer IWinHttpRequest object, but that’s only available on XP, 2000 and NT 4.0, and it seems far enough off the beaten track that there’s not much non-MS documentation on it.

I finally settled on a really old and simple library, WinINet. It’s a C-style API, lower-level than XMLHttpRequest and a bit old-fashioned, with some situations that require polling, but it offers a full set of HTTP handling functions. It’s also been around since 1996, so it’s everywhere, and there are lots of examples out on the web. Since I liked the XMLHttpRequest interface, I decided to write my own C++ class implementing the same methods, using WinINet under the hood.
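
To give a flavor of the WinINet calls involved, here’s a minimal synchronous fetch. This isn’t the asynchronous wrapper class described below, just the bare API with error handling pared down, and the function name is my own.

#include <windows.h>
#include <wininet.h>
#include <string>
#pragma comment(lib, "wininet.lib")

// Minimal synchronous page fetch with WinINet. The real wrapper class adds
// an XMLHttpRequest-style interface and asynchronous callbacks on top of
// calls like these.
std::string FetchUrl(const char* url)
{
    std::string result;
    HINTERNET session = InternetOpenA("PeteSearchExample",
        INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (session == NULL)
        return result;

    HINTERNET request = InternetOpenUrlA(session, url, NULL, 0,
        INTERNET_FLAG_RELOAD, 0);
    if (request != NULL)
    {
        char buffer[4096];
        DWORD bytesRead = 0;
        // Keep reading until InternetReadFile reports no more data.
        while (InternetReadFile(request, buffer, sizeof(buffer), &bytesRead) &&
               bytesRead > 0)
            result.append(buffer, bytesRead);
        InternetCloseHandle(request);
    }
    InternetCloseHandle(session);
    return result;
}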

Here’s the code I came up with. It implements a class called CPeteHttpRequest that has the classic XMLHttpRequest interface, with a simple callback API for async access. I’m making it freely available for any commercial or non-commercial use, and I’ll cover my experiences using it with PeteSearch in a later article.

Edit – It turns out that WinInet is actually very prone to crashing when used heavily in a multi-threaded app. You should use my WinHttp based version of this class instead.

More posts on porting Firefox add-ons to IE

Porting Firefox Extensions to Internet Explorer Part 3

In the first two parts I covered setting up our tools, and creating a code template to build on. The next big challenge is replacing some of the built-in facilities of Javascript, like string handling, regular expressions and DOM manipulation, with C++ substitutes.

For string handling, watch out for the encoding! Most code examples use 8 bit ASCII strings, but Firefox supports Unicode strings, which allow a lot more languages to be represented. If we want a wide audience for our extension, we’ll need to support them too.

C++ inherits C’s built-in strings, as either char (for ASCII) or wchar_t (for Unicode) pointers. These are pretty old-fashioned and clunky to use: common operations like appending two strings involve explicit function calls, and you have to manage the memory allocated for them yourself.

We should use the STL’s string class, std::wstring, instead. This is the Unicode version of std::string, and supports all the same operations, including appending just by using "+".  The equivalent of indexOf() is find(), which returns std::wstring::npos rather than -1 if the substring is not found. lastIndexOf() is similarly matched by find_last_of(). The substring() method is closely matched by the substr() call, but beware: the second argument is the length of the substring you want, not the end index as in JS!
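
Here’s a short sketch of those translations side by side; the URL and values are just made up for illustration.

#include <string>
#include <iostream>

// A few of the JavaScript-to-std::wstring translations described above.
int main()
{
    std::wstring url = L"http://example.com/search?q=petesearch";

    // JS: a + b            ->  operator+ works the same way on std::wstring.
    std::wstring greeting = std::wstring(L"Hello, ") + L"world";

    // JS: url.indexOf("?") ->  find(); returns std::wstring::npos if missing.
    std::wstring::size_type queryPos = url.find(L"?");

    // JS: url.lastIndexOf("/") -> find_last_of() for a single character.
    std::wstring::size_type lastSlash = url.find_last_of(L'/');

    // JS: url.substring(start, end) -> substr(start, length). The second
    // argument is a length, not an end index. Both positions are found in
    // this example URL, so the npos checks are skipped here.
    std::wstring path = url.substr(lastSlash, queryPos - lastSlash);

    std::wcout << greeting << L" - path is " << path << std::endl;
    return 0;
}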

For regular expressions, our best bet is the Boost Regex library. You’ll need to download and install Boost to use it, but luckily the Windows installer is very painless. Once that’s done, we can use the boost::wregex object to do Unicode regular expression work (boost::regex only handles narrow strings). One pain when dealing with REs in C++ is that you have to double up the backslashes in the string literals you build them from, so that to get the expression \?, you need the literal "\\?", since the compiler otherwise treats the backslash as the start of a C character escape. The regular expression functions themselves are a bit different from Javascript’s: regex_match() only returns true if the whole string matches the RE, and regex_search() is the one to use for finding matches within a string.
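
Here’s a small sketch of that in practice, assuming Boost is installed and on your include path; the pattern and URL are just illustrative.

#include <string>
#include <iostream>
#include <boost/regex.hpp>

// Unicode regular expression matching with boost::wregex.
int main()
{
    // Match a query parameter like "q=something". Note the doubled
    // backslash: the source literal "\\w" becomes the regex \w.
    boost::wregex paramPattern(L"q=(\\w+)");

    std::wstring url = L"http://example.com/search?q=petesearch&start=10";

    boost::wsmatch what;
    // regex_search() looks for the pattern anywhere in the string;
    // regex_match() would only succeed if the whole string matched.
    if (boost::regex_search(url, what, paramPattern))
        std::wcout << L"Search term: " << what[1].str() << std::endl;

    return 0;
}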

DOM manipulation is possible through the MSHTML collection of interfaces. IHTMLDocument3 is a good start; it supports a lot of familiar functions such as getElementsByTagName and getElementById. Working with the DOM does involve a lot of COM query-interfacing, so I’d recommend using ATL smart pointers to handle some of the house-keeping with reference counts and casting.
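
As a rough sketch of what that looks like, here’s how you might pull all the link hrefs out of a page, assuming you already have the document’s IHTMLDocument2 pointer (a BHO can get that from the browser’s get_Document). The function name is mine; CComPtr, CComQIPtr and CComBSTR do the reference-counting and BSTR housekeeping.

#include <windows.h>
#include <mshtml.h>
#include <atlbase.h>
#include <vector>
#include <string>

// Collect the href of every <a> element in the document.
std::vector<std::wstring> GetAllLinkHrefs(IHTMLDocument2* document2)
{
    std::vector<std::wstring> hrefs;

    // getElementsByTagName lives on the IHTMLDocument3 interface.
    CComQIPtr<IHTMLDocument3> document3(document2);
    if (!document3)
        return hrefs;

    CComPtr<IHTMLElementCollection> anchors;
    if (FAILED(document3->getElementsByTagName(CComBSTR(L"a"), &anchors)) || !anchors)
        return hrefs;

    long count = 0;
    anchors->get_length(&count);
    for (long i = 0; i < count; ++i)
    {
        // item() takes VARIANT arguments; pass the index in both slots.
        CComVariant index(i);
        CComPtr<IDispatch> dispatch;
        if (FAILED(anchors->item(index, index, &dispatch)) || !dispatch)
            continue;

        CComQIPtr<IHTMLAnchorElement> anchor(dispatch);
        if (!anchor)
            continue;

        CComBSTR href;
        if (SUCCEEDED(anchor->get_href(&href)) && href)
            hrefs.push_back(std::wstring(href));
    }
    return hrefs;
}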

PeteSearch is now detecting search page loads, and extracting the search terms and links from the document. Next we’ll look at XMLHttpRequest-style loading from within a BHO.

More posts on porting Firefox add-ons to IE