Queuing up commands in unix

I’m trying to lay down some decent foundations for processing images for a few thousand users from a single server. Because of the thrashing problem, one of the building blocks has to be a single queue for the CPU intensive image processing commands. Thrashing is when you run out of physical RAM, and you end up spending the majority of your time swapping virtual memory blocks on and off disk. This usually happens when there are multiple memory-hungry processes competing, and as they’re time-sliced, they repeatedly push each other’s memory out of RAM, and pull their own back in. The result is that they actually take a lot longer to all finish than if they’d each run in sequence.

Well, this is an old, old problem, so I was hopeful that there’d be a tried and trusted unix mechanism to deal with it. The simplest way to avoid thrashing for my case is to make sure such operations do run one at a time, by keeping a single queue, and only running the next in the queue after the previous one is complete.

After some digging, I did discover batch, and it looked very promising. I ran into a few wrinkles. It expected a script file on disk as an input, rather than the dynamic commands I’d be feeding it, but I was able to avoid that by piping the output of an echo "<command to execute>" into the batch command. More serious was error reporting; it will mail the results of batch executions via sendmail, but there doesn’t seem to be any other way of accessing that data. My simple tests worked, judging by the results I saw on disk, but some of the more complex ones failed, and I haven’t been able to work out why yet, because none of my server accounts have mail set up.
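To make the single-queue idea concrete, here is a minimal sketch in Python (not the batch command or PHP used on the actual server). It assumes one worker that runs each command to completion before starting the next, and it captures exit codes and error output directly so nothing depends on mail being set up. The name run_serially and the stand-in jobs are hypothetical.

```python
import queue
import subprocess
import sys
import threading


def run_serially(commands):
    """Run each command to completion before starting the next one.

    Returns a list of (command, exit_code, stdout, stderr) tuples, so a
    failed job can be inspected directly instead of arriving by mail.
    """
    pending = queue.Queue()
    for command in commands:
        pending.put(command)
    pending.put(None)  # sentinel: tells the worker there is no more work

    results = []

    def worker():
        # A single consumer guarantees only one job runs at any moment.
        while True:
            command = pending.get()
            if command is None:
                break
            finished = subprocess.run(command, capture_output=True, text=True)
            results.append(
                (command, finished.returncode, finished.stdout, finished.stderr)
            )

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return results


if __name__ == "__main__":
    # Stand-in jobs; a real queue would hold image processing commands.
    jobs = [
        [sys.executable, "-c", "print('resized')"],
        [sys.executable, "-c", "import sys; sys.exit('conversion failed')"],
    ]
    for command, code, out, err in run_serially(jobs):
        print(code, out.strip(), err.strip())
```

In a long-running service the worker would stay alive and jobs would be added as pages are requested; the sketch drains a fixed list only to keep it short.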

Once I’ve got that working, my plan is to put all image processing commands into the queue, rather than executing them synchronously with the PHP page generation as I am at the moment. Then I’ll need to have a timeout call in the HTML itself, to refresh the page frequently until all the images have been completed. With all that in place, I should be able to show users some evidence that things are happening even when the server’s under a heavy load, rather than the blank page they see at the moment.

Funhouse Photo User Count: 268, active 78. Good to see it still climbing, and I’m not losing too many people. The active count is down, but that’s still around 40% of the users at the time the count was taken, so a good base to build on.

Engagement changes to Facebook stats

Facebook has just introduced some new default statistics for applications. Previously they would just show the total number of users that had installed the application; now they’re trying to measure how many people are actually using the app every day. This is an interesting change since, as they say, you start off trying to measure what you value, and end up valuing what you measure. For Facebook apps, measuring the total number of users favored those that were shallow, but spread widely, whereas this new measure is designed to capture how much users actually engage with an app.

The measure is shown as a percentage of your users who actually take an action that interacts with your app every day. There are various actions captured, but they all involve the user actually doing something with your app.

The first day’s figures were very kind to Funhouse Photo, showing 86% of the users as active the previous day. I’d expect this to drop a bit as some of the novelty wears off, but I’m planning on regularly adding new effects in the hope this will keep people playing around with it, and sending photos to their friends. I’m also thinking about showing every photo change in your feed; I was worried that was a bit spammy, but I’d like to experiment to see if it is actually useful.

Funhouse Photo User Count: 200. Disappointing to be honest. It’s still climbing, but with a fairly shallow slope. I shall be able to see if new effects and some more visible notifications give it a boost.

The new server is up and running

I’ve now got the new server up and running at http://funhousepicture.com/. I’m holding off on moving the app from dreamhost until the DNS has had a good chance to propagate, but I’ve got a test version running successfully.

It was a really interesting experience getting the server running. cari.net is definitely set up for people wanting to resell server space, rather than my simpler needs. They also assume a good knowledge of linux server administration, which is a reasonable expectation, but one which I struggled to meet! Luckily Google came to my rescue and I was able to struggle through:

  • Setting up ssh for multiple users
  • Installing and building ImageMagick and its dependencies
  • Removing the virtual host bumf that Plesk had set up for me
  • Sorting out the file permissions so the apache process could actually work with my non-root user
  • Removing autoindex (mod_autoindex) from apache, since I really don’t want directory listings exposed to the world by default!

I was able to get that all done early this morning, and this evening, so the move should be possible tomorrow.

Funhouse Photo User Count: 164. It’s gone up more slowly than yesterday, but I am seeing plenty of people coming in from notifications their friends have sent them, which is promising.

Server overload!

Funhouse Photo appeared on Facebook’s directory a few hours ago, and it’s up to 106 users already. The bad news is that I’m starting to see occasional timeout messages when I’m testing it. Looking at the CPU usage on my server using the unix command ‘top’, I see spikes that seem to be caused by the heavy image processing demands of creating the photos. I’m using a shared server on dreamhost and I was hoping to avoid upgrading until I had more users. Unfortunately it looks like this is the point where I need to switch to keep things reliable.

The next step up is to get a dedicated machine, rather than sharing the CPU with other users. I’ve been very happy with dreamhost, but their dedicated server packages start at $400 a month. That’s a bit much, so I’ve gone with a mid-range package from cari.net instead, at $135 setup and $135 monthly.

I will be very interested to see how this works now, and scales as usage increases. I may need to rewrite the image generation to make it more asynchronous, so that the CPU load is smoothed out rather than spiky. One way to do this would be to only create a few images at a time, and progressively load each page, but I’ll think on that some more…

Funhouse Photo User Count: 106, after today’s inclusion in the directory.

Heart Attack Hill

Picture_115

After all the coding yesterday, Liz and I spent the evening on a bike ride in the Simi hills. We started off at the south end of First Street and went exploring on a different route than usual. We were lucky enough to see the ocean from the top of one of the hills; that’s the gleam at the bottom of the picture.

There’s no good map available of the loop we normally take, so I’ve experimented with Google maps. Here’s a link to the one I created, and I’ve embedded it below.

I did manage to achieve a first yesterday when I made it all the way up ‘heart attack hill’ without stopping. This is a really steep section of fire road where I’ve always made it 50 or 75% of the way up before having to walk, but it looks like all the lunchtime riding I’ve been doing in Santa Monica, including a hill-climb up Temescal Canyon, paid off!

Facebook development tips

Here are some techniques I picked up during Funhouse Photo’s development.

Create a dummy user for testing notifications

Unless you’ve got a trusting friend willing to give you access to their account, you’ll need a second account for yourself to make sure that your notification and request code is working. This made me a bit uncomfortable, since one of the things I like about Facebook is how closely accounts correspond with real people. I did try to get by without using a second account for a while, and gave up when I realized I was probably starting to spam my friends, since it was sometimes unclear if a message had actually been sent.

Have a second copy of your app to use as a sandbox

I’m sure this is standard web app development practice, but as a client programmer, it wasn’t immediately obvious. Create a second app in Facebook (Funhouse Photo Testbed in my case) and use that for all your development work. When you’ve got a new version stable, change the config and copy it over the old version, so users see an instant change from one version to another.

Enable debug output from the API

Add this line to the top of your PHP files, before you include facebook.php:

$GLOBALS['facebook_config']['debug']=true;

You’ll now see output for every facebook call you make. I wasn’t able to twirl open the categories, so I also edited call_method() in facebookapi_php5_restlib.php to have their styles default to visible.

Use GET rather than POST for passing data

I find it a lot easier to check the address bar to understand what data’s being passed than find ways to dig into the POST data (eg using LiveHttpHeaders). This is against the standard meaning of GET, since it’s not supposed to have any side-effects, but I’ve left it active in the released version anyway, since I enjoy life on the edge. It’s probably preferable to switch to POST for releases, and I’ll be looking at that in the future.

GoogleHotKeys version 1.01 released

I’ve just uploaded the latest version of GoogleHotKeys for both IE and Firefox. The main site links to the addons.mozilla.org site for Firefox, and that may take a day or two to be updated. You can download it directly here until then. Changes include:

  • Pressing N takes you to the next page of search results
  • I’ve stopped the arrow keys from moving you between highlighted terms, since that was sometimes unhelpful
  • Fixed a few assorted bugs, such as the IE version forgetting which link was selected when you returned to a results page, and FF not correctly ignoring the Desktop search link in results pages.

It went very smoothly, apart from the final step of persuading WiX to create an upgrade installer for the IE addon. I assumed that this would just involve updating the version number, but it turned out to be a bit of a rabbit hole. I ended up cheating, and changing the installer GUID, which will result in some duplicate files on disk for upgraders, and a duplicate entry in add/remove programs, but seems to work.

Kicked in the nuts by Facebook

I had a really low moment yesterday. I’d just finished the request-sending part of Funhouse Photo, and it was working like a dream. The request contained a picture of the recipient that the sender had put through Funhouse, along with a short message. I sent a few test messages, and everything looked good.

Then, it stopped working. Instead of the app-supplied image, the request now showed the plain photo of the sender. This completely killed the effectiveness of the request! Before, it was really compelling because you actually saw a photo of yourself run through the software right there in the message. Now it was unclear what Funhouse Photo actually offered.

I discovered that Facebook had decided to start ignoring the app-supplied image. They didn’t explain why in their announcement, but I can only guess there were some security concerns, though I’m having a hard time picturing the problems. This is why I hate, hate, hate having platform dependencies, especially on something that is so new and fluid. They can make a decision that has a devastating impact on your product, and you’ve got no way out but trying to work around it.

Anyway, I sucked it up, and completed an alternative implementation that uses notifications instead of app requests. There’s still no photo visible until you click, so it’s definitely not as good as my original setup, but should still provide a fun service for my users.

Talking of users, I’m now up to a grand total of 7! The app is still waiting in the queue for the directory, but the interesting thing is that two of my users aren’t in my friends network, which means they must have spotted it on one of my friends’ profiles and decided to add it. That’s very encouraging.

Security hole in Facebook Footprints example

As I was working on Funhouse Photo last night, I ran into a bug where a database insertion operation was failing. I realized that this was because I had copied the Footprints example, and created the query using string concatenation, eg:

$res = mysql_query('SELECT `from`, `to`, `time` FROM footprints WHERE `to`=' . $user . ' ORDER BY `time` DESC', $conn);

In my case, it was an internally generated string that was causing the problem, because it contained an unescaped single quote. I realized that I was also passing in strings I’d received from the user via POST. This means that someone could easily run an SQL injection attack, and get full access to my database!

An injection attack works by passing a user input string that contains a quote and semi-colon to prematurely finish the query command the host thinks they’re running. The rest of the string contains some other arbitrary command that the attacker wants to run. In the query above, if $user was actually passed as

12345'; DROP TABLE footprints; --

then, assuming the query wraps the value in single quotes (as it would for a string parameter), the complete query would read

SELECT `from`, `to`, `time` FROM footprints WHERE `to`='12345'; DROP TABLE footprints; --' ORDER BY `time` DESC

In this example a table’s deleted, but you could insert almost any SQL statement.

(Edit – PatMan pointed out in the comments that mysql actually has a nice security feature that only allows a single statement in a PHP mysql execute. This is reassuring, and it definitely makes it harder to do any damage, though it still leaves the door open to plenty of data-access exploits.)

There are two main ways to avoid this. In PHP4 and earlier, magic quotes is turned on by default, which means all posted strings have their quotes auto-magically prefixed with backslashes. This turned out to be a dodgy solution for a lot of reasons, and it was turned off by default in PHP5, which is what I’m running.

A better fix, and the one I ended up using, is to call mysql_real_escape_string() on any strings before you concatenate them into a query. You could also use something other than concatenation to create your query.
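As an illustration of the "something other than concatenation" option, here is a minimal sketch using Python's built-in sqlite3 module instead of PHP and MySQL. The table layout and the function names are simplified, hypothetical stand-ins for the Footprints example. It shows the kind of data-access exploit that still works with a single statement, and how a placeholder keeps user input as a value rather than as part of the SQL.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory table loosely modelled on footprints."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE footprints (sender TEXT, recipient TEXT, time INTEGER)"
    )
    conn.executemany(
        "INSERT INTO footprints VALUES (?, ?, ?)",
        [("alice", "12345", 1), ("bob", "67890", 2)],
    )
    return conn


def find_unsafe(conn, user):
    # Vulnerable: the user's text becomes part of the SQL statement itself.
    sql = (
        "SELECT sender FROM footprints WHERE recipient = '"
        + user
        + "' ORDER BY time DESC"
    )
    return conn.execute(sql).fetchall()


def find_safe(conn, user):
    # The ? placeholder passes user as data, so quotes in it have no effect.
    sql = "SELECT sender FROM footprints WHERE recipient = ? ORDER BY time DESC"
    return conn.execute(sql, (user,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    attack = "x' OR '1'='1"
    print(find_unsafe(conn, attack))  # every row leaks
    print(find_safe(conn, attack))    # no recipient has that literal name
```

The same placeholder idea exists in PHP's database libraries; escaping each string by hand works too, but it only takes one forgotten call to reopen the hole.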

Now, I’m a PHP/mysql newbie, but it seems like a pretty Bad Thing that Facebook’s example app contains this security hole. Newbies like me are also the ones least likely to already know that it’s a security problem, and so will copy it blindly into their own apps.

Hopefully the guys over at Facebook will be able to update the example, before it infects too many other apps, and gives hackers access to all the personal data that’s being stored. I have posted on the developer discussion board, but it’s unclear if there’s any other way to alert the team.

Funhouse Photo User Count: 5. I worked on the ‘send a photo to a friend’ feature last night, but that’s still not working (a story for another post), so I’ve held off publicizing it. It is in the queue for the directory now though.

Facebook app statistics

One of the things I really like about having an app on Facebook is that I can easily see how many people are using it. This may sound trivial, but for my Firefox and IE extensions, I’m stuck trying to estimate usage based on downloads. Now I’m on the main Firefox add-on site, I can get stats on downloads from there, but they’re a bit opaque:
Firefoxstats

It’s good to see I’m approaching 4000 downloads in the past few weeks, but until recently that total had stayed at 867 for a long time. And that per-week total has always shown 1. So I’m not convinced they’re very up-to-date, though I’m pretty sure they aren’t over-counting downloads, just slow in showing them.

As an aside, these stats indicate that focusing my message really paid off. PeteSearch had only around 200 downloads after several weeks, on the same site.

For my IE plugin, I have to go off my web site logs. I’ve used some of the built-in traffic-monitoring tools, but they all seem geared towards overall site statistics rather than the particular measure I want: "How many people with agents you recognize (eg not robots) downloaded this file over a particular period of time? How many since the beginning of time, and were they repeat customers (ie upgrading)?"
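The question in quotes can be answered from a raw access log with a short script. Below is a rough Python sketch, assuming the server writes the common Apache "combined" log format. The bot hints and the function name are hypothetical choices, and counting repeat downloads by host address is only an approximation of repeat customers.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short list of substrings that suggest a robot.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def count_downloads(lines, path, start, end):
    """Return (total downloads, hosts that downloaded more than once).

    Only successful requests for `path` with start <= time < end are
    counted, and agents that look like robots are skipped.
    """
    per_host = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # ignore lines that are not in the expected format
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        agent = match["agent"].lower()
        if (
            match["path"] == path
            and match["status"] == "200"
            and start <= when < end
            and not any(hint in agent for hint in BOT_HINTS)
        ):
            per_host[match["host"]] += 1
    repeat_hosts = sum(1 for count in per_host.values() if count > 1)
    return sum(per_host.values()), repeat_hosts
```

The start and end arguments need to be timezone-aware datetime values, since the log timestamps carry a UTC offset. Passing a very early start date covers the "since the beginning of time" case.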

What I end up doing is just looking at the raw latest visitors log, and getting a qualitative impression of downloads. What I see is that Firefox users outnumber IE by at least ten to one. There are several possible explanations for this:

  • My promotion’s been a lot more effective in the Firefox market.
  • People who are early-adopters interested in something like GoogleHotKeys are also more likely to have adopted Firefox.
  • IE users don’t have much of an add-on selection, so they’re unused to installing extensions, and are probably more wary of the security risks.

Even once you know the download totals, unless you build in an unpopular phoning-home capability, it’s hard to know how many people are actually using it. I have a help link added to every Google page that users are free to click on, so that is at least an existence proof for usage; if I see it showing up, I know somebody’s gone ahead and installed it.

Facebook gives both the developer and the rest of the world a simple and up-to-date view of how many people are using an app. This should help me understand what’s working and what isn’t, and learn from my customers very quickly. It’s also a good motivating score-card!

To provide some edutainment, I’ll include the current number of Funhouse Photo users in every post, along with a short explanation of anything that’s happened to explain them. Here’s the first report:

5 users – This includes my sister! I’ve sent out links to a few people to test it, but it’s not in the directory, and I haven’t tried to promote it in any other ways yet, since there are still some bugs to iron out.