So far, we’ve created a server program that will accept a single client connection and then exit. The next step is to handle more than one client. Have a look at the final code that does this, and then I’ll explain how it works.
The most obvious way to handle multiple requests is to loop on accept() inside main(), pulling a new client socket for every connection and holding the client conversation inside the loop. The flaw with this plan is that transferring data back and forth between the server and a client might take a comparatively long time, and any other clients trying to connect will be blocked until the first client is completely done.
What we need is a way to split up the work so we can have multiple conversations in progress at the same time. This is efficient because networks are a lot slower at transferring data than processors are, so in the gaps where the server is waiting for a client response there’s lots of CPU time to handle other connections, especially on multi-core machines.
There are three main techniques for splitting up the work. Probably the most complex, but also the most flexible, is using non-blocking sockets and select() to run a single loop that pulls chunks of data from multiple connections. This pattern does a small amount of work for each chunk before looping around and dealing with another conversation in the next iteration. thttpd is an excellent example of this approach. The complexity comes from the careful design needed to make sure that the work you do every time you deal with a chunk of data always runs quickly enough for other conversations to be handled responsively. It’s also hard to make this event-based model run well on more than one processor.
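To make the event-based pattern concrete, here’s a minimal sketch of a single select() loop watching two descriptors and handling whichever has data ready. Pipe read-ends stand in for client sockets so the example is self-contained; the function and variable names are illustrative, not from the real server.

```c
#include <sys/select.h>
#include <unistd.h>

// One loop multiplexing two descriptors: do a small amount of work for
// whichever has a chunk ready, then loop around for the next chunk.
int drainTwoDescriptors(int fdA, int fdB) {
    int totalBytes = 0;
    int openCount = 2;
    while (openCount > 0) {
        fd_set readable;
        FD_ZERO(&readable);
        if (fdA >= 0) FD_SET(fdA, &readable);
        if (fdB >= 0) FD_SET(fdB, &readable);
        int maxFd = (fdA > fdB) ? fdA : fdB;
        if (select(maxFd + 1, &readable, NULL, NULL, NULL) < 0) return -1;
        int* watched[2] = { &fdA, &fdB };
        for (int i = 0; i < 2; i++) {
            int fd = *watched[i];
            if (fd < 0 || !FD_ISSET(fd, &readable)) continue;
            char chunk[64];
            ssize_t got = read(fd, chunk, sizeof(chunk));
            if (got <= 0) { *watched[i] = -1; openCount--; }  // EOF: stop watching
            else totalBytes += (int)got;
        }
    }
    return totalBytes;
}
```

A real server would also add the listening socket to the set and accept() when it becomes readable; the hard part, as described above, is keeping each per-chunk step short.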
The other two approaches are a lot simpler to write. You can create a new process to deal with every client conversation, either using fork() or even by explicitly invoking a command line the way inetd does. This has the advantage of being incredibly simple to write, but on some OSes (though not Linux) creating a new process can be time-consuming and inefficient. For my purposes I also need to share resources between the conversations, since the goal of my server is to allow fast access to preloaded word frequency data. There are mechanisms for communicating between processes that would allow this, but the simplest way to do it is using threads.
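The fork()-per-conversation pattern can be sketched like this. A pipe and a trivial doubling “conversation” stand in for the real socket traffic, and all the names are illustrative assumptions, not the article’s server code:

```c
#include <sys/wait.h>
#include <unistd.h>

// Sketch of fork()-per-client: the child handles one conversation while
// the parent would normally loop straight back to accept(). Here the
// "conversation" just doubles a number sent over a pipe.
int handleOneClientForked(int request) {
    int conversationPipe[2];
    if (pipe(conversationPipe) != 0) return -1;
    pid_t child = fork();
    if (child == 0) {
        // Child process: play the server's side of the conversation.
        int reply = request * 2;
        write(conversationPipe[1], &reply, sizeof(reply));
        _exit(0);
    }
    // Parent: a real server wouldn't wait here, it would accept() again.
    int reply = -1;
    read(conversationPipe[0], &reply, sizeof(reply));
    waitpid(child, NULL, 0);
    close(conversationPipe[0]);
    close(conversationPipe[1]);
    return reply;
}
```

Note that after the fork() the child gets a copy of the parent’s memory, which is exactly why sharing the preloaded word frequency data between conversations is awkward in this model.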
Threads are like lightweight processes, except that they share full access to all the same memory as the parent. This is both a blessing and a curse: thanks to all the variations that timing and resource locking introduce, threading bugs can be incredibly time-consuming to track down, and there can be subtle bugs in even simple threading code.
The basic idea for threading the server is that we’ll create a new worker thread for every client connection. This thread will carry on the conversation, but it will actually be paused and control passed to a different thread every so often, so that both the main listener and the other worker threads get a chance to do the work they need to. The beauty of the thread model is that the actual data transfer code looks purely procedural, just like the original single-connection version; there’s no alteration to the internal logic needed to deal with the multi-tasking.
The most common threading library is the POSIX standard, pthreads, though Windows has its own libraries for this. I’ve implemented the new server code using pthreads, adding -lpthread to the compile flags to ensure the library is linked in. There’s a new function to deal with each client:
void* handleClientRequestThreadFunction(void* threadArgumentPointer) {
  threadArgumentStruct* threadArgument = (threadArgumentStruct*)(threadArgumentPointer);
  const int transferSocketFile = threadArgument->_transferSocketFile;
  // ...the same data transfer code as the old server's main loop...
  free(threadArgument); // the thread owns this memory
  return NULL;
}
The rest of the function is exactly the same data transfer code that was in the old server’s main loop. The main subtlety here is that we need to pass in the file descriptor for the connection, but thread functions always take a void pointer as input, not the integer we need. Instead, we create a structure that holds all the data we want to pass in, since in the future we’ll need to pass more arguments. This structure can’t live on the stack as a local variable in the calling function, since the main thread may have moved out of that function by the time we get here. So we allocate an area of memory on the heap using malloc() to hold the data, and pass a pointer to it into the function, relying on the client thread to free it.
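That ownership hand-off can be sketched in isolation. The struct name and field match the snippets in this article, but the two helper functions are illustrative assumptions, not part of the server:

```c
#include <stdlib.h>

// Heap-allocated argument block for a worker thread. It can't live on
// the caller's stack, because the caller may return before the thread
// reads it; the thread is responsible for freeing it.
typedef struct {
    int _transferSocketFile;
    // Future fields (e.g. a pointer to shared word frequency data) go here.
} threadArgumentStruct;

// Caller's side: allocate on the heap and hand over ownership.
void* makeThreadArgument(int transferSocketFile) {
    threadArgumentStruct* threadArgument = malloc(sizeof(threadArgumentStruct));
    threadArgument->_transferSocketFile = transferSocketFile;
    return (void*)threadArgument;
}

// Worker's side: unpack the value, then free the block.
int takeSocketAndFree(void* threadArgumentPointer) {
    threadArgumentStruct* threadArgument = (threadArgumentStruct*)threadArgumentPointer;
    int transferSocketFile = threadArgument->_transferSocketFile;
    free(threadArgument);
    return transferSocketFile;
}
```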
You can get away with simpler techniques for basic data types, such as casting an int directly to a void pointer and back, but these will throw up warnings when the sizes of those types don’t match on a 64-bit machine, so the structure is generally a cleaner approach, at the cost of some extra heap allocations.
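For comparison, here’s a sketch of that direct-cast shortcut; going through intptr_t keeps the round trip warning-free on 64-bit systems. The function names are illustrative:

```c
#include <stdint.h>

// Pack an int into the void* a thread function receives, and back again.
// Casting through intptr_t avoids the pointer/int size-mismatch warnings
// you get from a bare (void*)someInt cast on LP64 systems.
void* packIntAsPointer(int value) {
    return (void*)(intptr_t)value;
}

int unpackIntFromPointer(void* pointer) {
    return (int)(intptr_t)pointer;
}
```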
The main function now contains a loop that spawns a thread for every client connection, and then goes back to listening:
do {
  int transferSocketFile = accept(listenSocketFile,
    (struct sockaddr *) &transferAddress,
    &transferAddressLength);
  // It's the thread's responsibility to free this memory
  threadArgumentStruct* threadArgument = malloc(sizeof(threadArgumentStruct));
  threadArgument->_transferSocketFile = transferSocketFile;
  void* threadArgumentPointer = (void*)(threadArgument);
  pthread_t clientThread;
  pthread_create(&clientThread, NULL,
    handleClientRequestThreadFunction, threadArgumentPointer);
} while (1);
This loop does very little work other than setting up the data structure for the function arguments and starting off the thread. pthread_create() is the key function here: it kicks off the execution of the client request handling function. One key thing to understand is that once it’s called, you really don’t know when the function will be executed, or when it’s done. It may even be running at the same time as the main loop if you’re on a multi-processor machine.
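As a sketch of that uncertainty: the only way to know a worker has finished is to synchronize with it explicitly, for example with pthread_join(). The names below are illustrative, not from the server:

```c
#include <pthread.h>

// The worker runs at some unknown point after pthread_create() returns.
static void* doubleValueThreadFunction(void* argumentPointer) {
    int* value = (int*)argumentPointer;
    *value *= 2;
    return NULL;
}

int runWorkerAndWait(int start) {
    int value = start;
    pthread_t worker;
    if (pthread_create(&worker, NULL, doubleValueThreadFunction, &value) != 0)
        return -1;
    // Reading `value` here would be a race: the worker may not have run yet.
    pthread_join(worker, NULL);  // now the worker is guaranteed finished
    return value;
}
```

A server loop like ours, which never joins its workers, would typically call pthread_detach() on each new thread instead, so the thread’s resources are reclaimed when it exits.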
This means we need to be very careful to make sure that different threads don’t step on each other’s toes by altering data that another thread might also be working with. I’ll cover how to handle that sort of synchronization next.