As I work to make my data import process more reliable, one pattern that was recommended to me is to have each import process append its results to a log file, and then have a separate database load process that watches the file and updates the database as it spots new content.
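To make the writer half of that pattern concrete, here is a minimal sketch of an import process appending one result per line. The function name `append_import_result` and the JSON-per-line format are my own assumptions for illustration, not something the pattern requires.

```php
<?php
// Hypothetical writer side of the pattern: append one import result to a
// shared log file. The name and the JSON-per-line format are assumptions.
function append_import_result($logfilename, array $result)
{
    // One record per line, so the loader can consume the file line by line.
    // json_encode() escapes embedded newlines, so a record never spans lines.
    $line = json_encode($result) . "\n";

    // FILE_APPEND adds to the end of the file; LOCK_EX takes an exclusive
    // lock so concurrent writers don't interleave their records.
    $byteswritten = file_put_contents($logfilename, $line, FILE_APPEND | LOCK_EX);

    // Report success only if the whole line made it to the file
    return $byteswritten === strlen($line);
}
```

Each importer would call this once per finished record, leaving the loader free to pick the lines up whenever it next checks the file.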
This potentially solves a lot of problems for me, but it's not at all obvious how to implement file tailing in PHP, so I built a standalone example to test some approaches. Since I've always wished there was a better way than ssh-ing into my servers to view a live version of the Apache error logs, I made the example a long-running PHP process that outputs updates from the logs to your browser. You can see it running here:
http://web.mailana.com/labs/logviewer/
The code is downloadable as logviewer.zip, and it's also included below. Be warned: before you use this on a production server, make sure it's password-protected, since sensitive information like passwords or credit-card numbers might inadvertently leak into your error messages!
If you're looking for a Perl version, you might want to look at the File::Tail module.
<?php
//! This function never returns! It only stops by calling die() once the file
//! has been missing for several retries in a row.
//! It sits on the specified file, calling the process function for each new line of
//! input as it's appended by some other process. The cookie value lets you pass in
//! an opaque object to be used by the process function callback.
function tail_and_process_file($filename, $processfunction, $cookie=null)
{
    $retrycount = 0;
    while ($retrycount<5)
    {
        // Without this, the results of file_exists() may be cached
        clearstatcache();
        if (file_exists($filename))
        {
            $retrycount = 0;
            $filehandle = fopen($filename, 'r');
            if (!$filehandle)
                die("tail_and_process_file($filename): The input file exists but I couldn't access it");

            $lastmodifiedtime = 0;
            $lastfilesize = 0;
            $lastfileposition = 0;
            while (file_exists($filename))
            {
                // Appending changes the modification time and size, but not
                // necessarily the access time, so watch those two instead
                $currentmodifiedtime = filemtime($filename);
                $currentfilesize = filesize($filename);
                if ($currentmodifiedtime!=$lastmodifiedtime || $currentfilesize!=$lastfilesize)
                {
                    if ($currentfilesize<$lastfilesize)
                    {
                        // The file shrank, so it was truncated or replaced: start over
                        fclose($filehandle);
                        $filehandle = fopen($filename, 'r');
                        $lastfileposition = 0;
                    }
                    fseek($filehandle, $lastfileposition);
                    while (!feof($filehandle))
                    {
                        $currentline = fgets($filehandle);
                        if ($currentline!==false && $currentline!=='')
                            $processfunction($currentline, $cookie);
                    }
                    $lastmodifiedtime = $currentmodifiedtime;
                    $lastfilesize = $currentfilesize;
                    $lastfileposition = ftell($filehandle);
                }
                // Without this, the results of filemtime() and filesize() may be cached
                clearstatcache();
                sleep(1);
            }
            fclose($filehandle);
        }
        // The file no longer exists, so wait a progressively longer interval
        // and try it again. After too many retries, die with an error
        $retrycount += 1;
        sleep($retrycount*15);
    }
    die("tail_and_process_file($filename): The input file was not present after multiple retries");
}
?>
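Because the function above loops forever, it's awkward to experiment with. As a rough sketch of the same idea in a form you can call repeatedly, here is a single polling pass that reads only what has been appended since a saved byte offset. Both `process_appended_lines` and `print_line_as_html` are hypothetical names I've made up for this illustration; they aren't part of logviewer.zip, and the callback is only a guess at what a browser-based viewer might do with each line.

```php
<?php
// Hypothetical helper: one polling pass over the file. Hands every line
// appended since byte offset $position to $processfunction, then returns
// the offset to resume from on the next pass.
function process_appended_lines($filename, $position, $processfunction, $cookie = null)
{
    clearstatcache();              // stat results such as filesize() are cached
    if (!file_exists($filename))
        return $position;          // nothing to read yet, keep the old offset

    if (filesize($filename) < $position)
        $position = 0;             // file shrank: assume truncation or rotation

    $filehandle = fopen($filename, 'r');
    if (!$filehandle)
        return $position;

    fseek($filehandle, $position);
    // fgets() returns false once there is nothing more to read
    while (($currentline = fgets($filehandle)) !== false)
        $processfunction($currentline, $cookie);

    $position = ftell($filehandle);
    fclose($filehandle);
    return $position;
}

// Example callback for a browser-based viewer: escape the line so that log
// content can't inject markup, then push it out to the client immediately.
function print_line_as_html($line, $cookie)
{
    echo htmlspecialchars(rtrim($line, "\r\n")), "<br/>\n";
    flush();
}
```

Calling `process_appended_lines()` in a loop with a `sleep(1)` between passes, feeding each returned offset back in, gives roughly the same behavior as the long-running version, with the caller deciding when to stop. One limitation of this sketch: a file that is replaced by one at least as large as the saved offset won't be detected as rotated.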