If you want to know which search terms are most likely to lead people to your site, I’ve uploaded a PHP library that creates search clouds from your logs. To use it, include searchcloud.php and call create_search_cloud(), passing in the location of your log file, the name of your site, the number of tags to produce, and the minimum and maximum font sizes as percentages. It returns a string containing the HTML for the cloud. Here’s an example:
echo create_search_cloud("visitlogs_petewarden.txt", "petewarden.com", 50, 50, 250);
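To give a feel for what the min/max font size arguments control, here is a minimal sketch of how term counts could be mapped onto that percentage range. It is not the code in searchcloud.php: the function name sketch_cloud_html(), the linear scaling, and the sample counts are all assumptions made for illustration.

```php
<?php
// Hypothetical helper, NOT part of searchcloud.php. It assumes a simple
// linear mapping from term counts to font sizes between the two percentages.
function sketch_cloud_html(array $term_counts, int $min_percent, int $max_percent): string
{
    if (count($term_counts) === 0) {
        return '';
    }
    $lowest = min($term_counts);
    $highest = max($term_counts);
    // Avoid dividing by zero when every term has the same count.
    $range = max(1, $highest - $lowest);

    $html = '';
    foreach ($term_counts as $term => $count) {
        // 0.0 for the rarest term, 1.0 for the most frequent one.
        $fraction = ($count - $lowest) / $range;
        $size = (int) round($min_percent + $fraction * ($max_percent - $min_percent));
        $html .= '<span style="font-size: ' . $size . '%">'
            . htmlspecialchars((string) $term) . '</span> ';
    }
    return rtrim($html);
}

// Sample counts are invented for illustration only.
echo sketch_cloud_html(['after effects' => 120, 'plugins' => 60, 'install' => 20], 50, 250);
```

With a 50 to 250 range, the most frequent term is drawn at 250%, the least frequent at 50%, and everything else falls in between in proportion to its count.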
You can see it working on this example page based on statistics from my old open-source image processing site, which I’ve also included with the library for testing purposes.
Based on the examples I’ve tried, my hypothesis that the most frequent search terms are a good approximation of the meaning of the site holds up. If you take the top 8 terms from the petewarden.com cloud, you get "after effects", "plugins", "effects", "after", "how to", "install", "how to install", "petes plugins". Four of them would make good tags or taxonomy categories for the content, and on inspection, more sophisticated rejection of duplicates and stop words would increase that ratio. I’ll be interested to hear how this works on some of your sites.
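One possible shape for that duplicate and stop-word rejection is sketched below. This is a rough illustration rather than anything the library does today: the function name sketch_filter_terms(), the tiny stop-word list, and the rule of dropping a term whose words are all covered by a higher-ranked term are assumptions. It also assumes the terms are passed in ranked from most to least frequent.

```php
<?php
// Hypothetical helper, NOT part of searchcloud.php. Takes terms ranked from
// most to least frequent and drops stop-word-only terms and near-duplicates.
function sketch_filter_terms(array $ranked_terms, array $stop_words): array
{
    $kept = [];
    $seen_words = [];
    foreach ($ranked_terms as $term) {
        $words = preg_split('/\s+/', strtolower(trim($term)), -1, PREG_SPLIT_NO_EMPTY);
        // Ignore stop words when deciding whether a term adds anything new.
        $content_words = array_diff($words, $stop_words);
        // Skip terms that are all stop words, or whose remaining words were
        // all covered by a higher-ranked term that was already kept.
        if (count(array_diff($content_words, $seen_words)) === 0) {
            continue;
        }
        $kept[] = $term;
        $seen_words = array_merge($seen_words, $content_words);
    }
    return $kept;
}

// The top 8 terms quoted above, assumed to be in rank order.
$top_terms = ['after effects', 'plugins', 'effects', 'after', 'how to',
    'install', 'how to install', 'petes plugins'];
print_r(sketch_filter_terms($top_terms, ['how', 'to', 'the', 'a', 'of']));
```

Under these assumptions, "effects" and "after" are dropped as fragments of "after effects", "how to" is dropped as pure stop words, and "how to install" is dropped because "install" already covers it. A rule this blunt will misfire when the shorter term is the more meaningful one, so treat it as a starting point.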