February 9th, 2005

I was trawling through my web stats today, and I noticed a user-agent string that I had never seen before. It simply described itself as “grub crawler”.
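For anyone who wants to spot an unfamiliar user-agent in their own logs, here is a rough sketch of one way to tally them. It assumes the server writes Apache's combined log format, where the user-agent is the last quoted field on each line. The function name `count_user_agents` and the sample log lines are made up for illustration; they are not taken from my actual stats.

```python
import re
from collections import Counter

# Assumption: Apache combined log format, where each line ends with
# "referer" "user-agent". Adjust the pattern for other formats.
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its hit count."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [09/Feb/2005:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "grub crawler"',
    '192.0.2.2 - - [09/Feb/2005:10:00:05 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [09/Feb/2005:10:00:09 +0000] "GET /feed HTTP/1.1" 200 2048 "-" "grub crawler"',
]

if __name__ == "__main__":
    # Print the most frequent user-agents first.
    for agent, hits in count_user_agents(sample).most_common():
        print(f"{hits:5d}  {agent}")
```

To run it against a real log, pass an open file object instead of `sample`; lines that don't match the assumed format are simply skipped.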

Most crawlers include a link back to a site that describes how the crawler operates, but this one didn’t, so I decided to Google it.

The search turned up a lot of information. Some people believe it to be just a harmless crawler, and others believe it to be a comment-spammer trawling for victims.

I’m not too convinced by the comment-spammer theory, but people are entitled to their opinions. Once you read the official description of “Grub”, you can see that this could just be someone running the Grub client.

The idea of Grub, hosted at http://www.grub.org, is to be a distributed web crawling system, designed to keep tabs on every URL, every day. It is kind of like SETI@Home and other distributed processing projects.

However, the people over at Slashdot have other ideas, and have pointed out some of the bad points about Grub. For one, Grub.org may have started as an open source project, but it is now a commercially owned entity, so why donate your bandwidth to a “for-profit” company with nothing in return for your donation? And what happens if your client is given the task of scanning a URL that turns out to host child pornography? Certain police forces (e.g. Scotland Yard, the FBI, etc.) could be alerted, even though in this instance it wasn’t your fault.

So, my take on the project is this:
It started out as a good idea, but as soon as it became part of a commercial company, it became unworthy of my bandwidth. And the risk of unwittingly scanning material that is illegal in your country makes the project too damn dangerous to consider running.

In short: Don’t use it. If you have it installed already, then uninstall it.

