Hi all,
First, thanks for joining the list. Until now, all communication was by direct
email between Stefan and me, but from here on the list should be the place to
capture development-related information.
Second, a quick status update:
1. The release jars are currently in the /release subdirectory of the GitHub
repository.
2. We're at version 0.3.4 as of May 16, 2009.
3. I need to get more rigorous about using git tags, so that each released jar
has a specific revision of the code associated with it.
4. The biggest crawl done with Bixo to date is about 200K URLs, on an EC2
cluster of 5 servers. This coming week I'll be running a 1M URL crawl. Note
that these crawls use whitelisted URLs.