Hi *, ... http://www.ifi.lmu.de/~schoefma/howto/run_heritrix_on_windows/heritrix.cmd ...
Maximilian Schoefmann
schoefma@...
Oct 5, 2006 7:57 am
3382
Hi, I am using Heritrix 1.10.1 and Sun Java 1.5.0_06 under the Windows XP operating system. I run Heritrix with c:\Heritrix1.10.1\bin>heritrix, then using the WUI, I...
Hi, ... I've just discovered that the order in which the jar files are loaded was wrong in the current heritrix script for windows. That's fixed in the one I...
Maximilian Schoefmann
schoefma@...
Oct 5, 2006 11:41 am
3385
Hi, I am working on the Windows XP platform. I have Sun Java 1.5.06. I downloaded the Heritrix source. To build it, I downloaded Maven 1.0.2. The folder structure is like this ...
... loaded was ... the one ... your ... Hi, I downloaded the updated Heritrix script for Windows and used it, but I am still getting the same problem. This show...
Hi, ... Did you also copy the "profiles" directory from the heritrix-1.10.1.jar to HERITRIX_HOME\conf? See:...
Maximilian Schoefmann
schoefma@...
Oct 5, 2006 1:30 pm
3388
Just a note of thanks to all who work on Heritrix. I found it easy to get up and crawling. When I did have a problem, I found a post here in the Yahoo group....
Just wanted to point everyone to this article about the crawler become.com implemented in Java. Perhaps some interesting comparisons between Heritrix's and...
Hi! Is it possible to execute 2 or more jobs on Heritrix 1.10.0 at the same time? If so, how could I do this? Thanks everyone! Guilherme - Brazil...
I like the idea of a non-monolithic architecture. It'll be great to attempt a refactor of Heritrix for distributed crawls. ... -- It's fun being a realist.... ...
... 1.10.1.jar to ... crawler/message/2085 ... of ... Max Hi, Thanks for solving "FetalInitializationException". I extracted the "profile" folder from the...
Greetings, I just came across Heritrix and became interested in running it. I'm working on the Windows XP platform and downloaded version heritrix-1.10.1. Installed...
Hi, I noticed on Windows Server 2003 that the owner of my files was set to the group "Administrators" instead of my own user. Java will then cough on the JMX...
Maximilian Schoefmann
schoefma@...
Oct 6, 2006 9:43 am
3395
Hi Guilherme, ... This seems to work in the current Heritrix. In the Web UI, click on "Setup" and browse to "Local instances". You can then create a new...
Maximilian Schoefmann
schoefma@...
Oct 6, 2006 10:00 am
3396
Hi again, ... It really seems like the jars are still loaded in the wrong order. Please update the script again, you seem to have downloaded it before I...
Maximilian Schoefmann
schoefma@...
Oct 6, 2006 10:27 am
3397
Hi, I am using Heritrix 1.10.1 and Sun Java 1.5.0_06 under the Windows XP platform. I run Heritrix with c:\Heritrix1.10.1\bin>heritrix, then using the WUI, I create a...
... Please ... noticed ... Hi Max, what is this operator.journal file, and why is it needed? The exception "StartNextJob" java.lang.NoSuchMethodError is...
... You can use the journal to take personal notes during the crawl. It's normally not needed (unless you take notes) and is just a plain text file. The...
Maximilian Schoefmann
schoefma@...
Oct 6, 2006 11:08 am
3401
... It's ... text file. ... thrown ... operation). ... the Web UI. ... Hi Max, thanks for your reply. OK, now I am presenting the Alerts section of the WUI. I am...
... I personally don't care too much about Heritrix warnings anymore as long as my crawl crawls :-) I'm no expert here and don't know what can trigger this...
Maximilian Schoefmann
schoefma@...
Oct 6, 2006 11:57 am
3403
I wonder if become.com will ever release any of the source code publicly? I doubt it since that would be helping their competitors. You'd think they'd at...
Hi, I am using Heritrix 1.10.1 and Sun Java 1.5.0_06 under the Windows XP operating system. When I created a job and started the crawler to crawl it, I saw job status...
Hi, I am using Heritrix 1.10.1 with Sun Java 1.5.06 under the Windows XP platform. I want to run Heritrix from the command prompt. I run Heritrix using the command prompt....
... I've not seen this one before. Heritrix wants to run a stylesheet (arcMetaheaderBody.xsl) against the XML order file to extract attributes such as...