I struggled a little with this myself!
I have attached a simple Java program that works with 1.2.0.
You can launch a crawl with: java -Xmx128m dk.netarkivet.harvestcontroller.SimpleHeritrixLauncher <orderfile>
It waits for the crawler to finish by attaching a CrawlStatusListener.
Remember to include both heritrix.jar and the jars in $HERITRIX_HOME/lib in your CLASSPATH.
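
For readers without the attachment, here is a minimal sketch of the "attach a listener and block until the crawl ends" pattern. It is not the attached SimpleHeritrixLauncher and it does not call the Heritrix API: CrawlListener, FakeCrawler and runAndWait are hypothetical stand-ins, and the sketch uses java.util.concurrent and lambdas, so it assumes a much newer JVM than the one Heritrix 1.2.0 targeted. The real CrawlStatusListener has its own set of callbacks, so check the Heritrix javadoc before adapting this.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class WaitForCrawlSketch {

    /** Hypothetical stand-in for a crawl status listener; NOT the Heritrix interface. */
    interface CrawlListener {
        void crawlEnded(String exitMessage);
    }

    /** Hypothetical stand-in for the crawler: it works on a background thread. */
    static class FakeCrawler {
        private final List<CrawlListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(CrawlListener listener) {
            listeners.add(listener);
        }

        void start() {
            Thread worker = new Thread(() -> {
                try {
                    Thread.sleep(200); // pretend to crawl for a moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                // Tell every registered listener that the crawl is over.
                for (CrawlListener listener : listeners) {
                    listener.crawlEnded("Finished");
                }
            }, "fake-crawl");
            // Daemon thread: it would not keep the JVM alive on its own,
            // so the caller has to wait explicitly.
            worker.setDaemon(true);
            worker.start();
        }
    }

    /**
     * Starts the crawler and blocks until it reports the end of the crawl.
     * Returns the exit message, or null if the timeout elapsed first.
     */
    static String runAndWait(FakeCrawler crawler, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        String[] exitMessage = new String[1];
        crawler.addListener(message -> {
            exitMessage[0] = message; // written before countDown(), so visible after await()
            done.countDown();
        });
        crawler.start();
        boolean finished = done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return finished ? exitMessage[0] : null;
    }

    public static void main(String[] args) throws InterruptedException {
        String result = runAndWait(new FakeCrawler(), 5000);
        System.out.println("Crawl ended: " + result);
    }
}
```

The latch is only one way to do the waiting; the point is that the launching thread registers its listener before starting the crawl and does not return until the listener has fired.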
Best,
Bjarne Andersen
www.netarchive.dk
stack wrote:
spielc wrote:
>
> Hi everybody!
>
> I'm trying to start Heritrix (version 1.2.0) from another Java
> application. As I don't need the web UI for this app, I started it
> (or rather, tried to start it) using the main method of Heritrix with
> two arguments: Heritrix.main(new
> String[]{"--nowui",orderFile.getAbsolutePath()}); orderFile is a
> java.io.File object pointing to the order file I want to run. It
> starts to crawl but finishes too early (the report files show that
> just 2 files were crawled). When I run Heritrix from the command line
> with --nowui and the same order file, it runs a lot longer and, as far
> as I can tell, more correctly. The little code fragment in the FAQ
> doesn't work either, as launch is a protected static method.
>
> I would be grateful for any assistance I can get!
Do the logs tell you anything about why the crawl runs for a shorter
time? Paste in the crawl.log if it's only two lines, and look in
local-errors and in STDOUT/STDERR for any exceptions. The crawler
should do the same thing in the two contexts.
Yours,
St.Ack
P.S. #launch access is changed in Heritrix HEAD.