archive-crawler

Group Information

  • Members: 615
  • Category: Cyberculture
  • Founded: Dec 1, 2002
  • Language: English

Activity within 7 days:

5 New Members - 18 New Messages - New Questions

Description

Discussion group for the Heritrix open-source archival web crawler project.

Most Recent Messages

which Crawl Scope should I use?
I am using Heritrix 1.14.3. I just want to use Heritrix to crawl some WSDL documents, but after several attempts, I found it too hard. My problem:
Posted - Wed Nov 25, 2009 10:07 am
zhongkem@...
zhongkem...
Exporting Edited Heritrix 2
Hi all, This is not a "real" Heritrix issue but maybe somebody has an answer to this: I programmed 2 processors for Heritrix 2 and put my classes in some
Posted - Tue Nov 24, 2009 11:16 pm
sendaman69
Re: (subject edited) Recrawling In Heritrix3.0.0-RC1
I see two potential issues in your order: - There's no FetchHistoryProcessor, which is still necessary to collect deduplication-relevant information and insert
Posted - Tue Nov 24, 2009 7:48 pm
Gordon Mohr
gojomo
Re: (subject edited) Recrawling In Heritrix3.0.0-RC1
Sorry about the broken link sent before. http://cs.odu.edu/~pramo_p/crawler-beans.cxml ... From: Pranay Pandey <sspranay@...> Subject: [archive-crawler]
Posted - Tue Nov 24, 2009 3:44 pm
Pranay Pandey
sspranay
Re: (subject edited) Recrawling In Heritrix3.0.0-RC1
Hello Matt and Gordon, Following Gordon's advice and assuming HER-1706 to be fixed, I am using only two of the persistProcessors: load and store. As suggested,
Posted - Tue Nov 24, 2009 3:36 pm
Pranay Pandey
sspranay

Message History

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2009 30 51 42 72 51 38 44 54 62 68 36
2008 72 80 60 72 90 89 39 56 64 63 29 33
2007 132 87 140 213 71 118 86 52 41 70 102 129
2006 126 113 46 54 70 104 140 86 152 119 78 64
2005 138 177 81 62 127 114 46 88 71 76 85 106
2004 56 3 20 62 135 63 168 204 130 72 97 82
2003 14 18 20 15 25 41 14 2 9 30 33
2002 1
Copyright © 2009 Yahoo! Inc. All rights reserved.