pavuk · Pavuk Webgrabber Mailing List
Modifying Pavuk
Message #753 of 988
I have a question for the group (or the Pavuk authors).

I am considering using Pavuk for a project that would require some modification to how it runs. I have only basic knowledge of C, so I find the source code a bit complex. If someone could point me to a few things, I should be able to take it from there.

I want to spider a large list of websites (1000+) and then process the content.

1) I would like to fetch the list of sites to spider and the various parameters (domains to skip, extensions, depth, etc.) from a database instead of a text file. Where in the code would it be best to set this up so that the appropriate internal variables are set for Pavuk?

2) I want to post-process the HTML that I get (no need to save to disk). I imagine that at some point all the HTML is in a variable that I can easily access and then pass to a few functions of my own.

Anybody feel like helping out a newbie still learning his way around C?

Thanks!

-N





Thu Sep 9, 2004 10:25 pm

noah977
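
For readers following along, the two requests above can be sketched in plain C without touching Pavuk itself. This is only an illustration under stated assumptions: it does not use Pavuk's internal variables or functions, and every name in it (crawl_job, job_from_row, html_buf, html_buf_append, html_buf_finish, page_handler, count_bytes) is hypothetical. It assumes that a database row arrives as an array of four column strings (URL, domains to skip, extensions to skip, depth), and that a fetched page arrives in chunks that can be collected in memory and handed to a function of your own instead of being written to disk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-site settings, one per database row (not a Pavuk type). */
typedef struct {
    char url[256];
    char skip_domains[256];    /* comma-separated list */
    char skip_extensions[128]; /* comma-separated list */
    int  max_depth;
} crawl_job;

/* Copy a column into a fixed-size field; NULL is treated as empty. */
static int copy_field(char *dst, size_t dst_size, const char *src)
{
    size_t n;
    if (src == NULL) src = "";
    n = strlen(src);
    if (n >= dst_size) return -1;   /* would not fit with the terminator */
    memcpy(dst, src, n + 1);
    return 0;
}

/* Fill a job from four column strings: url, skip_domains,
 * skip_extensions, depth. Returns 0 on success, -1 on bad input. */
int job_from_row(crawl_job *job, const char *const cols[])
{
    char *end;
    long depth;

    assert(job != NULL && cols != NULL);
    if (cols[0] == NULL || cols[0][0] == '\0') return -1;
    if (copy_field(job->url, sizeof job->url, cols[0]) != 0 ||
        copy_field(job->skip_domains, sizeof job->skip_domains, cols[1]) != 0 ||
        copy_field(job->skip_extensions, sizeof job->skip_extensions, cols[2]) != 0)
        return -1;
    depth = strtol(cols[3] ? cols[3] : "0", &end, 10);
    if (*end != '\0' || depth < 0 || depth > 1000) return -1;
    job->max_depth = (int)depth;
    return 0;
}

/* Growable in-memory buffer that collects one page's HTML. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} html_buf;

/* Append one received chunk, keeping the buffer NUL-terminated. */
int html_buf_append(html_buf *b, const char *chunk, size_t n)
{
    if (b->len + n + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        char *p;
        while (cap < b->len + n + 1) cap *= 2;
        p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Signature for your own post-processing function. */
typedef void (*page_handler)(const crawl_job *job, const char *html,
                             size_t len, void *user);

/* Hand the complete page to the handler, then release the buffer. */
void html_buf_finish(html_buf *b, const crawl_job *job,
                     page_handler fn, void *user)
{
    if (b->data != NULL && fn != NULL) fn(job, b->data, b->len, user);
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}

/* Example handler: adds the page size to a running total in *user. */
void count_bytes(const crawl_job *job, const char *html,
                 size_t len, void *user)
{
    (void)job;
    (void)html;
    *(size_t *)user += len;
}
```

The idea is to call job_from_row once per row of the query result, call html_buf_append for each chunk of a page as it arrives, and call html_buf_finish when the page is complete. Where these calls would hook into Pavuk's own source is exactly the open question in the message above, and this sketch does not answer it.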

I have a question for the group (or the Pavuk authors.) I am considering using Pavuk for a project that would require a bit of modification on how it runs....
Noah
noah977
Sep 14, 2004
2:16 pm

... From: "Noah" <noah@...> ... various parameters (domains to ... text file. Where in the code ... internal variables are set for Pavuk? ... save...
Paul Slusarz
wiedzmin
Sep 16, 2004
3:08 pm