----- Original Message -----
From: "Noah" <noah@...>
: 1) I would like to fetch the list of sites to spider and the
: various parameters (domains to skip, extensions, depth, etc.)
: from a database instead of a text file. Where in the code
: would it be best to set this up so that the appropriate
: internal variables are set for Pavuk?
:
: 2) I want to post-process the HTML that I get (no need to
: save to disk). I imagine that at some point all the HTML is
: in a variable that I can easily access and then pass to a
: few functions of my own.
Perhaps there's no need to modify the program at all, but
instead focus on using the existing features to their fullest
potential. You can dump the SQL query results to a file. Pavuk
will also save file descriptors for you, just for the purpose
of further processing with another tool.
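
As a rough sketch of the first step, assuming the query results are
already in memory as an array of strings, something like this would
write them out one URL per line. The function name and its arguments
are made up for illustration; nothing here is part of Pavuk, and the
database access itself is left out.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Pavuk: write one URL per line to
 * `out`, skipping NULL or empty entries (e.g. blank database rows).
 * Returns the number of lines written, or -1 on a write error. */
int write_url_list(FILE *out, const char *const *urls, size_t count)
{
    int written = 0;

    for (size_t i = 0; i < count; i++) {
        if (urls[i] == NULL || urls[i][0] == '\0')
            continue;               /* nothing usable in this row */
        if (fprintf(out, "%s\n", urls[i]) < 0)
            return -1;              /* report the I/O failure */
        written++;
    }
    return written;
}
```

Open the output with fopen(), pass it in along with the rows from
your query, and fclose() it afterwards. The resulting file can then
stand in for the hand-maintained text file.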
: Anybody feel like helping out a newbie still learning his way
: around C??
Nobody wants to help newbies, other than to encourage them to
keep trying. Get the latest distro (or better, a CVS snapshot),
compile it and start tinkering with it. Use a C reference manual,
grep, gdb, and a syntax-highlighting editor, and you can't go
wrong, except to run out of motivation.
Paul