It's a bug. I fixed it at Netli before they fired me. I've got two sets of updates to merge with the mainline tree. I'm sorry I haven't been very responsive lately -- personal issues get in the way.
Marty
-----Original Message-----
From: engerenger [mailto:enger@...]
Sent: Sunday, December 28, 2003 11:45 AM
To: pavuk@yahoogroups.com
Subject: pavuk tries to download from "file:///<various-paths>"
When there is no existing copy of the downloaded files on my machine,
I note that Pavuk's 3rd (or so) download is an attempt to
download the current-working-directory as a URL from the remote machine.
Once a copy of the remote files exists on my local machine, then Pavuk
tries to download _._.HTML from the remote site.
In either case, the remote site doesn't have the requested item,
AND it will probably generate messages in the remote site's logs,
to their dismay.
Is there a command-line flag to stop this behavior?
Is it a bug?
If anyone has insight into this, I'd appreciate a private e-mail.
Thanks very much!
Please see the small snippets below.
Bob
WITH NO LOCAL FILES (see download item #3, i.e. file:///home/enger ):
[enger@localhost tmp]$ pwd
/tmp
[enger@localhost tmp]$ time pavuk -cdir /tmp/work -singlepage -norobots -nthreads 1 http://www.yahoo.com/
URL[ 1]: 1(0) of 1 http://www.yahoo.com/
download: OK
URL[ 1]: 2(0) of 16
http://us.i1.yimg.com/us.yimg.com/i/ww/m6v8y.gif
download: OK
URL[ 1]: 3(0) of 16 file:///tmp
download: ERROR: opening file
URL[ 1]: 4(1) of 16
http://us.i1.yimg.com/us.yimg.com/a/ho/hot_jobs/old54x16tran_tm.gif
download: OK
URL[ 1]: 5(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q3/hdr_nfl.gif
download: OK
[...]
NOW THAT THE LOCAL DIRECTORY HAS SOME CONTENTS, NOTE CHANGE IN #3:
[enger@localhost tmp]$ time pavuk -cdir /tmp/work -singlepage -norobots -nthreads 1 http://www.yahoo.com/
URL[ 1]: 1(0) of 1 http://www.yahoo.com/
File redirect
download: OK
URL[ 1]: 2(0) of 16
http://us.i1.yimg.com/us.yimg.com/i/ww/m6v8y.gif
File redirect
download: OK
URL[ 1]: 3(0) of 16 http://www.yahoo.com/_._.html
download: ERROR: HTTP document not found
URL[ 1]: 4(1) of 16
http://us.i1.yimg.com/us.yimg.com/a/ho/hot_jobs/old54x16tran_tm.gif
File redirect
download: OK
URL[ 1]: 5(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q3/hdr_nfl.gif
File redirect
download: OK
URL[ 1]: 6(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q4/favre_hug.jpg
[...]
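For anyone checking their own runs for the same symptom, here is a minimal Python sketch that scans saved pavuk output for entries whose status is not OK. It assumes the log format shown in the snippets above (a "URL[ n]: ..." line, the URL possibly wrapped onto the next line, then a "download:" status line). The name failed_downloads is made up for illustration and is not part of pavuk.

```python
import re

# Start of a progress entry as it appears in the snippets above, e.g.
# "URL[ 1]: 3(0) of 16 file:///tmp". The URL may be missing here if the
# line was wrapped and the URL landed on the following line.
ENTRY_RE = re.compile(r"^URL\[\s*\d+\]:\s*(\d+)\(\d+\) of \d+\s*(.*)$")
STATUS_RE = re.compile(r"^download:\s*(.*)$")


def failed_downloads(log_text):
    """Return (item_number, url, status) for each entry whose status is not OK."""
    failures = []
    number, url = None, ""
    for raw in log_text.splitlines():
        line = raw.strip()
        entry = ENTRY_RE.match(line)
        status = STATUS_RE.match(line)
        if entry:
            number, url = int(entry.group(1)), entry.group(2).strip()
        elif status and number is not None:
            if status.group(1) != "OK":
                failures.append((number, url, status.group(1)))
            number, url = None, ""
        elif number is not None and not url and line and line != "File redirect":
            # The URL was wrapped onto its own line.
            url = line
    return failures
```

Run against the two snippets above, this should flag item #3 in each case (file:///tmp in the first run, http://www.yahoo.com/_._.html in the second), which makes it easy to see what the remote site is being asked for.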