When there is no existing copy of the downloaded files on my machine,
I see that Pavuk's third (or so) download is an attempt to
fetch the current working directory as a file:// URL.
Once a copy of the remote files exists on my local machine, Pavuk
instead tries to download _._.html from the remote site.
In either case the requested item does not exist, and the _._.html
request will probably generate error entries in the remote site's logs,
to the site operators' dismay.
Is there a command-line flag to stop this behavior?
Is it a bug?
If anyone has insight into this, I'd appreciate a private e-mail.
Thanks very much!
Please see the small snippets below.
Bob
WITH NO LOCAL FILES (see download item #3, i.e. file:///tmp):
[enger@localhost tmp]$ pwd
/tmp
[enger@localhost tmp]$ time pavuk -cdir /tmp/work -singlepage -norobots -nthreads 1 http://www.yahoo.com/
URL[ 1]: 1(0) of 1
http://www.yahoo.com/
download: OK
URL[ 1]: 2(0) of 16
http://us.i1.yimg.com/us.yimg.com/i/ww/m6v8y.gif
download: OK
URL[ 1]: 3(0) of 16
file:///tmp
download: ERROR: opening file
URL[ 1]: 4(1) of 16
http://us.i1.yimg.com/us.yimg.com/a/ho/hot_jobs/old54x16tran_tm.gif
download: OK
URL[ 1]: 5(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q3/hdr_nfl.gif
download: OK
[...]
NOW THAT THE LOCAL DIRECTORY HAS SOME CONTENTS, NOTE CHANGE IN #3:
[enger@localhost tmp]$ time pavuk -cdir /tmp/work -singlepage -norobots -nthreads 1 http://www.yahoo.com/
URL[ 1]: 1(0) of 1
http://www.yahoo.com/
File redirect
download: OK
URL[ 1]: 2(0) of 16
http://us.i1.yimg.com/us.yimg.com/i/ww/m6v8y.gif
File redirect
download: OK
URL[ 1]: 3(0) of 16
http://www.yahoo.com/_._.html
download: ERROR: HTTP document not found
URL[ 1]: 4(1) of 16
http://us.i1.yimg.com/us.yimg.com/a/ho/hot_jobs/old54x16tran_tm.gif
File redirect
download: OK
URL[ 1]: 5(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q3/hdr_nfl.gif
File redirect
download: OK
URL[ 1]: 6(1) of 16
http://us.i1.yimg.com/us.yimg.com/i/mntl/spo/03q4/favre_hug.jpg
[...]
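
For anyone who wants to check their own runs for the same stray requests, here is a minimal sketch that pulls the failed downloads out of captured Pavuk output. It assumes the output looks like the snippets above (a URL line, an optional "File redirect" line, then a "download: ..." status line). The names failed_downloads and URL_LINE are my own and are not part of Pavuk.

```python
import re

# Hypothetical helper, not part of Pavuk. Matches a bare URL line as it
# appears in the pasted output (http://..., file:///..., and so on).
URL_LINE = re.compile(r"^[a-z][a-z0-9+.-]*://\S*$", re.IGNORECASE)
STATUS_PREFIX = "download: "


def failed_downloads(log_text):
    """Return (url, status) pairs for every download reported as an ERROR."""
    failures = []
    current_url = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if URL_LINE.match(line):
            # Remember the most recent URL; its status line follows later.
            current_url = line
        elif line.startswith(STATUS_PREFIX) and current_url is not None:
            status = line[len(STATUS_PREFIX):]
            if status.startswith("ERROR"):
                failures.append((current_url, status))
            current_url = None
    return failures
```

Feeding it the saved output of either run above should list item #3 (the file:// URL in the first run, _._.html in the second) alongside any genuine failures, which makes it easy to see whether a given set of flags still triggers the extra request.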