pavuk · Pavuk Webgrabber Mailing List
Possible BUG in pavuk 0.9.31
Message #794 of 988
Hello group!
I have run into strange behaviour in pavuk, and I think it is a bug.
I use pavuk 0.9.31:
lomov@theor:~$ pavuk --version
pavuk 0.9.31 2005-01-14T13:56
Optional features available :
- Debug mode
- GNU gettext internationalization of messages
- flock() document locking
- HTTP and FTP over SSL
- SSL layer implemneted with OpenSSL library
- optional regex patterns in -fnrules and -*rpattern options
- POSIX regexp
- support for detecting whether pavuk is running as background job
- multithreading support
- NTLM authorization support
- IPv6 support
I downloaded the site http://www.linuxtopia.org, in particular the
Perl_Programming subdirectory. This subdirectory contains the files
pickingUpPerl_[0-9]*.html. When I tried to view the local copies of these
files, they were not displayed correctly. I opened some of them in a text
editor and found that, beginning at a certain line, the markup symbols (<, >)
had been removed.
This seemed strange to me because I had viewed these files online and
they were fine.
Later I downloaded the site with wget, and all the files were fine.
I suppose that pavuk incorrectly processes the following markup (taken
from the file pickingUpPerl_20.html):

...
(line 81) </td>
(line 82) <td style="bodycolwidth"; vertical-align: top>
(line 83)
(line 84)
(line 85)
(line 86) <BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF"
VLINK="#800080" ALINK="#FF0000">
...

(This file was downloaded with wget.)
The corresponding lines of the file downloaded by pavuk:
...
(line 81) </td>
(line 82) <td style="bodycolwidth"; vertical-align: top
(line 83)
(line 84)
(line 85)
(line 86) BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF"
VLINK="#800080" ALINK="#FF0000"
...

It is obvious that the original file on the site contains a syntax error.
For some reason, pavuk strips all the markup symbols that follow it.
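
For anyone who wants to check their own mirrors for the same symptom, here is a minimal sketch in Python. It is not part of pavuk or wget; the function name first_markup_loss is hypothetical, and the sample lines are copied from the two excerpts above (with the blank lines 83-85 left out). It reports the first line where the suspect copy has fewer '<' or '>' characters than the reference copy.

```python
from itertools import zip_longest


def first_markup_loss(reference_lines, suspect_lines):
    """Return the 1-based index of the first line where the suspect copy
    has fewer '<' or '>' characters than the reference copy, else None.

    Hypothetical helper for illustration only.
    """
    pairs = zip_longest(reference_lines, suspect_lines, fillvalue="")
    for number, (ref, sus) in enumerate(pairs, start=1):
        if sus.count("<") < ref.count("<") or sus.count(">") < ref.count(">"):
            return number
    return None


# Lines taken from the wget copy quoted above.
wget_copy = [
    '</td>',
    '<td style="bodycolwidth"; vertical-align: top>',
    '<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF"',
]

# The same lines as saved by pavuk 0.9.31, quoted above.
pavuk_copy = [
    '</td>',
    '<td style="bodycolwidth"; vertical-align: top',
    'BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF"',
]

# The second sample line (line 82 of the file) is where '>' first goes missing.
print(first_markup_loss(wget_copy, pavuk_copy))  # prints 2
```

To compare whole files, the two lists could be read with open(path).read().splitlines(); the paths depend on where each tool stored its copy.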



P.S. Sorry, if my English is poor.
---
WBR, Vladimir Lomov





Fri Sep 30, 2005 12:39 am

lomov@...

Vladimir Lomov (lomov@...), Sep 30, 2005 7:05 am
Hello group! I have faced with strange behaviour of pavuk and I think that this is a bug. I use pavuk 0.9.31 lomov@theor:~$ pavuk --version pavuk 0.9.31...

Dirk Stoecker (stoeckerd), Sep 30, 2005 7:46 am
Hello, ... [...] ... This bug is fixed in 0.9.33. Ciao --...

Vladimir Lomov (lomov@...), Oct 1, 2005 11:29 am
... Thanks! We will use the pavuk of new (0.9.33) release. P.S. Sorry, if my English is poor. ... wbw, Vladmir Lomov...
