pavuk · Pavuk Webgrabber Mailing List
Message #771 of 988
Re: URL Translation problem?


----- Original Message -----
From: "Michal TOMA" <michaltoma@...>

: It seems strange, I agree, but the fact is that in the page
: http://www.notzai.info/modules.php?name=Downloads&d_op=viewdownload&cid=2
: those URLs are everywhere...

I just looked at RFC 1738 and it only allows for % encoding.

: Could you direct me to the portion of code I need to check to
: see what's going on?

Since this is specific to URLs, the best place to do the checking
and translation would be when a URL is being parsed. Look at
url_parse() in url.c. To make your fix generic enough, before
url_parse() does anything, call a function like
url_translate_faulty_html_encoding() that changes the urlstr
if there's an "&" followed by ";". Then submit it as a patch
through pavuk's SourceForge system...
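
A rough sketch of what such a helper might look like, in case it
helps. This is not pavuk's actual code: only the function name comes
from the suggestion above, while the signature, the returned copy
(instead of changing urlstr in place) and the small entity table are
all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (assumed signature, not taken from pavuk's url.c).
 * Returns a newly allocated copy of urlstr in which a few common HTML
 * character entities are replaced by the characters they stand for.
 * The caller frees the result. Returns NULL if allocation fails. */
char *url_translate_faulty_html_encoding(const char *urlstr)
{
    /* Assumed entity list; extend it if other entities show up. */
    static const struct { const char *entity; char ch; } table[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' }, { "&quot;", '"' },
    };
    const size_t n = sizeof table / sizeof table[0];
    /* Every entity is longer than its replacement, so the result
     * never needs more room than the input. */
    char *out = malloc(strlen(urlstr) + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    while (*urlstr != '\0') {
        size_t i;
        for (i = 0; i < n; i++) {
            size_t len = strlen(table[i].entity);
            if (strncmp(urlstr, table[i].entity, len) == 0) {
                *dst++ = table[i].ch;   /* emit the decoded character */
                urlstr += len;          /* skip the whole entity */
                break;
            }
        }
        if (i == n)                     /* no entity here: copy one char */
            *dst++ = *urlstr++;
    }
    *dst = '\0';
    return out;
}
```

The string is decoded in a single pass, so an already clean URL comes
back unchanged and "&amp;amp;" is only decoded once. url_parse() could
call this first and then work on the returned copy.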

: What is strange is that it works in one case and not in
: another one? The first "&amp;" is always translated right...

Yes, that's strange, but could just be a problem with logging.
If this translation happens before url_parse is called then you
can forget what I said above.

Paul

Wed Dec 8, 2004 5:12 pm

wiedzmin
Hello, I'm a new pavuk user, and I have a question about a strange URL translation problem. I'm downloading a site starting from the following URL ...
miso2136, Dec 7, 2004 3:49 pm

Hello, I'm a new pavuk user, and I have a question about a strange URL translation problem. I'm downloading a site starting from the following URL ...
Michal TOMA (michaltoma@...), Dec 7, 2004 3:50 pm

... From: "Michal TOMA" <michaltoma@...> ... href="modules.php?name=Downloads&amp;d_op=getit&amp;lid=5"> ... ...
Paul Slusarz (wiedzmin), Dec 8, 2004 8:49 am

It seems strange, I agree, but the fact is that in the page: http://www.notzai.info/modules.php?name=Downloads&d_op=viewdownload&cid=2 those URLs are...
Michal TOMA (michaltoma@...), Dec 8, 2004 9:15 am

Hello Michal, ... I fixed this error. Please get the CVS version and try it. Or at least copy url.(c|h) to your sources, but I cannot guarantee this to work,...
Dirk Stoecker (stoeckerd), Dec 9, 2004 9:27 am

It works, thanks, just in the version in CVS in the url.c file the function url_decode_str was declared as char *url_decode_str(const char *urlstr, int len) ...
miso2136, Dec 10, 2004 4:13 pm

... From: "Michal TOMA" <michaltoma@...> ... http://www.notzai.info/modules.php?name=Downloads&d_op=viewdownload&cid=2 ... I just looked at RFC 1738 and...
Paul Slusarz (wiedzmin), Dec 9, 2004 9:26 am