pavuk · Pavuk Webgrabber Mailing List
A problem w/ link rewriting
Message #688 of 988
Hi!

I noticed that pavuk behaves slightly inconveniently when rewriting links
inside a page. Suppose you have a CGI/ASP-generated page that
has links to itself:

The URL is

http://localhost/test_html.cgi?some=query

and the page generated for this request looks like this:

<HTML><HEAD><TITLE>Test index page 1</TITLE></HEAD>
<BODY>
<A NAME="top">
<H1>Test index page 1</H1>
<A HREF="http://localhost/test_html.cgi?some=query#top"> - a
page3 in dir1</A>
</BODY>
</HTML>
-----------------------------------------------------
The link in the page isn't rewritten.

I realise that this behaviour is /logically defensible/. Technically
speaking, the page should be re-requested each time, because it may be
truly dynamic, but [unfortunately] most "real-world" browsers don't do that
(I have tested Mozilla 1.2b, Gecko/20021022, Linux).
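
To make the expected behaviour concrete, here is a minimal sketch of one possible rewriting rule, written in Python for illustration only. It is not pavuk's code, and the function name `rewrite_same_document_link` is made up. The assumption is that a link which resolves to the page's own URL and differs only by its fragment can be replaced with the bare fragment, which is how browsers treat such a link when they jump within the already loaded page instead of re-requesting it.

```python
from urllib.parse import urldefrag, urljoin


def rewrite_same_document_link(page_url: str, href: str) -> str:
    """Return a fragment-only link when href points back into page_url.

    Hypothetical helper for illustration; not part of pavuk.
    """
    # Resolve the link against the page URL, then split off the fragment.
    target, fragment = urldefrag(urljoin(page_url, href))
    page, _ = urldefrag(page_url)
    # Same document and a non-empty fragment: keep only the fragment.
    if fragment and target == page:
        return "#" + fragment
    # Anything else is left untouched by this sketch.
    return href


page_url = "http://localhost/test_html.cgi?some=query"
href = "http://localhost/test_html.cgi?some=query#top"
print(rewrite_same_document_link(page_url, href))  # prints "#top"
```

With this rule the saved copy of the page would contain `HREF="#top"`, which works locally no matter what file name the mirrored page ends up with. Links to other documents pass through unchanged and would still need the normal rewriting.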

Please advise.

I have prepared a test CGI that demonstrates the problem. It is simple enough
that anyone can adapt it to their own environment.




Wed Nov 20, 2002 7:57 am

morozov@...

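#!/bin/sh
# Test CGI: emits a page whose links point back to the page itself.

DATE="$(date)"
# CGI response header, then a blank line before the body.
echo "Content-Type: text/html"
echo
# Unquoted heredoc delimiter, so ${DATE} and ${QUERY_STRING} are expanded.
cat <<EOF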
<HTML><HEAD><TITLE>Test index page 1</TITLE></HEAD>
<BODY>
<a name="top">
<H1>Test index page 1</H1>
Date = ${DATE}
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<A HREF="/test/dir1/test_html.cgi?${QUERY_STRING}"> - a page3 in dir1</a>
<A HREF="/test/dir1/test_html.cgi?${QUERY_STRING}#top"> - a page3 in dir1</a>
<A HREF="http://localhost/test/dir1/test_html.cgi?${QUERY_STRING}#top"> - a
page3 in dir1</a>
<A HREF="test_html.cgi?${QUERY_STRING}#top"> - a page3 in dir1</a>
<br>
<br>
<br>
</BODY>
</HTML>
EOF
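
The four link forms in the test CGI can be checked against the page's own URL with a short script. The sketch below (Python, illustrative only) assumes the script is requested as `http://localhost/test/dir1/test_html.cgi?some=query`; the host and path are taken from the links in the script and the query string from the example in the message, but that combination is an assumption. It resolves each `HREF` against that URL and reports whether it names the same document and which fragment it carries.

```python
from urllib.parse import urldefrag, urljoin

# Assumed request URL for the test CGI (not stated in the original post).
page_url = "http://localhost/test/dir1/test_html.cgi?some=query"
query = "some=query"  # stands in for ${QUERY_STRING}

# The four HREF forms emitted by the script, in order.
hrefs = [
    f"/test/dir1/test_html.cgi?{query}",
    f"/test/dir1/test_html.cgi?{query}#top",
    f"http://localhost/test/dir1/test_html.cgi?{query}#top",
    f"test_html.cgi?{query}#top",
]


def classify(page_url: str, href: str) -> tuple[bool, str]:
    """Return (points at the same document?, fragment) for one link."""
    target, fragment = urldefrag(urljoin(page_url, href))
    same_document = target == urldefrag(page_url).url
    return same_document, fragment


results = [classify(page_url, h) for h in hrefs]
for href, (same, fragment) in zip(hrefs, results):
    print(f"{href!r}: same document={same}, fragment={fragment!r}")
```

Under that assumption all four forms resolve to the requested URL itself; the first has no fragment and the other three carry `top`.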


Alexey Morozov