waterlanguage · Water Language
Message #683 of 743
Re: Parsing html pages

Here's some code using the example you gave:

<!-- Set up web resource -->
<set x=<web "http://citeseer.ist.psu.edu/Kobayashi00information.html"/> />

<!-- Convert to XHTML and Execute content -->
<set y=x.content.<html_to_xhtml />.<execute /> />

<!-- Extract all 'a' anchor tags one per line (takes several
seconds) -->
y.<get_with_value a_key="_parent" a_value=a returns='all' />.
<join <br /> />


With returns='all', the 'get_with_value' method returns all of the
links as a vector, so I can use the links, search them, or whatever.
This example just displays them.

Also, you could chain it all together like this (same code sans
comments):

<web "http://citeseer.ist.psu.edu/Kobayashi00information.html"/>.
content.
<html_to_xhtml />.
<execute />.
<get_with_value a_key="_parent" a_value=a returns='all' />.
<join <br /> />

You could make a method of this and pass a URL as the parameter to
extract links from just about any properly formed web page.

<defmethod extract_links a_url="">
<web a_url />.
content.
<html_to_xhtml />.
<execute />.
<get_with_value a_key="_parent" a_value=a returns='all' />.
<join <br /> />
</defmethod>

<extract_links "http://citeseer.ist.psu.edu/Kobayashi00information.html" />
<extract_links "http://www.mit.edu/" />
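
For comparison only, here is roughly the same idea sketched in Python rather than Water, using just the standard library. This is not Water code and not a translation of Water's API: the names LinkCollector and extract_links_from_html are made up for the sketch, and it collects the href values of the 'a' tags rather than the anchor objects themselves. Unlike the Water version, it does not convert the page to XHTML first.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links_from_html(html_text):
    """Return a list of href strings found in the given HTML text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def extract_links(url):
    """Fetch a page and return its links (assumes the URL is reachable)."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html_text = response.read().decode(charset, errors="replace")
    return extract_links_from_html(html_text)
```

Calling extract_links("http://www.mit.edu/") would return the links as a list, so they can be searched or filtered before being displayed, for example with "\n".join(links).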


_Merrick

--- In waterlanguage@yahoogroups.com, "skramer072" <skramer072@y...> wrote:
>
> I see a lot of examples of how to convert Water's HTML objects into
> strings, but I can't find any examples of how to convert a string of
> HTML into hypertext objects.







Fri Nov 11, 2005 6:13 am

merrick_stemen
Messages in this thread:
skramer072, Nov 10, 2005 1:07 pm
Merrick Stemen (merrick_stemen), Nov 11, 2005 6:14 am
