tagsoup-friends · Friends of TagSoup
Messages 1050 - 1137 of 1386
1050
For the record, there are no GPL-only components in any version of TagSoup. In addition, please note that if someone must have a GPL-licensed version of...
John Cowan
johnwcowan
Apr 2, 2008
5:39 pm
1051
G'day. When I use a URL with a question mark, the server is not happy. XMLReader parser = new org.ccil.cowan.tagsoup.Parser(); // tagsoup parser XPathContext...
ccjoz2002
Apr 6, 2008
1:10 am
1052
Does TagSoup intend to be turning all the various quote characters into the appropriate unicode entities? I ask because I noticed that a page I parsed...
Michael Giles
michael_a_giles
Apr 10, 2008
4:31 pm
1053
... Fortunately, the answer is "none of the above". :-) Unlike browsers, TagSoup does not do automatic encoding detection. (There is a Java library to do so...
John Cowan
johnwcowan
Apr 10, 2008
6:45 pm
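Since message 1053 says TagSoup does not detect encodings automatically, one way to be explicit is to decode the bytes yourself and hand the parser a character stream. The following is a minimal sketch using only the JDK's SAX classes; the class name `ExplicitEncoding` and its helper methods are hypothetical, and the rest of the advice in the truncated message is not known.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of TagSoup.
class ExplicitEncoding {

    // Wrap raw bytes in an InputSource that is decoded with a charset chosen by the caller.
    static InputSource sourceFor(byte[] html, Charset charset) {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(html), charset);
        InputSource source = new InputSource(reader);
        source.setEncoding(charset.name()); // informational once a character stream is set
        return source;
    }

    // Drain the character stream so the effect of the charset choice can be inspected.
    static String readAll(InputSource source) {
        try {
            StringBuilder sb = new StringBuilder();
            Reader r = source.getCharacterStream();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Curly quotes encoded as UTF-8 take three bytes each.
        byte[] bytes = "<p>\u201Cquoted\u201D</p>".getBytes(StandardCharsets.UTF_8);
        String right = readAll(sourceFor(bytes, StandardCharsets.UTF_8));
        String wrong = readAll(sourceFor(bytes, StandardCharsets.ISO_8859_1));
        System.out.println(right.length()); // 15
        System.out.println(wrong.length()); // 19: each quote became three wrong characters
        // With TagSoup on the classpath, the InputSource would be passed to
        // new org.ccil.cowan.tagsoup.Parser().parse(source) instead of being drained here.
    }
}
```

Decoding with the wrong charset does not fail; it silently produces different characters, which is why the choice has to be made explicitly by the caller.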
1054
Thanks for the quick response, John. Having gone through all of this pain five years ago for Furl, I should have known to be more explicit about my encodings....
Michael Giles
michael_a_giles
Apr 10, 2008
8:33 pm
1055
I was wondering if TagSoup will meet my needs. I have a server that is in need of a Parser/Substituter to replace a given URL (either img or a) with another...
dougleeper
Apr 12, 2008
6:22 pm
1056
TagSoup will fix the HTML. That's its job. For your need, any tool that does regex parse/replacement can do the trick...
Bru, Pierre
pbru_2001
Apr 12, 2008
9:42 pm
1057
... No, it isn't. TagSoup doesn't preserve invalid HTML. I'm not sure why you'd want to preserve it, but if that's your requirement, then you should probably...
John Cowan
johnwcowan
Apr 13, 2008
12:26 am
1059
This one has me stumped, and I can't quite track it down. I have some code that does some basic scraping using TagSoup. It works fine with some input, but...
mhelmstetter
Apr 24, 2008
2:33 am
1060
... Are you setting any SAX properties or features? (Xalan might be setting some and you wouldn't know it, unfortunately). Is there any chance that you are...
John Cowan
johnwcowan
Apr 24, 2008
5:05 am
1103
hello i have 2 issues with tagsoup 1.2: 1. i have source of the page that detects javascript in this way: <html><head><noscript><meta http-equiv="refresh" ...
Martin Zdila
m.zdila@...
May 20, 2008
1:12 pm
1104
... That's easily patched. Get the source and edit src/definitions/html.tssl. Then add "<memberOf group='M_HEAD'/>" after the line "<element name='noscript'...
John Cowan
johnwcowan
May 20, 2008
3:35 pm
1108
Is there a way to keep the body of <script> intact? I have HTML that looks like this: ... <script ...> //<![CDATA[ ... if (myvalue && yourvalue){ //]]> ...
terrelldeppe
May 22, 2008
9:41 pm
1109
Looks like the --html option does it, but are there any other side effects I should know about?...
terrelldeppe
May 22, 2008
9:59 pm
1113
... It suppresses close-tags on empty elements -- so <hr>, not <hr></hr> -- and it uses minimized attributes in certain cases, so <input checked>, not <input...
John Cowan
johnwcowan
May 23, 2008
3:07 am
1116
hi john thanks for your reply, it helped me a lot! in addition i had to also add <contains group='M_HEAD'/> to <element name='noscript' type='mixed'>. without...
Martin Zdila
m.zdila@...
May 23, 2008
9:26 am
1117
hello java -jar tagsoup-1.2.jar http://ppe.sk/news.htm you will see many nested <strong> tags which are not on the original page. is it possible to fix that? ...
Martin Zdila
m.zdila@...
May 23, 2008
11:15 am
1119
... Thanks. Quite right. I've added this to the next release. -- In my last lifetime, John Cowan I believed in reincarnation;...
John Cowan
johnwcowan
May 23, 2008
3:44 pm
1120
Hello. After parsing an (X)HTML document I am always getting null from Document.getDoctype(). Is that actually implemented? If not, could you please do that? It...
Martin Zdila
m.zdila@...
May 28, 2008
12:20 pm
1121
sorry, but my DOMBuilder didn't handle that. bad martin, bad martin :-) ... -- Martin Zdila CTO M-Way Solutions Slovakia s.r.o. Letna 27, 040 01 Kosice ...
Martin Zdila
m.zdila@...
May 28, 2008
1:12 pm
1125
... That's a known problem that has to do with tags opened in each of various cells of a table and never closed again. I will fix it in the next release. -- ...
John Cowan
johnwcowan
Jun 1, 2008
7:59 am
1128
Hello Group, I've been using TagSoup with some data for which I do not know the encoding ahead of time and playing around with auto detection of character...
Nitay Joffe
nitay@...
Jun 3, 2008
11:52 pm
1129
Hello. I hope that the intent of TagSoup is to parse ugly HTML into a DOM (XML) so that both look the same when displayed in a modern web browser. It means that...
Martin Zdila
m.zdila@...
Jun 9, 2008
7:54 am
1131
Hello. I found one page with the following structure: <html><head>...</head><noscript><body>...</body></noscript><frameset>...</frameset></html> body was thrown out...
Martin Zdila
m.zdila@...
Jun 9, 2008
8:52 am
1132
... Yes and no. TagSoup does attempt to produce output similar to that of Web browsers, but only within the limits of its design model. It does not contain...
John Cowan
johnwcowan
Jun 9, 2008
2:42 pm
1133
... Thanks. I'll add this to the next release. ... When I get time and energy to work on it enough to release it. ... Not at present. ... It's just me, except...
John Cowan
johnwcowan
Jun 9, 2008
2:47 pm
1134
hello ... What I need is a simple thing ;-) - let SAX generate events: open table, open tr, open td, text "cell1", close td, open span, text "err1", close...
Martin Zdila
m.zdila@...
Jun 9, 2008
3:12 pm
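To see which events a parser actually emits, in the "open table, open tr, ..." style used in message 1134, a small logging ContentHandler can help. This is a rough sketch, not anything from the thread: the class `SaxEventLog` and its methods are hypothetical, and it runs against the JDK's own strict XML parser on well-formed input. Any other XMLReader, such as org.ccil.cowan.tagsoup.Parser, could be passed in instead, though its output for malformed input is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: records SAX events as short readable strings.
class SaxEventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("open " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("close " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one text run in several chunks; each non-blank chunk is logged.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text \"" + text + "\"");
        }
    }

    // Parse the markup with the given reader and return the recorded events.
    static List<String> log(XMLReader reader, String markup) {
        try {
            SaxEventLog handler = new SaxEventLog();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(markup)));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The JDK's default (strict, non-namespace-aware) SAX reader.
    static XMLReader jdkReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : log(jdkReader(), "<table><tr><td>cell1</td></tr></table>")) {
            System.out.println(event);
        }
    }
}
```

Comparing such a log against the event sequence you want makes it easier to tell whether a schema change is doing what you expect.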
1135
... You want to modify html.tssl, not html.stml (which is about the lexer). The simplest change *for this specific problem* is probably to add <contains...
John Cowan
johnwcowan
Jun 9, 2008
5:29 pm
1136
... Like John said, TagSoup operates at a lower level, "below" a DOM. So what you can do is to use a tree model such as XOM, and do additional fixing _you_...
Tatu Saloranta
cowtowncoder
Jun 9, 2008
7:14 pm
1137
Hello Tatu, thanks for the reaction ... I am actually using Xerces to build a DOM from TagSoup and Xalan for XPath processing, transformation and serialization....
Martin Zdila
m.zdila@...
Jun 9, 2008
8:14 pm
Copyright © 2009 Yahoo! Inc. All rights reserved.