tagsoup-friends · Friends of TagSoup
Messages 550 - 593 of 1386
550
That would be helpful to Maven users. I can help with any questions you have on the process. Here's the basic FAQ with links to the information you'll need: ...
Stephen Duncan Jr
scdjr42
Jul 5, 2006
12:13 pm
551
... On the whole I'd rather have someone else do this (you, maybe?). I don't expect a whole lot of new releases, only bug fixes from here on out. -- John Cowan...
John Cowan
johnwcowan
Jul 5, 2006
5:37 pm
553
I want &nbsp; to be translated to 0x20 rather than 0xa0. Do I simply comment out... <entity name='nbsp' codepoint='00A0'/> ... in html.tssl for this and...
Rob Staveley (Tom)
tom_staveley
Jul 13, 2006
10:11 pm
554
... The problem is that U+0020 is very different from U+00A0. The SAX API does allow you to intercept entities using the lexical handler. You should be able to...
David Pashley
david@...
Jul 13, 2006
11:43 pm
555
... If you do that, it will return "&nbsp;" (that is, the same as &amp;nbsp; would). ... That will work, but it's IMHO better to use Java operations on the...
John Cowan
johnwcowan
Jul 14, 2006
2:48 am
556
There are probably some other ugly non-entity characters that I ought to clean too, like Microsoft smart quotes. I'll put some substitution into my SAX...
Rob Staveley
tom_staveley
Jul 14, 2006
8:36 am
557
... In my experience of writing SAX parsers, the entities will have been expanded by the time the characters() method is called. Unknown entities would throw a...
David Pashley
david@...
Jul 14, 2006
9:55 am
558
... The first point is quite true: however, I assume that Rob wants to clean all NBSPs, not merely those specified using an entity. TagSoup never throws a...
John Cowan
johnwcowan
Jul 14, 2006
1:26 pm
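The approach discussed in messages 553 to 558 (cleaning every NBSP in the character data, whether or not it came from an entity, rather than editing html.tssl) could be sketched as a SAX filter. This is only an illustration under stated assumptions: the class name NbspToSpaceFilter and its clean() helper are hypothetical and do not come from the thread, and only standard JDK SAX classes are used.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: rewrites U+00A0 (NBSP) to U+0020 in character data. */
final class NbspToSpaceFilter extends XMLFilterImpl {

    /** Replaces every NBSP in the given string with an ordinary space. */
    static String clean(String text) {
        return text.replace('\u00A0', ' ');
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Build a cleaned copy so the parser's own buffer is never modified.
        char[] cleaned = clean(new String(ch, start, length)).toCharArray();
        // Forward the cleaned text to the downstream ContentHandler, if any.
        super.characters(cleaned, 0, cleaned.length);
    }
}
```

In use, such a filter would sit between the parser and the application's ContentHandler (for example via setParent() and setContentHandler()). The same characters() override is a natural place for the other substitutions mentioned in message 556.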
562
I've loaded a page and used SAX2DOM to create a DOM tree. I then used XPathAPI.selectSingleNode to get a starting point and traversed the subtree. Curiously,...
kookaburratech
Aug 12, 2006
12:23 am
563
... Not without something to work from. I need the input page and some information on what XPaths returned what, or a dump of the DOM generated by SAX2DOM as...
John Cowan
johnwcowan
Aug 12, 2006
1:46 am
564
An example source page is - http://finance.yahoo.com/q/cq?d=v1&s=C%2cBCS Examining the table starting at- node = XPathAPI.selectSingleNode(doc, ...
kookaburratech
Aug 13, 2006
8:06 pm
567
Hi, I would like to make TagSoup blind to user tags. For example, in the following HTML, I would like it to simply ignore (i.e., be blind to) the <tag> tag in the...
Bru, Pierre
pbru_2001
Aug 29, 2006
3:11 pm
569
... on out. ... And these many months later I finally remember this conversation and do something about it: http://jira.codehaus.org/browse/MAVENUPLOAD-1127 ...
Stephen Duncan Jr
scdjr42
Sep 13, 2006
7:32 pm
570
http://repo1.maven.org/maven2/org/ccil/cowan/tagsoup/tagsoup/1.0.1/...
Stephen Duncan Jr
scdjr42
Sep 17, 2006
12:44 pm
571
... Thanks. ... -- Do I contradict myself? John Cowan Very well then, I contradict myself. cowan@... I am large, I...
John Cowan
johnwcowan
Sep 17, 2006
7:04 pm
573
Hello, I'm trying to make the handling of < characters more forgiving. By default a < surrounded by space seems to get converted to a &lt; which is good. But...
Nick H
crocochicken74
Oct 6, 2006
3:56 pm
574
... Just to explain this output, I'm pretty much just outputting XML as it comes through, so basically TagSoup is interpreting <- as the start of a tag called...
Nick H
crocochicken74
Oct 6, 2006
4:08 pm
575
In the end I just decided to do a simple find-and-replace on the input before it even goes into TagSoup... does the trick....
Nick H
crocochicken74
Oct 9, 2006
8:52 am
576
... Fair enough. It's really, really hard for the code to decide which uses of < are plausible tags or other things and which are not, since it proceeds like...
John Cowan
johnwcowan
Oct 9, 2006
1:47 pm
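The workaround from message 575 (a find-and-replace on the input before it reaches TagSoup) might look roughly like the sketch below. The exact replacement Nick used is not shown in the thread, so the rule here is an assumption: escape any < that is not followed by a letter, /, ! or ?. The class name StrayLtEscaper is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical pre-processor: escapes '<' characters that cannot begin markup. */
final class StrayLtEscaper {

    // Assumed rule: a '<' not followed by a letter, '/', '!' or '?' is treated
    // as literal text. A '<' at the very end of the input is escaped as well.
    private static final Pattern STRAY_LT = Pattern.compile("<(?![A-Za-z/!?])");

    /** Returns the input with stray '<' characters rewritten as "&lt;". */
    static String escape(String html) {
        return STRAY_LT.matcher(html).replaceAll("&lt;");
    }
}
```

A textual rule like this knows nothing about context, so it would also rewrite a < inside script content or CDATA sections, which fits the point in message 576 that it is hard to tell plausible tags from other uses of <.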
577
Is "-" a legal first character for a tag name? Doesn't TS try to convert it to a regular first character by inserting a "_" in front of it? Pierre...
Bru, Pierre
pbru_2001
Oct 9, 2006
9:19 pm
578
... No, not in XML (it is legal after the first character, though) ... I guess so, since underscore is legal as the first name character. On the other hand, all HTML tags...
Tatu Saloranta
cowtowncoder
Oct 9, 2006
11:05 pm
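The idea raised in messages 577 and 578 (prefix an underscore when a name begins with a character that XML does not allow in first position, such as "-") can be illustrated as follows. The thread does not confirm that TagSoup works this way, so this is a sketch, not TagSoup's code. The class name NameFixer is hypothetical, and the check is deliberately limited to ASCII letters and underscore, which is narrower than the full XML name-start rule.

```java
/** Hypothetical helper: makes a tag name start with an XML-legal character. */
final class NameFixer {

    /** ASCII-only simplification of the XML name-start rule (assumption). */
    static boolean isAsciiNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    /** Prefixes "_" when the first character cannot start an XML name. */
    static String fix(String name) {
        if (name.isEmpty()) {
            return "_"; // an empty name has no legal start, so supply one
        }
        return isAsciiNameStart(name.charAt(0)) ? name : "_" + name;
    }
}
```

Under these assumptions, fix("-foo") gives "_-foo", which is a legal name because "-" is allowed after the first character.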
581
Hi - we're using TagSoup happily with the Xalan XSLT replacement, and we're wondering about the bug that makes the default version not work correctly... Is it...
chconnor
Oct 22, 2006
8:09 pm
582
... The bug is about building, not about using; TagSoup doesn't do any XSLT at run time. As for why the XSLT building transform doesn't work with the default...
John Cowan
johnwcowan
Oct 23, 2006
1:29 am
584
Hello, I just started using TagSoup, so I don't know if this is normal behavior, a bug, or wrong arguments. I'm using the version tagsoup-1.0.1.jar and here is...
Andras Balogh
andraska23
Nov 6, 2006
8:37 pm
585
... I admit that's not very good, but it's not clear what general method would be better. Currently TagSoup assumes that "0 CELLPADDING=" is the value of the...
John Cowan
johnwcowan
Nov 6, 2006
9:45 pm
589
Hi there! We are using TagSoup for our Web crawler, and we found that for the page at http://www.borngayprocon.org/ TagSoup considers <!-[if IE]> a comment, and ...
Eugeny N Dzhurinsky
bofh@...
Dec 7, 2006
9:10 am
590
Hi! I have recently come across TagSoup and want to see whether I can use it instead of JTidy. I need to be able to clean up HTML documents in a wide range of ...
Jaran Nilsen
jaranmann
Dec 7, 2006
12:37 pm
591
... That is because TagSoup does not know which characters can be safely written to which encodings, so it plays safe and uses character references for all...
John Cowan
johnwcowan
Dec 7, 2006
1:22 pm
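The "play safe" output strategy described in message 591 can be illustrated with a small sketch. The preview is truncated, so the exact set of characters TagSoup escapes is not stated here; the sketch assumes it means everything outside 7-bit ASCII. The class name AsciiSafeText is hypothetical and this is not TagSoup's own writer code.

```java
/** Hypothetical helper: writes non-ASCII characters as numeric character references. */
final class AsciiSafeText {

    /** Returns text that is safe in any ASCII-compatible output encoding. */
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            // Read whole code points so surrogate pairs become one reference.
            int cp = s.codePointAt(i);
            if (cp < 0x80) {
                sb.append((char) cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

The sketch leaves markup characters such as < and & untouched; a real serializer would have to escape those separately.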
592
I brought up conditional IE comments a while back. I showed, using some pathological examples of IE conditionals, that it's impossible to produce proper SAX events if...
Klotz, Leigh
leighklotz
Dec 12, 2006
6:54 pm
593
... Quite so. But there is a bug involving comments that lack the second minus sign: <!-foo--> causes TagSoup to malfunction. -- John Cowan cowan@......
John Cowan
johnwcowan
Dec 12, 2006
8:56 pm

Copyright © 2009 Yahoo! Inc. All rights reserved.