tagsoup-friends · Friends of TagSoup
Messages

  Messages Help
Advanced
Messages 1211 - 1240 of 1386   Oldest  |  < Older  |  Newer >  |  Newest
Messages: Simplify | Expand   (Group by Topic) Author Sort by Date ^
1211
... I don't know. I built it as a library, and originally added the stand-alone application support for my own testing purposes, but I suspect that many...
John Cowan
johnwcowan
Dec 1, 2008
5:19 pm
1212
I use tagsoup as one step in DeXSS. http://freshmeat.net/projects/dexss/ ... From: tagsoup-friends@yahoogroups.com [mailto:tagsoup-friends@yahoogroups.com] On...
Klotz, Leigh
leighklotz
Dec 1, 2008
5:21 pm
1213
I'm a library user. I used the standalone app for testing, but the projects where I've integrated TagSoup have always been as a library. Is there discussion...
Sujal Shah
sujalnet
Dec 1, 2008
5:25 pm
1214
... Definitely not. TagSoup will remain both a library and an app. -- XQuery Blueberry DOM John Cowan Entity parser dot-com...
John Cowan
johnwcowan
Dec 1, 2008
5:42 pm
1215
I use TagSoup as a library. I use it to transform lists on the web into XML so it can be loaded into a database. I also use it in my main application to...
Leslie Software
lesliesoftware
Dec 2, 2008
11:30 am
1216
... Various "screen-scraping" jobs. E.g., this one which is just for fun: http://www.edavies.nildram.co.uk/#bumps More details at the bottom of this page: ...
Ed Davies
edavies971
Dec 2, 2008
10:56 pm
1217
Greetings. So, I'm using the tagSoup-1.2.jar file as a standalone program which I shell out to. What I'm trying to do here is convert in-the-wild HTML into...
kiru42
Dec 11, 2008
12:03 am
1218
... These are symptoms of specifying the wrong input encoding. You can't specify the input as UTF-8 unless the .html file *really is* encoded in UTF-8, or you...
John Cowan
johnwcowan
Dec 11, 2008
4:07 am
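For readers who have not run into this before, here is a minimal sketch of what a wrong input encoding typically does to text. It is not taken from the thread and does not use TagSoup at all; the class and method names are made up for illustration, and it only assumes a Java 7 or later runtime for `StandardCharsets`. It encodes one accented character as UTF-8 and then decodes the same bytes as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongEncodingDemo {
    // Decode raw bytes with an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Correct: the bytes are read back with the encoding they were written in.
        String right = decode(utf8Bytes, StandardCharsets.UTF_8);

        // Wrong: each byte is treated as its own Latin-1 character, giving two characters.
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8:      length " + right.length());
        System.out.println("decoded as ISO-8859-1: length " + wrong.length());
    }
}
```

The lengths are printed rather than the strings themselves so that the console's own encoding does not muddy the result.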
1219
... Recommendation. ... So, I've tried a variety of combinations of --encoding and --output-encoding parameters. The input html does indeed seem to be utf8...
kiru42
Dec 11, 2008
4:51 am
1220
... So it is. ... $ tagsoup --encoding=utf-8 --output-encoding=utf-8 <index.html >index.xhtml ... TagSoup can't provide that. It interprets all entity and...
John Cowan
johnwcowan
Dec 11, 2008
6:56 am
1221
... http://www.ccil.org/~cowan ... Okay, I'll have to accept that as the tagSoup behavior. However, small update. On Linux, your command line example works...
kiru42
Dec 11, 2008
9:01 am
1222
... Hmm. What versions and patch levels of Java are you using on the two systems? -- John Cowan <cowan@...> http://ccil.org/~cowan Micropayment...
John Cowan
johnwcowan
Dec 11, 2008
4:13 pm
1224
... Just to make sure: did you verify actual output file contents (and similarly for input), or view using an app? I ask this because the most common problem...
Tatu Saloranta
cowtowncoder
Dec 11, 2008
5:53 pm
1225
... two systems? On the Windows machine, it's Java(TM) SE Runtime Environment (build 1.6.0_07-b06), the official JRE from Sun. On Ubuntu, it's OpenJDK Runtime...
kiru42
Dec 11, 2008
6:30 pm
1226
... this platform? ... similarly for input), or view using an app? I ask this because the most common problem reported is usually caused by a viewing app ...
kiru42
Dec 11, 2008
6:33 pm
1227
... just ... with ... Let's try to tackle this from a slightly different angle here. For a moment, let's pretend that I'm a random user who has just discovered...
kiru42
Dec 11, 2008
7:22 pm
1228
... About the only thing I can think of, as a difference, is that the platform-specific default encoding may well differ between a stock Windows system vs....
Tatu Saloranta
cowtowncoder
Dec 12, 2008
6:43 pm
1229
... If you don't specify any encoding switches, you get the platform default. I wonder if the switches aren't being passed properly. -- John Cowan...
John Cowan
johnwcowan
Dec 12, 2008
6:47 pm
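A quick way to see what "the platform default" actually is on a given machine is to ask the JVM directly. The sketch below is an illustration only (the class name is hypothetical and nothing here calls TagSoup); running it on both the Windows and the Linux box would show whether the two defaults differ.

```java
import java.nio.charset.Charset;

public class DefaultEncodingReport {
    // Name of the charset the JVM falls back to when no encoding is given explicitly.
    static String platformDefault() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset:        " + platformDefault());
        // The system property usually mirrors the default and may be null on some runtimes.
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
    }
}
```

Whatever this prints is what applies when no encoding is named explicitly, so a difference between the two machines would be consistent with the behavior described above.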
1231
Hrm, not sure. There is a variance in result output depending on which encoding switches I provide on the command line. I'd say that at least *some* of them...
kiru42
Dec 15, 2008
8:46 pm
1232
... Oh, you mean the literal command line, as in cmd.exe? I didn't realize that -- I thought you were spawning TagSoup from a program. -- You let them out...
John Cowan
johnwcowan
Dec 16, 2008
6:10 am
1233
... http://ccil.org/~cowan ... I have tested with both, there's no behavior difference in tagSoup. Either I'm on a cmd.exe window calling java -jar to tagSoup,...
kiru42
Dec 16, 2008
7:27 pm
1234
... From: kiru42 <urikkiru@...> Subject: [tagsoup-friends] Re: tagSoup v1.2 and utf-8 characters To: tagsoup-friends@yahoogroups.com Date: Tuesday,...
Tatu Saloranta
cowtowncoder
Dec 16, 2008
7:41 pm
1235
... realize ... TagSoup from java is as an embedded library; that is, it's called via its API, not from command-line interface. And that may be the key ...
kiru42
Dec 16, 2008
8:48 pm
1236
... Indeed, which says that the Java code is not at fault. The only thing I can think of is that the names of the available encodings might be different on...
John Cowan
johnwcowan
Dec 16, 2008
9:10 pm
1237
... Yes, but not the same environment, with respect to the platform default encoding. The most likely scenario is that the default encoding in Linux happens to...
Tatu Saloranta
cowtowncoder
Dec 16, 2008
9:40 pm
1238
... Quite so, but in this case both --encoding and --output-encoding were specified as UTF-8. ... It seems that if the encoding name specified is not known,...
John Cowan
johnwcowan
Dec 16, 2008
9:50 pm
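The preview above is cut off before it says what happens with an unknown encoding name, so the following does not try to reproduce TagSoup's behavior. It is only a sketch, with a hypothetical class name, of how one might check in plain Java whether a given JVM recognizes an encoding name before passing it to --encoding or --output-encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class EncodingNameCheck {
    // True if this JVM can resolve the name to a charset; malformed names count as unknown.
    static boolean isKnownEncoding(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Thrown for names containing characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass candidate names on the command line, e.g. utf-8 UTF8 latin1
        for (String name : args) {
            System.out.println(name + " -> " + (isKnownEncoding(name) ? "known" : "unknown"));
        }
    }
}
```

Running the same check on both systems would show whether the set of recognized names really differs between the two Java installations.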
1240
Here's a simplified example of the HTML I'm trying to parse: <p> <span id="data"> <p>important information</p> </span> </p> And here's what I get out of...
mark_renouf
Jan 8, 2009
3:54 am