tdom · tDOM - fast DOM / XPath for Tcl in C
Re: tdom / tcl unicode limit?

Sorry to follow up on this heated thread, but I have now run into the
same problem: "incorrect user input" causing AOLserver to restart.
The problem occurs with invalid numeric character references
(missing closing semicolon). Below is a simple code example
with a real-world forum entry, where tDOM is used with the -html flag
to check whether the input is reasonable HTML code (like a
lightweight tidy).

set html_fragment {
<p>We had some plain html content that had &#147;left and right&#148
double quotes and &#145;left and right&#146 single quotes that were
</p>
}

if {[catch {dom parse -html <body>$html_fragment doc} errorMsg]} {
    # we got an error, so do tdom-free error processing
} else {
    $doc documentElement root
    return [$root asHTML]
}


I would certainly prefer to get a Tcl error from "dom parse" here rather
than a panic() on "$root asHTML".
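
As long as the parser does not reject such input itself, one possible stopgap is to screen the fragment in plain Tcl before it reaches "dom parse -html". The following is only a minimal sketch under stated assumptions: the procs unterminated_charrefs and check_html_fragment are hypothetical helpers, not part of tDOM, and the check covers only the one pattern described above (a numeric character reference without its closing semicolon). It says nothing about other inputs that might trigger the panic.

```tcl
# Hypothetical helper, not part of tDOM: collect numeric character
# references (decimal or hexadecimal) that lack the closing semicolon.
proc unterminated_charrefs {html} {
    set bad {}
    # Match every numeric reference, including the semicolon if present.
    foreach ref [regexp -all -inline {&#(?:[xX][0-9a-fA-F]+|[0-9]+);?} $html] {
        if {[string index $ref end] ne ";"} {
            lappend bad $ref
        }
    }
    return $bad
}

# Hypothetical wrapper: raise an ordinary, catchable Tcl error for
# suspicious input instead of passing it on to the parser.
proc check_html_fragment {html} {
    set bad [unterminated_charrefs $html]
    if {[llength $bad] > 0} {
        error "unterminated character reference(s): [join $bad {, }]"
    }
    return $html
}

set html_fragment {
<p>We had some plain html content that had &#147;left and right&#148
double quotes and &#145;left and right&#146 single quotes that were
</p>
}

if {[catch {check_html_fragment $html_fragment} errorMsg]} {
    puts "rejected: $errorMsg"
} else {
    puts "no unterminated numeric character references found"
}
```

For the forum entry above, the check flags &#148 and &#146, so the fragment would be rejected with a normal Tcl error before any tDOM call is made.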

best regards
-gustaf neumann


--- In tdom@yahoogroups.com, Rolf Ade <rolf@...> wrote:
>
> This is not about a panic while feeding a broken document into tdom. If
> you feed a broken document into tdom (broken in any way: a tag not
> closed, a character that is not allowed in an XML document, an
> undeclared entity, etc.), then you of course get an
> 'ordinary' Tcl error, which you can catch to notify your user:
> that's not XML.
>
> This panic fires if there is strong evidence that something inside
> has gone deeply, intolerably and completely unexpectedly wrong. A
> possible error message would be: "Something inside your software is
> broken!" Not in user input, a document or the like: the software is
> broken.
>
> If you see this panic, then you must look into it. There is no other
> way. Therefore it's a panic.
>
> rolf
>





Tue Jan 6, 2009 11:21 pm

gustafn
