tdom · tDOM - fast DOM / XPath for Tcl in C
tdom / tcl unicode limit?
Message #1883 of 1984
Re: [tdom] Re: tdom / tcl unicode limit?



ramsan100 wrote:
> Hello,
>
> I have a program that does the following:
>
> set err [catch { tDOM::xmlReadFile $file encoding } xml]

That doesn't make much sense (read: it is probably plain wrong) if $file
is an HTML file.

> ...
>
> set err [catch { dom parse -html $xml } doc]
>
> ...
>
> set xml [$doc asXML]
>
> ----------------
>
> I try to read an HTML file and it crashes the full program with
> a panic when it reaches the last instruction.
>
> In my opinion, a TCL library like tdom should NEVER crash and panic
> in a situation like this. This makes a full program crash depending
> on the contents of a file!!!

As you wrote correctly at first, it doesn't crash, it panics. That is by
design. A lot of people don't like this design decision, but just
because you don't like it either doesn't mean it isn't
deliberate. I have learned from several discussions about this
topic, though, that most people don't want to hear my arguments.

> If tdom cannot deal with UTF-8 chars of more than 3 bytes, it
> should substitute them with an alternate character (???) and continue.
> Or throw a TCL error. But never crash.

In my book, it is far from being that simple.

To make a long story (that almost nobody wants to hear anyway) short:
you have an encoding problem. That means you have a very elementary,
real, hard, systematic problem.

The panic is there to rub your nose in that with the greatest possible
severity.

> I have used tdom for many years and this has never happened to
> me before. Is it due to a modification of the code, or has the
> problem always been there?

The tdom core, if used properly, seems to be rock-solid, as far as I
can tell. No, what you see isn't due to a modification; it has been
that way for years and years.

rolf
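
[Editor's sketch, not part of the original message.] The suggestion quoted above, throwing a Tcl error instead of panicking when a file holds UTF-8 characters longer than 3 bytes, could be approximated by a check that runs before the data reaches tDOM. The proc name hasNonBmpUtf8 is hypothetical and is not part of tDOM or Tcl. The check assumes the file is meant to be UTF-8 and only looks for byte patterns shaped like 4-byte sequences, so it does not detect other kinds of encoding problems.

```tcl
# Hypothetical helper (not part of tDOM): return 1 if the file contains
# a byte sequence shaped like a 4-byte UTF-8 character, i.e. a character
# outside the BMP, and 0 otherwise.
proc hasNonBmpUtf8 {path} {
    set fh [open $path r]
    # Read raw bytes so no encoding conversion is applied.
    fconfigure $fh -translation binary
    set data [read $fh]
    close $fh
    # Lead byte F0..F4 followed by three continuation bytes 80..BF.
    return [regexp {[\xF0-\xF4][\x80-\xBF][\x80-\xBF][\x80-\xBF]} $data]
}

# Possible use before parsing (the variable name file is taken from
# the quoted code):
#   if {[hasNonBmpUtf8 $file]} {
#       error "file contains characters outside the BMP"
#   }
```

Because the check works on raw bytes, it does not depend on how the Tcl build sets TCL_UTF_MAX. It only reports the condition; whether to reject the file or substitute the characters is left to the application.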




Thu Oct 2, 2008 11:12 am

rolf@...

I recently received this tdom question: [We are using tdom 0.8.0 and our application] was processing an XML document [...] which happened to contain: "𝒜" ...
Larry W. Virden
lvirden
Jul 8, 2008
2:05 pm

... No, a newer tdom would not help, without additional work. It's tcl which limits itself to the BMP, in a default build. See tcl.h: define TCL_UTF_MAX...
Rolf Ade
rolf@...
Jul 8, 2008
3:09 pm

... I was afraid of that. From a practical point of view, considering that tcl is stretching into the 64 bit processor range, etc., what are the practical...
Larry W. Virden
lvirden
Jul 8, 2008
7:02 pm

... You'd better ask the real champs for a more definite and 'official' answer. No, AFAICT the problem is neither that this would make tcl exceptionally slower...
Rolf Ade
rolf@...
Jul 8, 2008
8:10 pm

Hello, I have a program that does the following: set err [catch { tDOM::xmlReadFile $file encoding } xml] ... set err [catch { dom parse -html $xml } doc] ... ...
ramsan100
Oct 2, 2008
10:22 am

... That doesn't make much sense (read: it is probably plain wrong) if $file is an HTML file. ... As you wrote correctly at first, it doesn't crash, it panics. That...
Rolf Ade
rolf@...
Oct 2, 2008
11:15 am

... No, that's not true. I do not have an encoding problem. The user of my program has an encoding problem. As he/she is the one that selects the file, that can...
ramsan100
Oct 2, 2008
3:05 pm

... Except for the exceptions to that rule. Do you find it cooperative to insist that everyone out there has to follow what you think is right? You want a...
Rolf Ade
rolf@...
Oct 2, 2008
4:59 pm

... It's been there for years. It's frustrated and annoyed many people who have tried to use tDOM. Perhaps you will be luckier in trying to get Rolf to fix...
Dossy Shiobara
dossy
Oct 2, 2008
5:57 pm

... Really? How many is "many"? "Have tried" means they have tried and left it lying because of the above bug? Then they switched to what? Zoran...
Vasiljevic Zoran
vungerk
Oct 2, 2008
6:15 pm

Come on, Dossy. Yes, you belong to the ones who have whined about that. And, allow me to say, you also belong to the ones who didn't listen to...
Rolf Ade
rolf@...
Oct 2, 2008
8:33 pm

... Just because my code has a bug in it doesn't mean yours doesn't, too. tDOM is, without question, broken in this regard. Nowhere else in the Tcl core does...
Dossy Shiobara
dossy
Oct 2, 2008
8:38 pm

... Yes, tDOM catches the breaking of a very elementary, basic internal rule in a rude way. At least, you can hardly ignore the notice. And as long as you play...
Rolf Ade
rolf@...
Oct 2, 2008
11:28 pm

... Oh, I absolutely care - but, in a multi-threaded application, having the whole thing blow up is not okay. One thread terminating would be one thing, but...
Dossy Shiobara
dossy
Oct 3, 2008
2:02 pm

... I have to agree. Stopping computation and raising an error would be much more user friendly. By that you could catch it and tell your user - sorry, we...
Martin S. Weber
Ephaeton@...
Oct 3, 2008
2:24 pm

... Just to stress an analogy: Take any libc function you like and in your mind replace "returning -1 (or returning NULL (or returning whatever value that...
Martin S. Weber
Ephaeton@...
Oct 3, 2008
2:34 pm

English isn't my native language; I guess, that's obvious. But normally, I seem to be able to express myself anyway. With the exception of this topic. What I...
Rolf Ade
rolf@...
Oct 4, 2008
6:17 pm

... I still feel, I haven't made my point clear enough. Gimme one more try. This is not about panic while feeding a broken document into tdom. If you feed a...
Rolf Ade
rolf@...
Oct 4, 2008
10:35 pm

... Nowhere in the Tcl core does it blow up with a Tcl_Panic on the same malformed Tcl_Obj. Only tDOM does. Yet, you're convinced your decision is the...
Dossy Shiobara
dossy
Oct 5, 2008
1:54 am

Sorry to follow up on this heated thread, but I have now hit the same problem, with an "incorrect user input" causing aolserver to restart. The problem happens in...
gustafn
Jan 6, 2009
11:21 pm

The earlier part of this thread started with a post or two from me, I think. In my developer's case, they actually got a specific warning message from tDOM...
Larry W. Virden
lvirden
Oct 2, 2008
7:02 pm

Hi Dossy, hi Rolf, while I agree with Dossy that a Tcl_Panic() isn't the nicest thing to do in an extension, it still is a valid response. The Tcl core does...
Michael Schlenker
pitacus
Oct 5, 2008
2:49 pm

Hello, You seem to agree that you can only arrive at this situation by using an incorrect or buggy C extension. This is not true. The following code can lead...
ramsan100
Oct 6, 2008
4:10 pm

... As I already wrote: if $file is in fact an HTML file, it's probably plain wrong to use xmlReadFile. ... Ahhh! Great. That's probably a bug in xmlReadFile....
Rolf Ade
rolf@...
Oct 6, 2008
5:38 pm