ramsan100 wrote:
> Hello,
>
> I have a program that does the following:
>
> set err [catch { tDOM::xmlReadFile $file encoding } xml]
That doesn't make much sense (read: it is probably plain wrong) if $file
is an HTML file.
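
A minimal sketch of the alternative, assuming the file really is HTML and
that its encoding is known (the utf-8 default below is only an assumption
about the input): read it with plain Tcl file I/O and hand the text to
dom parse -html. The proc names readHtmlFile and parseHtmlFile are made up
for illustration; they are not part of tdom.

```tcl
# Hypothetical helper: read a file as text with an explicitly chosen
# encoding. The utf-8 default is an assumption about the input.
proc readHtmlFile {file {encoding utf-8}} {
    set fd [open $file r]
    fconfigure $fd -encoding $encoding
    set html [read $fd]
    close $fd
    return $html
}

# Hypothetical helper: parse the file content with tdom's HTML parser.
# Returns the document command; raises a Tcl error if reading or
# parsing fails.
proc parseHtmlFile {file {encoding utf-8}} {
    package require tdom
    return [dom parse -html [readHtmlFile $file $encoding]]
}
```

It could be called as set err [catch { parseHtmlFile $file } doc]. Note
that catch only traps Tcl errors; it does not intercept a panic.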
> ...
>
> set err [catch { dom parse -html $xml } doc]
>
> ...
>
> set xml [$doc asXML]
>
> ----------------
>
> I try to read an HTML file and it crashes the full program with
> a panic on reaching the last instruction.
>
> In my opinion, a TCL library like tdom should NEVER crash and panic
> in a situation like this. This makes a full program crash depending
> on the contents of a file!!!
As you first write correctly, it doesn't crash, it panics. That is by
design. A lot of people don't like this design decision, but the fact
that you don't like it either doesn't mean it isn't deliberate. I've
learned in several discussions about this topic, though, that most
people don't want to hear my arguments.
> If tdom cannot deal with UTF-8 chars with more than 3 bytes, it
> should substitute them with an alternate character (???) and continue.
> Or throw a TCL error. But never crash.
In my book, it is far from being that simple.
To make a long story (that almost nobody wants to hear anyway) short:
you have an encoding problem. That means you have a very elementary,
real, hard, systematic problem.
The panic is there to rub your nose into that with the greatest
possible severity.
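
One way to notice such input early is to scan the raw bytes of the file
for four-byte UTF-8 sequences (characters outside the BMP) before the data
goes anywhere near the parser. This is only a sketch of a pre-check in the
calling script, not something tdom does; the proc names hasNonBmpUtf8 and
fileHasNonBmpUtf8 are hypothetical, and the check assumes the input is
meant to be UTF-8.

```tcl
# Hypothetical helper: report whether raw UTF-8 data contains a
# four-byte sequence, that is, a character outside the BMP.
proc hasNonBmpUtf8 {bytes} {
    # In a binary string every character is one byte (0..255).
    # Lead bytes F0..F4 followed by three continuation bytes (80..BF)
    # form a four-byte UTF-8 sequence.
    return [regexp {[\xF0-\xF4][\x80-\xBF]{3}} $bytes]
}

# Hypothetical helper: run the same check on a file read in binary mode,
# so that no encoding conversion happens before the scan.
proc fileHasNonBmpUtf8 {file} {
    set fd [open $file r]
    fconfigure $fd -translation binary
    set bytes [read $fd]
    close $fd
    return [hasNonBmpUtf8 $bytes]
}
```

With this, the script can decide for itself what to tell the user, for
example if {[fileHasNonBmpUtf8 $file]} { error "file contains characters
outside the BMP" }, instead of finding out later.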
> I have used tdom for many years and it never happened before to
> me. Is it due to a modification of the code, or has the problem
> always been there?
The tdom core, if used properly, seems to be rock-solid, as far as I
can tell. No, what you see isn't due to a modification; that is the
way it has been for years and years.
I recently received this tdom question: [We are using tdom 0.8.0 and our application] was processing an XML document [...] which happened to contain: "𝒜" ...
... No, a newer tdom would not help, without additional work. It's tcl which limits itself to the BMP, in a default build. See tcl.h: define TCL_UTF_MAX...
Rolf Ade
rolf@...
Jul 8, 2008 3:09 pm
... I was afraid of that. From a practical point of view, considering that tcl is stretching into the 64 bit processor range, etc., what are the practical...
... You'd better ask the real champs for a more definite and 'official' answer. No, AFAICT the problem is neither that this would make tcl exceptionally slower...
Rolf Ade
rolf@...
Jul 8, 2008 8:10 pm
Hello, I have a program that does the following: set err [catch { tDOM::xmlReadFile $file encoding } xml] ... set err [catch { dom parse -html $xml } doc] ... ...
... That doesn't make much sense (read: is probably plain wrong), if $file is a HTML file. ... As you first write correctly, it doesn't crash, it panics. That...
Rolf Ade
rolf@...
Oct 2, 2008 11:15 am
... No, that's not true. I do not have an encoding problem. The user of my program has an encoding problem. As he/she is the one that selects the file, that can...
... Except for the exceptions of that rule. Do you find it cooperative, to insist, that everyone out there have to follow what you think is right? You want a...
Rolf Ade
rolf@...
Oct 2, 2008 4:59 pm
... It's been there for years. It's frustrated and annoyed many people who have tried to use tDOM. Perhaps you will be luckier in trying to get Rolf to fix...
Come on, Dossy. Yes, you belong to the ones who have whined about that. And, allow me to say, you also belong to the ones who didn't listen to...
Rolf Ade
rolf@...
Oct 2, 2008 8:33 pm
... Just because my code has a bug in it doesn't mean yours doesn't, too. tDOM is, without question, broken in this regard. Nowhere else in the Tcl core does...
... Yes, tDOM catches the breaking of a very elementary, basic internal rule in a rude way. At least, you can hardly ignore the notice. And as long as you play...
Rolf Ade
rolf@...
Oct 2, 2008 11:28 pm
... Oh, I absolutely care - but, in a multi-threaded application, having the whole thing blow up is not okay. One thread terminating would be one thing, but...
... I have to agree. Stopping computation and raising an error would be much more user friendly. By that you could catch it and tell your user - sorry, we...
Martin S. Weber
Ephaeton@...
Oct 3, 2008 2:24 pm
... Just to stress an analogy: Take any libc function you like and in your mind replace "returning -1 (or returning NULL (or returning whatever value that...
Martin S. Weber
Ephaeton@...
Oct 3, 2008 2:34 pm
English isn't my native language; I guess, that's obvious. But normally, I seem to be able to express myself anyway. With the exception of this topic. What I...
Rolf Ade
rolf@...
Oct 4, 2008 6:17 pm
... I still feel, I haven't made my point clear enough. Gimme one more try. This is not about panic while feeding a broken document into tdom. If you feed a...
Rolf Ade
rolf@...
Oct 4, 2008 10:35 pm
... Nowhere in the Tcl core does it blow up with a Tcl_Panic on the same malformed Tcl_Obj. Only tDOM does. Yet, you're convinced your decision is the...
Sorry to follow up on this heated thread, but I now have the same problem with an "incorrect user input" causing aolserver to restart. The problem happens in...
The earlier part of this thread started with a post or two from me, I think. In my developer's case, they actually got a specific warning message from tDOM...
Hi Dossy, hi Rolf, while I agree with Dossy that a Tcl_Panic() isn't the nicest thing to do in an extension, it still is a valid response. The Tcl core does...
Hello, You seem to agree that you can only arrive at this situation by using an incorrect or buggy C extension. This is not true. The following code can lead...
... As I already wrote: if $file is in fact an HTML file, it's probably plain wrong, to use xmlReadFile. ... Ahhh! Great. That's probably a bug in xmlReadFile....