loewerj@... wrote:
>
> tDOM-0.5alpha2 is now available.
> Changes:
>
> - XPath improvements / bug fixes
> - asHTML serialization
> - begin coding of XML-Namespace features (DOM-level 2 methods)
>
> Best regards,
> Jochen.
>
Huh! That was fast. I had already prepared the tDOM 0.5
tarball with thread-safety changes and a couple of
memory-leak fixes in the tclexpat code built in, and was
about to upload it when I received this one...
Jochen, can I send you my modifications so you
can review them and possibly include them in future
distributions?
The thread-safety changes are compatible
with all Tcl versions starting from Tcl 8.0.
You just have to pass (or omit) --enable-threads
when compiling the Tcl core.
What do you think?
Cheers, Zoran
tDOM-0.5alpha2 is now available.
Changes:
- XPath improvements / bug fixes
- asHTML serialization
- begin coding of XML-Namespace features (DOM-level 2 methods)
Best regards,
Jochen.
I've uploaded a Windows tDOM 0.5 .dll binary to egroups/files.
This is not built from the original tDOM 0.5 code: I've included the
small patch I mailed a day ago. It is compiled with VC 6.0 for
Tcl 8.3.1, _not_ stubs-enabled.
This .dll should work out of the box with tclsh83 from the Ajuba
(Scriptics) Tcl/Tk 8.3.1 Windows binary distribution (it does at least
for me, tested with NT 4.0 and Windows 2000). Don't forget the Tcl tDOM
lib from the /lib dir of the distribution (you have to correct the
package provide line from 0.4 to 0.5) and an appropriate pkgIndex.tcl.
rolf
There is a small but serious bug in xpathFuncString. It rears its
head in XPath queries like "string(@xyz='abc')" or "boolean(@xyz='abd')"
if the context node doesn't have an xyz attribute. A code example:

    set doc [dom parse {<root a="b"/>}]
    set root [$doc documentElement]
    $root selectNodes "string(@c='xyz')"

xpathFuncString returns (char *)NULL under some circumstances. This
(char *)NULL can end up in a strcmp() call and, at least with some GNU
libc versions, cause a seg fault. The patch is obvious and, well, small.

(I must admit I wasn't aware of this strcmp idiosyncrasy. Therefore I
said "hmmmm?" after I had isolated the line of code that raises the
seg fault. Strangely, I didn't find a hint about this while browsing
standard places like Kernighan/Ritchie etc. While searching around, I
was amused by this sentence from the GNU libc info documentation:
"For instance, you could easily compare one string to another in two
lines of C code, but if you use the built-in `strcmp' function, you're
less likely to make a mistake.")

The patch is against 0.5; apply it in the generic dir.
rolf
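The crash described above comes from passing a null pointer to strcmp(), which the C standard leaves undefined. A minimal sketch of a guard follows; the helper names str_or_empty and safe_strcmp are hypothetical and are not taken from the tDOM patch, which is not shown in this message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of tDOM): map a NULL string to "". */
const char *str_or_empty(const char *s)
{
    return s ? s : "";
}

/* Compare two strings that may be NULL without ever handing NULL
 * to strcmp(); a NULL argument is treated like an empty string. */
int safe_strcmp(const char *a, const char *b)
{
    return strcmp(str_or_empty(a), str_or_empty(b));
}
```

With such a guard a missing attribute simply compares as an empty string, so a comparison like @c='xyz' evaluates to false instead of crashing.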
Hi everybody,
there is a 0.5 alpha version of tDOM available on
http://sdf.lonestar.org/~loewerj/tdom.cgi
It contains many smaller DOM bug fixes, XPath improvements/
bug fixes, XPath extension functions, indentation for asXML
and requires even less memory, ...
It's an alpha version, not well tested right now. The
DTD information events/callbacks will probably change for
the 0.5 beta/final version.
Jochen.
>From: rolf@...
>[domNodeObj removeAttribute ...] always seg faults. A code example:
>
> set doc [dom parse {<root a="b"/>}]
> set root [$doc documentElement]
> $root removeAttribute a
>
>The really, really small attached patch is a highlight in my recent
>series of small, simple-minded patches ;-). Yes, this one adds nothing,
>only removes a line. But it's definitely simple-minded, read below.
>
>(A few words, because these things are a bit weird and before you think
>I'm fooling you. After a first look (OK, maybe after a second look) it
>may seem perfectly OK not to free attr->name. This is because while
>parsing, every attribute name is stored in a Tcl hash table (details in
>startElement in dom.c). The memory used to store the attribute name is
>owned by the attrNames hash table.
>
>But if you take a second look (or a third, respectively), things are
>not that easy. The domNode method setAttribute allows the user to
>create new attributes. Newly created attributes are not stored in
>attrNames; instead, the attribute name is strdup'ed (details in
>domSetAttributes in dom.c). Either attrNames is a thing of the past that
>wasn't removed, or a new thing to come, because it isn't used within the
>code, only filled. To come up with a cleaner fix: Jochen, what's
>up with attrNames?)
Sorry, Rolf, all this has already been fixed in tDOM-0.5, to be released
today or tomorrow. Thanks for your help!
These bugs existed because some time back I switched from
Tcl_Alloc for each attribute name to a Tcl hash table in order
to optimize memory consumption (similar to element names).
________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com
[domNodeObj removeAttribute ...] always seg faults. A code example:

    set doc [dom parse {<root a="b"/>}]
    set root [$doc documentElement]
    $root removeAttribute a

The really, really small attached patch is a highlight in my recent
series of small, simple-minded patches ;-). Yes, this one adds nothing,
only removes a line. But it's definitely simple-minded, read below.

(A few words, because these things are a bit weird and before you think
I'm fooling you. After a first look (OK, maybe after a second look) it
may seem perfectly OK not to free attr->name. This is because while
parsing, every attribute name is stored in a Tcl hash table (details in
startElement in dom.c). The memory used to store the attribute name is
owned by the attrNames hash table.

But if you take a second look (or a third, respectively), things are
not that easy. The domNode method setAttribute allows the user to
create new attributes. Newly created attributes are not stored in
attrNames; instead, the attribute name is strdup'ed (details in
domSetAttributes in dom.c). Either attrNames is a thing of the past that
wasn't removed, or a new thing to come, because it isn't used within the
code, only filled. To come up with a cleaner fix: Jochen, what's
up with attrNames?)
rolf
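The ownership problem described above (names from the parser belong to a shared table, names from setAttribute are private copies) can be illustrated with a small C sketch. The Attr record, the ownsName flag and all function names below are assumptions made for illustration; tDOM's real domAttrNode looks different, and the actual tDOM fix may well be a different one (for example storing every name in the hash table).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified attribute record (not tDOM's domAttrNode). */
typedef struct Attr {
    char *name;      /* attribute name                                 */
    int   ownsName;  /* 1: private copy, 0: owned by a shared table    */
    char *value;     /* always owned by the attribute itself           */
} Attr;

/* Portable replacement for strdup(), which is not part of ISO C. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* Parser path: the name pointer refers to storage owned by a name table. */
Attr *attr_from_table(char *sharedName, const char *value)
{
    Attr *a = malloc(sizeof *a);
    assert(a != NULL);
    a->name = sharedName;
    a->ownsName = 0;
    a->value = copy_string(value);
    return a;
}

/* setAttribute path: the attribute keeps its own copy of the name. */
Attr *attr_with_copy(const char *name, const char *value)
{
    Attr *a = malloc(sizeof *a);
    assert(a != NULL);
    a->name = copy_string(name);
    a->ownsName = 1;
    a->value = copy_string(value);
    return a;
}

/* Free the attribute. The name is freed only if the attribute owns it.
 * Returns 1 if the name was freed, 0 if it was left to the table. */
int attr_free(Attr *a)
{
    int freedName = a->ownsName;
    if (freedName) {
        free(a->name);
    }
    free(a->value);
    free(a);
    return freedName;
}
```

Freeing a table-owned name would corrupt the table, and never freeing a private copy would leak it; recording who owns the name avoids both.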
--- In tdom@egroups.com, rolf@p... wrote:
>
> Well, some thoughts (of course partly driven by personal interests)...
> expat is a fairly stable and very fast underlying XML parser, but not
> validating. Since the fairly recent XML::Parser 2.28 the Perl people
> use an enhanced expat. Still no validation, but there is an interface to
> get the info out of the doctype declaration, to do your things by
> yourself. Of course a good thing (and not too much work to use the Perl
> people's expat instead of the original, I've some code at hand),
Rolf, for tDOM-0.5 I wanted to switch to the expat from XML::Parser 2.29.
Not all is complete right now. What code do you have? Maybe
I can leverage it?
Jochen.
Was on a short vacation and should leave now, but a short
reply:
--- In tdom@egroups.com, rolf@p... wrote:
>
> An XPath query containing multiple alternatives separated by | returns
> the error "| requires node sets!" if one or more of the alternatives
> results in an empty nodeset. A code example:
>
>   set doc [dom parse {<root><a/><b/></root>}]
>   set root [$doc documentElement]
>   $root selectNodes "a|c"
>
> The problem is the if statement starting in line 1814 of domxpath.c. It
> doesn't respect that the empty result is also a nodeset: the empty
> nodeset. My (simple-minded) fix isn't deeply checked. There may be
> unexpected side effects under some circumstances (Jochen?).
> The tDOM XPath functions number(), string(), string-length() and
> normalize-space() require exactly one arg. But the spec allows
> omitting this arg; they then default to the context node. (No
> simple-minded fast fix this time, exceptionally. I didn't find one within
> a few minutes, and getting my simple-minded, basic tDOM-based Tcl XSLT
> processor out of the door as some kind of "look what may be
> possible" before the European Tcl user meeting has higher priority.)
I will evaluate/apply the patch or find another solution over the
weekend. The tDOM-0.5 alpha should be released next week
(before the Tcl User Meeting in Hamburg).
The simple-minded XSLT with tDOM looks very interesting!!!
Jochen.
On 8 Jun, Steve Ball wrote:
> rolf@... wrote:
>> expat is a fairly stable and very fast underlying XML parser, but not
>> validating. Since the fairly recent XML::Parser 2.28 the Perl people
>> use an enhanced expat.
>
> Ajuba (nee Scriptics) hacked expat for the same effect. If you check
> out the expat module from the Ajuba NetCVS repository and build
> TclExpat using it, you get callbacks for the markup declarations.
Well, the Scriptics AttlistDecl callback offers only the attribute
name, not the details like enumerations etc. You can't reach complete
DTD validation level with this. The Perl version does it better (with
a slightly weird interface, but nevertheless).
> Again, no validation but you can do your own thing. In particular,
> one may need to handle entity replacement and resolve external
> entities.
XML::Parser 2.28 already offers some infrastructure to do this. But
anyway, as I pointed out, I try to avoid DTD validation, skipping
directly to XML Schema.
>> [...]
>> lot of hassle on the way to DTD validation. And much worse, DTD
>> validation isn't enough in a lot of cases. XML Schema offers a lot
>> more control - and you need only what you have: an XML parser. As
>> always, a C implementation would be best, but how often will you get
>> the best at the first try? I plan to implement a small XML Schema
>> subset at Tcl level within the next 4 weeks (because I need it in a
>> real-world project).[...]
>
> Absolutely agree: the fact that XML Schemas are themselves XML documents
> has the property of making it easy to parse. Skipping straight over
> DTDs to Schemas is the way to go at this stage (but we need a DTD to
> Schema translator - a pet project for someone on the list?).
Some reasonable tools that do that already exist (although not open
source). Within real-world business processes, schema definitions
don't change often. Personally, I don't feel a pressing need for a
DTD to Schema translator.
rolf
rolf@... wrote:
>
> Well, some thoughts (of course partly driven by personal interests)...
> expat is a fairly stable and very fast underlying XML parser, but not
> validating. Since the fairly recent XML::Parser 2.28 the Perl people
> use an enhanced expat.
Ajuba (nee Scriptics) hacked expat for the same effect. If you check
out the expat module from the Ajuba NetCVS repository and build
TclExpat using it, you get callbacks for the markup declarations.
Again, no validation but you can do your own thing. In particular,
one may need to handle entity replacement and resolve external
entities.
BTW, my Tcl-only parser now also has callbacks for markup declarations,
can resolve external entities, etc.
> Still no validation, but there is an interface to
> get the info out of the doctype declaration, to do your things by
> yourself. Of course a good thing (and not too much work to use the Perl
> people's expat instead of the original, I've some code at hand), but still a
> lot of hassle on the way to DTD validation. And much worse, DTD
> validation isn't enough in a lot of cases. XML Schema offers a lot
> more control - and you need only what you have: an XML parser. As
> always, a C implementation would be best, but how often will you get
> the best at the first try? I plan to implement a small XML Schema
> subset at Tcl level within the next 4 weeks (because I need it in a
> real-world project). No special effort needed for my personal case, but
> in general I love the idea of some kind of "hook" mechanism for the
> SAX events. Comments?
Absolutely agree: the fact that XML Schemas are themselves XML documents
has the property of making it easy to parse. Skipping straight over
DTDs to Schemas is the way to go at this stage (but we need a DTD to
Schema translator - a pet project for someone on the list?).
The only thing I ask is that the Schema Validator be built as a package
on top of TclXML v2.0, that way people can use whatever parser they
like underneath - Tcl-only, expat or Xerces-C. In this scenario,
I can help out with the coding too.
Cheers,
Steve Ball
--
Steve Ball | Swish XML Editor | Training & Seminars
Zveno Pty Ltd | Web Tcl Complete | XML XSL
http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development
Steve.Ball@... +-----------------------+---------------------
Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099
Well, some thoughts (of course partly driven by personal interests)...
expat is a fairly stable and very fast underlying XML parser, but not
validating. Since the fairly recent XML::Parser 2.28 the Perl people
use an enhanced expat. Still no validation, but there is an interface to
get the info out of the doctype declaration, to do your things by
yourself. Of course a good thing (and not too much work to use the Perl
people's expat instead of the original, I've some code at hand), but still a
lot of hassle on the way to DTD validation. And much worse, DTD
validation isn't enough in a lot of cases. XML Schema offers a lot
more control - and you need only what you have: an XML parser. As
always, a C implementation would be best, but how often will you get
the best at the first try? I plan to implement a small XML Schema
subset at Tcl level within the next 4 weeks (because I need it in a
real-world project). No special effort needed for my personal case, but
in general I love the idea of some kind of "hook" mechanism for the
SAX events. Comments?
rolf
An XPath query containing multiple alternatives separated by | returns
the error "| requires node sets!" if one or more of the alternatives
results in an empty nodeset. A code example:

    set doc [dom parse {<root><a/><b/></root>}]
    set root [$doc documentElement]
    $root selectNodes "a|c"

The problem is the if statement starting in line 1814 of domxpath.c. It
doesn't respect that the empty result is also a nodeset: the empty
nodeset. My (simple-minded) fix isn't deeply checked. There may be
unexpected side effects under some circumstances (Jochen?).

The tDOM XPath functions number(), string(), string-length() and
normalize-space() require exactly one arg. But the spec allows
omitting this arg; they then default to the context node. (No
simple-minded fast fix this time, exceptionally. I didn't find one within
a few minutes, and getting my simple-minded, basic tDOM-based Tcl XSLT
processor out of the door as some kind of "look what may be
possible" before the European Tcl user meeting has higher priority.)
rolf
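The point about the union operator, that an empty result is still a valid node set, can be sketched in C as follows. The Result type, its EMPTY_RESULT tag and the function names are assumptions for illustration only; they are not the data structures of domxpath.c, and the sketch is not the patch mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical result type; tDOM's real XPath result set differs. */
typedef enum { EMPTY_RESULT, NODESET_RESULT, STRING_RESULT } ResultType;

typedef struct {
    ResultType type;
    int        nodes[MAX_NODES];  /* node ids, valid for NODESET_RESULT */
    size_t     count;
} Result;

/* An operand of '|' is acceptable if it is a node set OR empty. */
int is_nodeset(const Result *r)
{
    return r->type == NODESET_RESULT || r->type == EMPTY_RESULT;
}

/* Store the union of left and right in out (out must not alias them).
 * Returns 0 on success, -1 if an operand is not a node set. */
int union_results(const Result *left, const Result *right, Result *out)
{
    const Result *ops[2];
    size_t i, j, k;

    ops[0] = left;
    ops[1] = right;
    out->type = EMPTY_RESULT;
    out->count = 0;
    if (!is_nodeset(left) || !is_nodeset(right)) {
        return -1;                      /* "| requires node sets!" */
    }
    for (i = 0; i < 2; i++) {
        /* An empty operand contributes no nodes. */
        size_t n = (ops[i]->type == NODESET_RESULT) ? ops[i]->count : 0;
        for (j = 0; j < n; j++) {
            /* Skip nodes that are already in the output set. */
            for (k = 0; k < out->count; k++) {
                if (out->nodes[k] == ops[i]->nodes[j]) break;
            }
            if (k == out->count) {
                assert(out->count < MAX_NODES);
                out->nodes[out->count++] = ops[i]->nodes[j];
            }
        }
    }
    if (out->count > 0) {
        out->type = NODESET_RESULT;
    }
    return 0;
}
```

In this sketch "a|c" with no c children succeeds and yields just the a nodes, while a string operand is still rejected.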
Derk Muenchhausen <derk.muenchhausen@n...> wrote:
>
> I just found some problems:
>
> 1. tDOM's unknown method is a problem if I use dots within
> command names:
> It crashes ;-)
> I have just commented out the tDOM unknown, so it works. My idea:
> Maybe use a more special naming convention for tDOM commands or
> introduce a switch...
A check for domNode* or domDoc* at the beginning of a command in
the "obj.method1.method2" syntax could solve it.
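The suggested check could look roughly like this in C. The function name looks_like_tdom_object is hypothetical, and where such a test would sit inside tDOM's unknown handling is not specified here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the command name starts with
 * "domNode" or "domDoc", 0 otherwise. Only such names would then be
 * treated as "obj.method1.method2" calls. */
int looks_like_tdom_object(const char *cmd)
{
    assert(cmd != NULL);
    return strncmp(cmd, "domNode", 7) == 0
        || strncmp(cmd, "domDoc", 6) == 0;
}
```

A user command such as my.own.command would then fall through to the normal unknown handling instead of being split at the dots.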
> 2. A wish: tDOM asXML is very fast and useful! A switch to leave
> the "indent-spaces" away would be nice.
> I use tDOM for RPCs, so performance and "compactness" are most
> important.
I added a "-indent <number>" option for asXML in tDOM-0.5 today.
    $node asXML -indent 0
will then produce each element on a separate line, but with no indentation
at all.
The default is still 4, the maximum 8.
> 3. A bug I found: removeAttr crashes. The original version from tDOM
> 0.4beta with my correction:
> int
> domRemoveAttribute (
>     domNode *node,
>     char    *attributeName
> )
> {
>     domAttrNode *attr, *previous = NULL;
>
>     /*----------------------------------------------------
>     |   try to find the attribute
>     \---------------------------------------------------*/
>     attr = node->firstAttr;
>     while (attr && strcmp(attr->nodeName, attributeName)) {
>         previous = attr;
>         attr = attr->nextSibling;
>     }
>
>     // ORIGINAL: if (attr) {
>     // my correction:
>     if (attr && strcmp(attr->nodeName, attributeName)==0) {
>
>         if (previous) {
>             previous->nextSibling = attr->nextSibling;
>         } else {
>             attr->parentNode->firstAttr = attr->nextSibling;
>         }
>         if (!attr->nextSibling) {
>             attr->parentNode->lastAttr = previous;
>         }
>         Tcl_Free (attr->nodeValue);
>
>         // ORIGINAL: free (attr->nodeName);
>         // my correction: leave it away... ;-)
>
>         Tcl_Free ((void*)attr);
>         return 0;
>     }
>     return -1;
> }
Sorry, I can't see the problem here, or why this would behave badly.

    attr = node->firstAttr;
    while (attr && strcmp(attr->nodeName, attributeName)) {
        previous = attr;
        attr = attr->nextSibling;
    }

should exit with attr == NULL if the attribute name wasn't
found (or if there are no attributes at all).
But the "free (attr->nodeName)" is wrong, since in the
meantime nodeName points to a hash table entry.
Unfortunately domSetAttribute still had the old code :-(
which I have now corrected.
Just wait for tDOM-0.5
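The argument about the search loop can be checked with a self-contained sketch of the same list logic. The AttrNode and AttrList types and the function names are simplified assumptions, not tDOM's structures; names are not owned by the nodes here, mirroring the point that the name must not be freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified attribute list (not tDOM's domAttrNode). */
typedef struct AttrNode {
    const char      *name;  /* not owned by the node in this sketch */
    struct AttrNode *next;
} AttrNode;

typedef struct {
    AttrNode *first;
    AttrNode *last;
} AttrList;

/* Append a new attribute node with the given name to the list. */
AttrNode *attr_append(AttrList *list, const char *name)
{
    AttrNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->name = name;
    node->next = NULL;
    if (list->last) {
        list->last->next = node;
    } else {
        list->first = node;
    }
    list->last = node;
    return node;
}

/* Unlink and free the attribute called name.
 * Returns 0 if it was removed, -1 if no such attribute exists. */
int attr_remove(AttrList *list, const char *name)
{
    AttrNode *attr = list->first, *previous = NULL;

    /* The loop stops on a match, or with attr == NULL when the list
     * is exhausted, so a plain "if (attr)" afterwards is sufficient. */
    while (attr && strcmp(attr->name, name) != 0) {
        previous = attr;
        attr = attr->next;
    }
    if (!attr) {
        return -1;
    }
    if (previous) {
        previous->next = attr->next;
    } else {
        list->first = attr->next;
    }
    if (!attr->next) {
        list->last = previous;
    }
    free(attr);  /* the node only; the name belongs to someone else */
    return 0;
}
```

Removing a missing attribute returns -1 without touching the list, and removing the last remaining attribute leaves both first and last as NULL.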
Hi again,
Derk Muenchhausen <derk.muenchhausen@...> wrote:
I just found some problems:
1. tDOM's unknown method is a problem if I use dots within
command names:
It crashes ;-)
I have just commented out the tDOM unknown, so it works. My idea:
Maybe use a more special naming convention for tDOM commands or
introduce a switch...
2. A wish: tDOM asXML is very fast and useful! A switch to leave
the "indent-spaces" away would be nice.
I use tDOM for RPCs, so performance and "compactness" are most
important.
3. A bug I found: removeAttr crashes. The original version from tDOM 0.4beta
with my correction:
int
domRemoveAttribute (
    domNode *node,
    char    *attributeName
)
{
    domAttrNode *attr, *previous = NULL;

    /*----------------------------------------------------
    |   try to find the attribute
    \---------------------------------------------------*/
    attr = node->firstAttr;
    while (attr && strcmp(attr->nodeName, attributeName)) {
        previous = attr;
        attr = attr->nextSibling;
    }

    // ORIGINAL: if (attr) {
    // my correction:
    if (attr && strcmp(attr->nodeName, attributeName)==0) {

        if (previous) {
            previous->nextSibling = attr->nextSibling;
        } else {
            attr->parentNode->firstAttr = attr->nextSibling;
        }
        if (!attr->nextSibling) {
            attr->parentNode->lastAttr = previous;
        }
        Tcl_Free (attr->nodeValue);

        // ORIGINAL: free (attr->nodeName);
        // my correction: leave it away... ;-)

        Tcl_Free ((void*)attr);
        return 0;
    }
    return -1;
}
Thanks, Rolf, for both fixes -- they will be in tDOM-0.5
Jochen.
>From: rolf@...
>Reply-To: tdom@egroups.com
>To: tdom@egroups.com
>Subject: [tdom] Small bug fixes
>Date: Wed, 17 May 2000 02:35:47 +0200 (MEST)
>
>
>dom "called with nothing than a very long (more than 99 chars) String
>as the only argument seg faults for 0.4 (instead of giving an error
>messages) and this has bothered me therefor I fix it with the appended
>simple minded patch"
>
>Since I'm writing, and to give a little bit more than this lame piece
>above, I've added one of my personal favorites: the second patch modifies
>Makefile.in in a way that configure creates a Makefile with
>tcldomsh depending on the files in generic.
>
>Apply tcldom.patch in generic, Makefile.in.patch in unix.
>
>rolf
>
>
><< tcldom.patch >>
><< Makefile.in.patch >>
>From: rolf@...
>appendChild func in some cases. Setting the parentNode of a removed node to
>NULL (the line that does this is the whole patch) should be perfectly
>OK since the fragment list is managed as one long list of siblings. Is
>this all true, Jochen?
Absolutely correct. I applied your patch for tDOM-0.5.
> > % dom parse {
> > <test>
> > <a n="dr"/>
> > <a n="re"/>
> > <a n="ab"/>
> > </test>
> > }
> > domDoc1
> > % domDoc1 documentElement
> > domNode1
> > % domNode1 childNodes
> > domNode2 domNode3 domNode4
> > % domDoc1 createElement group
> > domNode5
> > % domNode1 appendChild domNode5
> > domNode5
> > % domNode1 removeChild domNode2
> > domNode2
> > % domNode5 appendChild domNode2
> > child already appended!
> > %
The appended patch fixes this. Apply it in the generic dir. The
parentNode of a removed node wasn't set to NULL, and this confused the
appendChild func in some cases. Setting the parentNode of a removed node to
NULL (the line that does this is the whole patch) should be perfectly
OK since the fragment list is managed as one long list of siblings. Is
this all true, Jochen?
rolf
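The interaction between removeChild and appendChild described above can be reproduced with a small tree sketch in C. The Node type and both functions are assumptions for illustration; in particular the sketch leaves out tDOM's document fragment list, into which removed nodes are moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal tree node (not tDOM's domNode). */
typedef struct Node {
    struct Node *parent;
    struct Node *firstChild, *lastChild;
    struct Node *prev, *next;
} Node;

/* Append child as last child of parent.
 * Returns 0 on success, -1 if the child still has a parent
 * (the "child already appended!" case). */
int append_child(Node *parent, Node *child)
{
    assert(parent != NULL && child != NULL);
    if (child->parent != NULL) {
        return -1;
    }
    child->parent = parent;
    child->prev = parent->lastChild;
    child->next = NULL;
    if (parent->lastChild) {
        parent->lastChild->next = child;
    } else {
        parent->firstChild = child;
    }
    parent->lastChild = child;
    return 0;
}

/* Unlink child from parent. Returns 0 on success, -1 if child is not
 * a child of parent. */
int remove_child(Node *parent, Node *child)
{
    assert(parent != NULL && child != NULL);
    if (child->parent != parent) {
        return -1;
    }
    if (child->prev) {
        child->prev->next = child->next;
    } else {
        parent->firstChild = child->next;
    }
    if (child->next) {
        child->next->prev = child->prev;
    } else {
        parent->lastChild = child->prev;
    }
    child->prev = NULL;
    child->next = NULL;
    child->parent = NULL;  /* without this, a later append is refused */
    return 0;
}
```

With the parent pointer cleared on removal, the sequence from the bug report (remove a node from the root, then append it to a sibling element) goes through.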
On 17 May, Artur Trzewik wrote:
> [...]
> % dom parse {
> <test>
> <a n="dr"/>
> <a n="re"/>
> <a n="ab"/>
> </test>
> }
> domDoc1
> % domDoc1 documentElement
> domNode1
> % domNode1 childNodes
> domNode2 domNode3 domNode4
> % domDoc1 createElement group
> domNode5
> % domNode1 appendChild domNode5
> domNode5
> % domNode1 removeChild domNode2
> domNode2
> % domNode5 appendChild domNode2
> child already appended!
> %
--- /home/rolf/tmp/tDOM-0.4/generic/dom.c Thu Dec 30 21:22:35 1999
+++ dom.c Wed May 17 22:54:06 2000
@@ -798,6 +798,8 @@
child->ownerDocument->fragments = child;
child->nextSibling = NULL;
}
+ child->parentNode = NULL;
+ printf ("parentNode of removed Child set to NULL\n");
child->previousSibling = NULL;
return OK;
}
Hello!
One cannot move tags into substructures
with
removeChild
and
appendChild:
this causes the error "child already appended!" (although the child was removed).
The goal is to get from XML like
<test>
  <a n="dr"/>
  <a n="re"/>
  <a n="ab"/>
</test>
to
<test>
  <a n="re"/>
  <a n="ab"/>
  <group>
    <a n="dr"/>
  </group>
</test>
The same again in more detail (translated from German):
With tDOM you cannot move a tag into another tag at the same level.
tDOM refuses the operation with the message (child already appended).
Please step through the example interactively.
% dom parse {
<test>
<a n="dr"/>
<a n="re"/>
<a n="ab"/>
</test>
}
domDoc1
% domDoc1 documentElement
domNode1
% domNode1 childNodes
domNode2 domNode3 domNode4
% domDoc1 createElement group
domNode5
% domNode1 appendChild domNode5
domNode5
% domNode1 removeChild domNode2
domNode2
% domNode5 appendChild domNode2
child already appended!
%
# And afterwards the bizarre part
% domNode2 parentNode
domNode1
% domNode1 childNodes
domNode3 domNode4 domNode5
The last lines may contain a hint at a possible programming error:
domNode2 reports domNode1 as its parent although it was removed.
Similar operations are possible, however, if the nodes are not on the
same branch of the tree.
I also found no way to really delete nodes.
According to the documentation, removeChild puts them into the
"document fragment list".
So they still occupy memory until the whole document is deleted.
Artur Trzewik
dom "called with nothing than a very long (more than 99 chars) String
as the only argument seg faults for 0.4 (instead of giving an error
messages) and this has bothered me therefor I fix it with the appended
simple minded patch"
Since I'm writing, and to give a little bit more than this lame piece
above, I've added one of my personal favorites: the second patch modifies
Makefile.in in a way that configure creates a Makefile with
tcldomsh depending on the files in generic.
Apply tcldom.patch in generic, Makefile.in.patch in unix.
rolf
--- /home/rolf/tmp/tDOM-0.4/generic/tcldom.c Mon Jan 17 21:31:36 2000
+++ ../generic/tcldom.c Wed May 17 01:21:58 2000
@@ -2210,9 +2210,14 @@
/*--------------------------------------------------------
| try to find method implemented as normal Tcl proc
\-------------------------------------------------------*/
+
+ if (strlen (method) < 100) {
sprintf (tmp, "::dom::DOMImplementation::%s",method);
DBG(fprintf(stderr, "testing %s\n", tmp);)
result = Tcl_GetCommandInfo (interp, tmp, &cmdInfo);
+ } else {
+ result = 0;
+ }
if (!result) {
SetResult (interp, dom_usage);
return TCL_ERROR;
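The patch above guards an unbounded sprintf with a length check on the method name. An alternative way to get the same protection is to bound the write itself, sketched here with C99 snprintf. The function name build_proc_name and the buffer size in the usage are assumptions; the size of tmp in tcldom.c is not shown in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "::dom::DOMImplementation::<method>"
 * into buf without ever writing past size bytes.
 * Returns 1 if the full name fit, 0 if it had to be truncated. */
int build_proc_name(char *buf, size_t size, const char *method)
{
    int n;

    assert(buf != NULL && size > 0 && method != NULL);
    n = snprintf(buf, size, "::dom::DOMImplementation::%s", method);
    return n >= 0 && (size_t)n < size;
}
```

A caller would treat a 0 return like the else branch of the patch, that is, as "no such command", and report the usage error.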
A few weeks ago I hacked together a makefile that seems to work with MS Visual
C++ 6.0.
My knowledge of the MS Visual C++ products is quite
limited. Therefore I started with the makefile.vc from the tcl8.3.0
distribution (maybe not the wisest decision; the win makefile
of a small extension like tcLex would probably have been a better starting
point) and hacked my way through with - hopefully - some common sense, my much
broader general knowledge of such topics and some trial and error.
I stopped immediately after I got something working. Since then I have found
no motivation or even time to go further. Since this seems to hold
for the next weeks as well, I decided to give this to a broader audience,
with my apologies for the desolate state of the hack.
Please note:
- This thing isn't cleaned up in any way.
- There isn't a win tcldomsh; you have to use the dll.
- There is no installation rule. You have to install the dll by
  hand. Create a child dir in lib/tcl8.3 below your Tcl installation,
  name it, say, tDOM and copy the dll into this dir. Don't forget to do
  the same with tdom.tcl from tDOM-0.4/lib. You also have to create an
  appropriate pkgIndex.tcl file.
- The dll is linked against libtcl, not libstubtcl.
- Some compiler warnings remain.
The patch adds a few defines and changes only one thing: the
encodingName argument to the dom setResultEncoding method becomes case
sensitive when built with MSVC.
Apply patch -p0 < ms.patch within the parent dir of your clean tDOM-0.4
source code installation.
I'll upload a tDOM04.dll to the egroups tDOM file area that works out of
the box with a Scriptics tclsh binary 8.3.1 (at least for me). This is
tested only on Windows 2000.
rolf
diff -u -r -N tDOM-0.4-orig/generic/tcldom.h tDOM-0.4/generic/tcldom.h
--- tDOM-0.4-orig/generic/tcldom.h Mon Jan 17 19:42:08 2000
+++ tDOM-0.4/generic/tcldom.h Sun May 14 23:12:54 2000
@@ -49,3 +49,24 @@
#endif
+
+
+#define STR_TDOM_VERSION(v) ("0.4")
+#define DLL_BUILD
+# undef TCL_STORAGE_CLASS
+# define TCL_STORAGE_CLASS
+# define DLLEXPORT __declspec(dllexport)
+#ifndef STATIC_BUILD
+#if defined(_MSC_VER)
+# define EXPORT(a,b) __declspec(dllexport) a b
+# define DllEntryPoint DllMain
+#else
+# if defined(__BORLANDC__)
+# define EXPORT(a,b) a _export b
+# else
+# define EXPORT(a,b) a b
+# endif
+#endif
+#endif
+
+
diff -u -r -N tDOM-0.4-orig/generic/tdominit.c tDOM-0.4/generic/tdominit.c
--- tDOM-0.4-orig/generic/tdominit.c Mon Jan 17 21:58:20 2000
+++ tDOM-0.4/generic/tdominit.c Tue May 09 21:34:29 2000
@@ -45,9 +45,7 @@
#include <dom.h>
#include <tcldom.h>
-
-#define STR_TDOM_VERSION(v) (#v)
-
+EXTERN EXPORT(int,Tdom_Init) _ANSI_ARGS_((Tcl_Interp *interp));
/*
*----------------------------------------------------------------------------
@@ -69,8 +67,9 @@
Tdom_Init (interp)
Tcl_Interp *interp; /* Interpreter to initialise. */
{
- /* Tcl_InitStubs(interp, "8.1", 0);
- */
+#ifdef USE_TCL_STUBS
+ Tcl_InitStubs(interp, "8.3", 0);
+#endif
domModuleInitialize();
Tcl_Eval (interp,"rename unknown unknown_tdom");
diff -u -r -N tDOM-0.4-orig/generic/utf8conv.c tDOM-0.4/generic/utf8conv.c
--- tDOM-0.4-orig/generic/utf8conv.c Fri Nov 12 10:01:24 1999
+++ tDOM-0.4/generic/utf8conv.c Sun May 14 23:11:52 2000
@@ -82,7 +82,11 @@
while (encoding && encoding->name) {
DBG(fprintf(stderr, "encoding=%x encoding->name='%s' name='%s'",
encoding, encoding->name, name);)
+#if defined(_MSC_VER)
+ if (strcmp(encoding->name,name)==0) {
+#else
if (strcasecmp(encoding->name,name)==0) {
+#endif
return encoding;
}
encoding++;
diff -u -r -N tDOM-0.4-orig/win/makefile.vc tDOM-0.4/win/makefile.vc
--- tDOM-0.4-orig/win/makefile.vc Thu Jan 01 01:00:00 1970
+++ tDOM-0.4/win/makefile.vc Sun May 14 23:52:35 2000
@@ -0,0 +1,234 @@
+# This is derivated from the tcl8.3 win makefile (C) by Scriptics Corp.
+# _and not cleaned up in any way_
+# blame rolf@...
+#
+# Project directories
+#
+# ROOT = top of source tree
+#
+# TOOLS32 = location of VC++ 32-bit development tools. Note that the
+# VC++ 2.0 header files are broken, so you need to use the
+# ones that come with the developer network CD's, or later
+# versions of VC++.
+#
+# INSTALLDIR = where the install- targets should copy the binaries and
+# support files
+#
+
+# Set this to the appropriate value of /MACHINE: for your platform
+MACHINE = IX86
+
+ROOT = ..
+INSTALLDIR = c:\Progra~1\Tcl
+
+TOOLS32 = C:\Programme\Microsoft Visual Studio\VC98
+TOOLS32_rc = C:\Programme\Microsoft Visual Studio\Common\MSDev98
+
+# Uncomment the following line to compile with thread support
+#THREADDEFINES = -DTCL_THREADS=1
+
+# Set NODEBUG to 0 to compile with symbols
+NODEBUG = 1
+
+# The following defines can be used to control the amount of debugging
+# code that is added to the compilation.
+#
+# -DTCL_MEM_DEBUG Enables the debugging memory allocator.
+# -DTCL_COMPILE_DEBUG Enables byte compilation logging.
+# -DTCL_COMPILE_STATS Enables byte compilation statistics gathering.
+# -DUSE_TCLALLOC=0 Disables the Tcl memory allocator in favor
+# of the native malloc implementation. This is
+# needed when using Purify.
+#
+#DEBUGDEFINES = -DTCL_MEM_DEBUG -DTCL_COMPILE_DEBUG -DTCL_COMPILE_STATS
+#DEBUGDEFINES = -DUSE_TCLALLOC=0
+
+######################################################################
+# Do not modify below this line
+######################################################################
+
+NAMEPREFIX = tdom
+STUBPREFIX = $(NAMEPREFIX)stub
+DOTVERSION = 0.4
+VERSION = 04
+
+BINROOT = .
+!IF "$(NODEBUG)" == "1"
+TMPDIRNAME =
+DBGX =
+!ELSE
+TMPDIRNAME = Debug
+DBGX = d
+!ENDIF
+TMPDIR = $(BINROOT)
+OUTDIRNAME = $(TMPDIRNAME)
+OUTDIR = $(TMPDIR)
+TOP_DIR = $(BINROOT)\..
+GENERIC_DIR = $(TOP_DIR)\generic
+EXPAT_GENNM_DIR = $(TOP_DIR)\expat\gennmtab
+EXPAT_PARSE_DIR = $(TOP_DIR)\expat\xmlparse
+EXPAT_TOK_DIR = $(TOP_DIR)\expat\xmltok
+
+
+TDOMLIB = $(OUTDIR)\$(NAMEPREFIX)$(VERSION)$(DBGX).lib
+TDOMDLLNAME = $(NAMEPREFIX)$(VERSION)$(DBGX).dll
+TDOMDLL = $(OUTDIR)\$(TDOMDLLNAME)
+
+TCLSTUBLIBNAME = $(STUBPREFIX)$(VERSION)$(DBGX).lib
+TCLSTUBLIB = $(OUTDIR)\$(TCLSTUBLIBNAME)
+
+TCLSH = $(OUTDIR)\$(NAMEPREFIX)sh$(VERSION)$(DBGX).exe
+TCLSHP = $(OUTDIR)\$(NAMEPREFIX)shp$(VERSION)$(DBGX).exe
+TCLPIPEDLLNAME = $(NAMEPREFIX)pip$(VERSION)$(DBGX).dll
+TCLPIPEDLL = $(OUTDIR)\$(TCLPIPEDLLNAME)
+TCLREGDLLNAME = $(NAMEPREFIX)reg$(VERSION)$(DBGX).dll
+TCLREGDLL = $(OUTDIR)\$(TCLREGDLLNAME)
+TCLDDEDLLNAME = $(NAMEPREFIX)dde$(VERSION)$(DBGX).dll
+TCLDDEDLL = $(OUTDIR)\$(TCLDDEDLLNAME)
+TCLTEST = $(OUTDIR)\$(NAMEPREFIX)test.exe
+CAT32 = $(TMPDIR)\cat32.exe
+MKDIR = .\mkd.bat
+RM = del
+
+LIB_INSTALL_DIR = $(INSTALLDIR)\lib
+BIN_INSTALL_DIR = $(INSTALLDIR)\bin
+SCRIPT_INSTALL_DIR = $(INSTALLDIR)\lib\tcl$(DOTVERSION)
+INCLUDE_INSTALL_DIR = $(INSTALLDIR)\include
+
+
+TDOMOBJS = $(TMPDIR)\xmlrole.obj \
+ $(TMPDIR)\xmltok.obj \
+ $(TMPDIR)\xmlparse.obj \
+ $(TMPDIR)\xmlsimple.obj \
+ $(TMPDIR)\hashtable.obj \
+ $(TMPDIR)\utf8conv.obj \
+ $(TMPDIR)\dom.obj \
+ $(TMPDIR)\domxpath.obj \
+ $(TMPDIR)\tclexpat.obj \
+ $(TMPDIR)\tcldom.obj \
+ $(TMPDIR)\tdominit.obj
+
+cc32 = "$(TOOLS32)\bin\cl.exe"
+link32 = "$(TOOLS32)\bin\link.exe"
+rc32 = "$(TOOLS32_rc)\bin\rc.exe"
+include32 = -I"$(TOOLS32)\include"
+libpath32 = /LIBPATH:"$(TOOLS32)\lib"
+tcllibpath = /LIBPATH:"C:\Programme\Tcl\lib"
+lib32 = "$(TOOLS32)\bin\lib.exe"
+
+WINDIR = $(ROOT)\win
+GENERICDIR = $(ROOT)\generic
+XMLTOKDIR = $(ROOT)\expat\xmltok
+XMLPARSERDIR = $(ROOT)\expat\xmlparse
+TCLINCDIR = C:\Programme\Tcl\Include
+
+TCL_INCLUDES = -I"$(WINDIR)" -I"$(GENERICDIR)" -I"$(XMLTOKDIR)" -I"$(XMLPARSERDIR)" -I"$(TCLINCDIR)"
+TCL_DEFINES = $(DEBUGDEFINES) $(THREADDEFINES)
+
+######################################################################
+# Compile flags
+######################################################################
+
+!IF "$(NODEBUG)" == "1"
+# This cranks the optimization level to maximize speed
+cdebug = -O2 -Gs -GD
+!ELSE
+!IF "$(MACHINE)" == "IA64"
+cdebug = -Od -Zi
+!ELSE
+cdebug = -Z7 -Od -WX
+!ENDIF
+!ENDIF
+
+# declarations common to all compiler options
+#cflags = -c -W3 -nologo -Fp$(TMPDIR)\ -YX -DXML_DTD -DXML_NS -DUSE_TCL_STUBS
+cflags = -c -W3 -nologo -Fp$(TMPDIR)\ -YX -DXML_DTD -DXML_NS
+cvarsdll = -MD$(DBGX)
+
+TCL_CFLAGS = $(cdebug) $(cflags) $(cvarsdll) $(include32) \
+ $(TCL_INCLUDES) $(TCL_DEFINES)
+CON_CFLAGS = $(cdebug) $(cflags) $(include32) -DCONSOLE
+
+######################################################################
+# Link flags
+######################################################################
+
+!IF "$(NODEBUG)" == "1"
+ldebug = /RELEASE
+!ELSE
+ldebug = -debug:full -debugtype:cv
+!ENDIF
+
+# declarations common to all linker options
+lflags = /NODEFAULTLIB /NOLOGO /MACHINE:$(MACHINE) $(libpath32) $(tcllibpath)
+
+# declarations for use on Intel i386, i486, and Pentium systems
+DLLENTRY = @12
+dlllflags = $(lflags) -entry:_DllMainCRTStartup$(DLLENTRY) -dll
+
+
+conlflags = $(lflags) -subsystem:console -entry:mainCRTStartup
+guilflags = $(lflags) -subsystem:windows -entry:WinMainCRTStartup
+
+libc = libc$(DBGX).lib oldnames.lib
+libcdll = msvcrt$(DBGX).lib oldnames.lib
+
+#baselibs = kernel32.lib $(optlibs) advapi32.lib user32.lib tclstub83.lib
+baselibs = kernel32.lib $(optlibs) advapi32.lib user32.lib tcl83.lib
+winlibs = $(baselibs) gdi32.lib comdlg32.lib winspool.lib
+
+guilibs = $(libc) $(winlibs)
+conlibs = $(libc) $(baselibs)
+guilibsdll = $(libcdll) $(winlibs)
+conlibsdll = $(libcdll) $(baselibs)
+
+######################################################################
+# Project specific targets
+######################################################################
+
+release: setup dlls
+dlls: setup $(TDOMDLL)
+all: setup dlls $(CAT32)
+
+setup:
+# @$(MKDIR) $(TMPDIR)
+# @$(MKDIR) $(OUTDIR)
+
+$(TDOMLIB): $(TDOMDLL)
+
+$(TDOMDLL): $(TDOMOBJS) $(TMPDIR)\tdom.res
+ $(link32) $(ldebug) $(dlllflags) \
+ -out:$@ $(TMPDIR)\tdom.res $(guilibsdll) @<<
+$(TDOMOBJS)
+<<
+
+#
+# Implicit rules
+#
+
+{$(EXPAT_GENNM_DIR)}.c{$(TMPDIR)}.obj:
+ $(cc32) -DBUILD_tcl $(TCL_CFLAGS) -Fo$(TMPDIR)\ $<
+
+{$(EXPAT_PARSE_DIR)}.c{$(TMPDIR)}.obj:
+ $(cc32) -DBUILD_tcl $(TCL_CFLAGS) -Fo$(TMPDIR)\ $<
+
+{$(EXPAT_TOK_DIR)}.c{$(TMPDIR)}.obj:
+ $(cc32) -DBUILD_tcl $(TCL_CFLAGS) -Fo$(TMPDIR)\ $<
+
+{$(GENERICDIR)}.c{$(TMPDIR)}.obj:
+ $(cc32) -DBUILD_tcl $(TCL_CFLAGS) -Fo$(TMPDIR)\ $<
+
+{$(WINDIR)}.rc{$(TMPDIR)}.res:
+ $(rc32) -fo $@ -r -i $(GENERICDIR) -i $(WINDIR) -D__WIN32__ \
+ $(TCL_DEFINES) $<
+
+clean:
+ -@$(RM) $(OUTDIR)\*.exp
+ -@$(RM) $(OUTDIR)\*.lib
+ -@$(RM) $(OUTDIR)\*.dll
+ -@$(RM) $(OUTDIR)\*.exe
+ -@$(RM) $(OUTDIR)\*.pdb
+ -@$(RM) $(TMPDIR)\*.pch
+ -@$(RM) $(TMPDIR)\*.obj
+ -@$(RM) $(TMPDIR)\*.res
+ -@$(RM) $(TMPDIR)\*.exe
diff -u -r -N tDOM-0.4-orig/win/pkgIndex.tcl tDOM-0.4/win/pkgIndex.tcl
--- tDOM-0.4-orig/win/pkgIndex.tcl Thu Jan 01 01:00:00 1970
+++ tDOM-0.4/win/pkgIndex.tcl Tue Apr 25 23:07:03 2000
@@ -0,0 +1,5 @@
+# tDOM Tcl package index file
+
+package ifneeded tdom 0.4 \
+ "source [file join $dir tdom.tcl]; \
+ load [file join $dir tdom04.dll] tdom "
o Added the axes "following" and "preceding" in xpathEvalStep. Before
  that, every step starting with following:: or preceding:: selected
  the empty nodeset.
o Corrected the axis name "preceeding" to "preceding".
o The preceding axis selected nodes in reverse document order. The
  spec requires the selected nodes to be in document order (see
  2.2). Fixed to comply with the spec.
o Added a test in xpathNodeTest to prevent seg faults under some
  (crude) circumstances. For example, try //following::* But there is
  still another (probably the "main") problem. (For example, check the
  output of the expression above.)
o //* should be parsed as AXISNAME WCARDNAME, instead of AXISNAME
  MULTIPLY -- fixed.
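The document-order point for the preceding axis can be illustrated with a minimal, self-contained C sketch. This is not the domxpath code: the Node struct and the names next_in_document, is_ancestor and preceding_axis are assumptions made up for this example. The idea is to walk the tree in document order from the root, stop at the context node, and skip the context node's ancestors, so the result is already in document order and needs no reversal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal tree node; not tDOM's real node structure. */
typedef struct Node {
    const char  *name;
    struct Node *parent;
    struct Node *firstChild;
    struct Node *nextSibling;
} Node;

/* Returns 1 if 'maybe' is a proper ancestor of 'node'. */
static int is_ancestor(const Node *maybe, const Node *node)
{
    for (node = node->parent; node != NULL; node = node->parent) {
        if (node == maybe) return 1;
    }
    return 0;
}

/* Next node in document (pre-order) order, or NULL at the end. */
static Node *next_in_document(Node *n)
{
    if (n->firstChild != NULL) return n->firstChild;
    while (n != NULL) {
        if (n->nextSibling != NULL) return n->nextSibling;
        n = n->parent;
    }
    return NULL;
}

/*
 * Collects the preceding axis of 'context' into 'out' (at most 'cap'
 * entries) in document order: every node that starts before the
 * context node and is not one of its ancestors.
 */
static size_t preceding_axis(Node *root, Node *context, Node **out, size_t cap)
{
    size_t n = 0;
    Node *cur;

    assert(root != NULL && context != NULL);
    for (cur = root; cur != NULL && cur != context; cur = next_in_document(cur)) {
        if (!is_ancestor(cur, context) && n < cap) {
            out[n++] = cur;
        }
    }
    return n;
}
```

For a document shaped like root(a, b, c(cc), d) and the context node d, this sketch yields a, b, c, cc in that order; the root is left out because it is an ancestor of d.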
rolf
On 28 Apr, Steve Ball wrote:
> rolf@... wrote:
>> On 27 Apr, Steve Ball wrote:
>> > I haven't seen any messages come through on this list yet,
>> > so I thought I'd better kick things off ;-)
>>
>> Well, this is a small mailing list in its first days, without the
>> heavy load of tclxml.. ;-)
>
> You think the TclXML mailing list has a heavy load?
My comment was a little bit ironic. I thought that was obvious, sorry
for that - I shouldn't be kidding in a foreign language.
>> > Firstly, having read (earlier) versions of the tDOM source
>> > code I'm not convinced that tDOM is actually implementing
>> > *the* DOM. That is, it is *a* DOM, but not one that is
>> > compliant with the W3C DOM spec. My particular beef is with
>> > live node lists and named maps (attributes); tDOM appears
>> > to support only static lists (what is the return value
>> > of the childNodes attribute for a Node?).
>>
>> That's a nice detail.[...]
>
> That is my point. If tDOM does not implement the DOM spec
> then it should be renamed to avoid confusion.
Come on, Steve, this time you're kidding, aren't you? OK, your
observation is perfectly true: at the moment tDOM implements only
_almost_ all, not all, DOM APIs. That's why it's versioned 0.4, isn't
it?
But let us look a bit closer at this detail - calling it a "point"
sounds too big to my ears. The DOM specification says about the
Node interface attribute childNodes:
"The content of the returned NodeList is "live" in the
sense that, for instance, changes to the children of the
node object that it was created from are immediately
reflected in the nodes returned by the NodeList
accessors; it is not a static snapshot of the content of
the node."
Since the spec requires it, a fully compliant DOM implementation must
support this as described above - of course. But from a practical
point of view this DOM feature isn't a burning need at the start
of an implementation. In tDOM, simply save your node object of interest
and make your "childNodes" call(s) exactly at the moment you need
them. Really no practical problem after all.
But to give your soul rest and peace, you can get a more or less fully
compliant behavior within a few minutes. As noted in the tDOM
documentation: "If an unknown method name is given [to a tDOM node
obj], the command with the same name as the given method within the
namespace ::dom::domNode is tried to be executed." Let's type
    proc ::dom::domNode::DOMchildNodes {node} {
        return $node
    }
    proc ::dom::domNode::item {node index} {
        set childNodes [$node childNodes]
        return [lindex $childNodes $index]
    }
    proc ::dom::domNode::length {node} {
        set childNodes [$node childNodes]
        return [llength $childNodes]
    }
Source it in and the whole thing should be almost done. [domNode
DOMchildNodes] gives you back not a "static list" but a "NodeList"
object. This "NodeList" object understands the method item and the
attribute length, as required by the DOM spec.
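The difference between a static list and a "live" NodeList can also be sketched independently of tDOM. The following C fragment is an illustration only; Node, StaticList, LiveList and all function names are hypothetical and are not part of tDOM or of the DOM spec. A snapshot copies the child pointers once, while a live list keeps only a reference to the parent and looks at its children on every access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 8

/* Hypothetical minimal node: a name plus a fixed-capacity child array. */
typedef struct Node {
    const char  *name;
    struct Node *children[MAX_CHILDREN];
    size_t       nchildren;
} Node;

/* Static snapshot: holds a copy of the child pointers taken at creation. */
typedef struct {
    Node  *items[MAX_CHILDREN];
    size_t length;
} StaticList;

/* Live view: remembers only the parent and consults it on every access. */
typedef struct {
    Node *parent;
} LiveList;

static void append_child(Node *parent, Node *child)
{
    assert(parent->nchildren < MAX_CHILDREN);
    parent->children[parent->nchildren++] = child;
}

static StaticList snapshot_children(const Node *parent)
{
    StaticList list;
    list.length = parent->nchildren;
    memcpy(list.items, parent->children, sizeof list.items);
    return list;
}

static LiveList live_children(Node *parent)
{
    LiveList list;
    list.parent = parent;
    return list;
}

/* "length" attribute of the live list: always the current child count. */
static size_t live_length(const LiveList *list)
{
    return list->parent->nchildren;
}

/* "item" method of the live list: NULL when the index is out of range. */
static Node *live_item(const LiveList *list, size_t index)
{
    if (index >= list->parent->nchildren) return NULL;
    return list->parent->children[index];
}
```

If a child is appended after both lists were created, the snapshot still reports the old length, while live_length and live_item see the new child. That is the behavior the quoted passage of the spec asks for.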
And you're really serious about your request to rename tDOM because of
this?
>> [...]. tDOM is getting my job done _already_.
>
> That's fine. Alternative implementations with interesting
> characteristics are good. Competing implementations with incompatible
> APIs are bad. [...]
> Something like this would be good:
>
> # Choose only one of these:
> package require dom ;# generic case, loads best available pkg
> package require tcldom ;# Makes Tcl-only DOM default implementation
> package require xerces ;# Makes Xerces DOM the default
> package require tdom ;# Makes tDOM the default
>
> # Everything is the same from now on
> [etc.]
It's nice to see you separating good from bad. Are you able to respect
my decision not to invest energy in building some endless groundwork
again and again, and are you nevertheless willing to talk with me sometime
in the future, if you're right this time? There are so many other
things to do. Look, for example, at http://www.xml.com They have a
lively news section and a big, highly organized news archive, with
quite a number of categories, including of course java, and also perl
and python - but not tcl. There are so many other things to do instead of
the "Why don't you support my project" discussion.
These are of course my personal opinions. Other people may welcome your
initiative with arms wide open.
greetings
rolf
rolf@... wrote:
> On 27 Apr, Steve Ball wrote:
> > I haven't seen any messages come through on this list yet,
> > so I thought I'd better kick things off ;-)
>
> Well, this is a small mailing list in its first days, without the
> heavy load of tclxml.. ;-)
You think the TclXML mailing list has a heavy load?
I only wish I had that problem! I thought traffic was pretty
light and the list is quite capable of carrying the tDOM
traffic, at least for now. Never mind, I'm quite happy to
be subscribed to both.
> > Firstly, having read (earlier) versions of the tDOM source
> > code I'm not convinced that tDOM is actually implementing
> > *the* DOM. That is, it is *a* DOM, but not one that is
> > compliant with the W3C DOM spec. My particular beef is with
> > live node lists and named maps (attributes); tDOM appears
> > to support only static lists (what is the return value
> > of the childNodes attribute for a Node?).
>
> That's a nice detail. I have to confess that I hadn't noticed this
> until now, thanks for that. I needed some kind of DOM-like access to
> XML data and was happy to find this one. Of course, a more
> standards-conformant implementation is, well, more standards-conformant.
That is my point. If tDOM does not implement the DOM spec
then it should be renamed to avoid confusion. It's OK for the
toolkit to not be DOM compliant - plenty of people
are using exCoST for XML document processing, and are perfectly
happy for it to not be DOM compliant. But CoST doesn't *claim*
to be DOM compliant (it was around long before DOM came along).
> > Secondly, what capabilities or features can tDOM offer that
> > aren't available in Xerces-C? I'd rather piggy-back off the
> > Xerces effort, leveraging it/value-adding with a Tcl wrapper,
> > than try to maintain our own C/C++ DOM implementation.
> > I understand that Xerces wasn't around when tDOM started,
> > but what are the goals of tDOM at this point?
>
> I understand your emphasis on starting the Xerces integration. To
> have a good, complete, feature-rich parser, supported by its own
> community, and to maintain only the Tcl interface to it would be a
> Good Thing to have.
>
> But at the moment, I see only a promise, like some others in the history
> of Tcl XML solutions. An early version of your newest child didn't
> work for me; the current release probably does better.
Well, I'm only one person and I can only hack code so fast.
I've tried to demonstrate the possibilities, and hope that
others will help in fleshing out the implementation.
> There are of course reasons in favor of tDOM. tDOM includes an xpath
> implementation.
Yes, XPath is good. Now if it was built upon a standard TclDOM API
so that others could share it, that would be excellent. See below.
> tDOM is fast. In a mail on the tclxml mailing list
> these days you mentioned:
>
> "When I got the TclXerces DOM implementation first working I
> did a quick benchmark. I used a small script that creates
> 10000 nodes. TclXerces was about 25 times faster than TclDOM."
>
> Well, no big deal. As you and I know TclDOM is, well, sloooow (and
> needs a loooot of memory). I've also done a few tests. tDOM was _5
> times_ faster than Xerces. There are also some pragmatic reasons. I
> have a tDOM win32 dll. tDOM is getting my job done _already_.
That's fine. Alternative implementations with interesting
characteristics are good. Competing implementations with incompatible
APIs are bad. My recent work on TclXerces (DOM) adds some
tDOM-style constructs (like making tree nodes Tcl commands, OO style).
Why can't we merge the Tcl-level TclDOM and tDOM APIs so that
Tcl application developers can get on with the job and higher-level
packages can be written (like SOAP, XML-RPC, XSLT, etc)?
It would be great if Swish could use tDOM as its DOM package,
but the incompatible API prevents that. I find that really annoying.
I would be perfectly happy for TclDOM to develop a layered
structure as TclXML has - a generic Tcl API layer, with
implementation-specific layers underneath. With TclXML you can even
choose between them: parse one document with expat and another with
Xerces-C.
Something like this would be good:
    # Choose only one of these:
    package require dom     ;# generic case, loads best available pkg
    package require tcldom  ;# Makes Tcl-only DOM default implementation
    package require xerces  ;# Makes Xerces DOM the default
    package require tdom    ;# Makes tDOM the default

    # Everything is the same from now on
    set myDoc [dom::create]  ;# uses the default implementation
    $myDoc createElement MyElement
    ... etc ...

    # Create DOM trees using a particular package
    set altDoc [dom::create -implementation tcl]
    set altDoc2 [dom::create -implementation xerces]
Cheers,
Steve Ball
--
Steve Ball | Swish XML Editor | Training & Seminars
Zveno Pty Ltd | Web Tcl Complete | XML XSL
http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development
Steve.Ball@... +-----------------------+---------------------
Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099
Hi, Steve
Nice to see one of the rare publicly visible Tcl'n'XML guys on the list.
On 27 Apr, Steve Ball wrote:
> Hi All,
>
> I haven't seen any messages come through on this list yet,
> so I thought I'd better kick things off ;-)
Well, this is a small mailing list in its first days, without the
heavy load of tclxml.. ;-)
>
> Two things are bothering me about tDOM, so my first message
> is going to be controversial. I sincerely hope that Jochen,
> or others, will disprove my concerns.
Jochen is on vacation, as far as I know, so don't hold your breath for
his comments. My thoughts and experiences are very personal. They
probably don't matter, but you have asked...
> Firstly, having read (earlier) versions of the tDOM source
> code I'm not convinced that tDOM is actually implementing
> *the* DOM. That is, it is *a* DOM, but not one that is
> compliant with the W3C DOM spec. My particular beef is with
> live node lists and named maps (attributes); tDOM appears
> to support only static lists (what is the return value
> of the childNodes attribute for a Node?).
That's a nice detail. I have to confess that I hadn't noticed this
until now, thanks for that. I needed some kind of DOM-like access to
XML data and was happy to find this one. Of course, a more
standards-conformant implementation is, well, more standards-conformant.
> Secondly, what capabilities or features can tDOM offer that
> aren't available in Xerces-C? I'd rather piggy-back off the
> Xerces effort, leveraging it/value-adding with a Tcl wrapper,
> than try to maintain our own C/C++ DOM implementation.
> I understand that Xerces wasn't around when tDOM started,
> but what are the goals of tDOM at this point?
I understand your emphasis on starting the Xerces integration. To
have a good, complete, feature-rich parser, supported by its own
community, and to maintain only the Tcl interface to it would be a
Good Thing to have.
But at the moment, I see only a promise, like some others in the history
of Tcl XML solutions. An early version of your newest child didn't
work for me; the current release probably does better.
There are of course reasons in favor of tDOM. tDOM includes an xpath
implementation. tDOM is fast. In a mail on the tclxml mailing list
these days you mentioned:
"When I got the TclXerces DOM implementation first working I
did a quick benchmark. I used a small script that creates
10000 nodes. TclXerces was about 25 times faster than TclDOM."
Well, no big deal. As you and I know TclDOM is, well, sloooow (and
needs a loooot of memory). I've also done a few tests. tDOM was _5
times_ faster than Xerces. There are also some pragmatic reasons. I
have a tDOM win32 dll. tDOM is getting my job done _already_.
> Don't get me wrong: I don't think tDOM is a bad thing.
> As I said above, I will be the happiest person around if
> my concerns expressed above are shown to be invalid.
Grant freedom of thought, Sire.
rolf
Hi All,
I haven't seen any messages come through on this list yet,
so I thought I'd better kick things off ;-)
Two things are bothering me about tDOM, so my first message
is going to be controversial. I sincerely hope that Jochen,
or others, will disprove my concerns.
Firstly, having read (earlier) versions of the tDOM source
code I'm not convinced that tDOM is actually implementing
*the* DOM. That is, it is *a* DOM, but not one that is
compliant with the W3C DOM spec. My particular beef is with
live node lists and named maps (attributes); tDOM appears
to support only static lists (what is the return value
of the childNodes attribute for a Node?).
Secondly, what capabilities or features can tDOM offer that
aren't available in Xerces-C? I'd rather piggy-back off the
Xerces effort, leveraging it/value-adding with a Tcl wrapper,
than try to maintain our own C/C++ DOM implementation.
I understand that Xerces wasn't around when tDOM started,
but what are the goals of tDOM at this point?
Don't get me wrong: I don't think tDOM is a bad thing.
As I said above, I will be the happiest person around if
my concerns expressed above are shown to be invalid.
Cheers,
Steve Ball
--
Steve Ball | Swish XML Editor | Training & Seminars
Zveno Pty Ltd | Web Tcl Complete | XML XSL
http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development
Steve.Ball@... +-----------------------+---------------------
Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099
Just forwarding some interesting work from Zoran:
Hello Jochen!
Thanks for your reply.
Of course we can continue in German ...
I spent some time with tDOM over the weekend.
There is still a memory leak in "dom parse -simple".
What goes wrong are the PIs, and only the
first <?xml ...?>, or rather, all those declared outside
the documentElement. I haven't fixed it yet,
but I will do so this week. expat shows no leaks.
I checked everything with Purify 4.5 on Solaris 2.6.
MT: I examined tDOM for globals/statics.
There are some, but not many. Some I would move into TSD
(thread-specific data), others I would leave global and protect
with a mutex. Then we can also advertise it as "MT-safe" :)
I have some experience with AOLserver (http://www.aolserver.com)
and would adapt tDOM so that it can register itself as a module
with AOLserver. This takes more effort, because I have to be 100% sure
that the whole thing is free of memory leaks and that memory stays
clean when a connection thread terminates.
As I said in my last mail, I am going to use tDOM
as a tool for programmatic HTML generation.
Five years ago we wrote an in-house tool
that offers the functionality of Active Server Pages, PHP, etc.,
but we never made it public. A lot of applications
have been written with it. Experience has shown, however,
that mixing HTML and a programming language
is acceptable for simpler things, but with more complex ones
you completely lose the overview!
To be able to do this, I have to modify tDOM
so that, among other things, empty nodes are output differently:
<IMG src="some-url"/> -> <IMG src="some-url" />
This is also recommended by the XHTML 1.0 standard.
(see http://www.w3.org/TR/xhtml1/#guidelines)
This could be put under "domNode asHTML". Hm?
I invested a couple of hours and built a thin layer a la cgi.tcl
on top of tDOM. Have a look... It is really
elegant and simple... It is not yet a real HTML DOM,
but it is very usable as it is...
Enclosed as an attachment ...
As a suggestion: "domNode setAttribute key value" should
(for performance reasons) allow multiple key/value pairs:
"domNode setAttribute key1 value1 ?key2 value2 ...?"
This does not violate the DOM and makes things easier
when you want to set several attributes at once.
See the attached domhtml.tcl for an example...
OK. So, the summary of the things I can help with:
1. Changes to the Tcl overloading. The method is first
looked up under dom::domNode::<nodename>::<method>. If none is
found, then under dom::domNode::<method> (see my first mail...)
2. Alternative "domNode asHTML" output (see above)
3. MT adaptation, compatible with Tcl8.1+ and
with the necessary #ifdefs for < 8.1.
4. AOLserver 3.0 adaptation, so that tDOM can be loaded
as a module. I can guard this with a few
#ifdef NS_AOLSERVER ...
If I run into any problems in the meantime,
or something else occurs to me, I will get in touch with you.
Once again, many thanks for your excellent package!!!!
Talk to you soon,
Zoran Vasiljevic
QuarkSolutions GmbH
Munich, Germany.
P.S. domhtml.tcl is attached. This is NOT a real
DOM-HTML, but it is very usable. Have a look!
Just forwarding interesting work from Rolf (originally written in
German):
Hello Jochen,
> Why not use XPath? [...]
[Completely torn out of context, sorry. But it fits so nicely as an
opening.] I am currently having my difficulties with the tDOM
xpath. xpath is a wonderful thing; after all it is at present, if I am
up to date, the only "officially" standardized query language for XML
data. You still had to write your query language implementation
yourself, but you no longer had to rack your brains over the design of
the query language. I already get it as a gift, from you, and I need
it for XSLT anyway.
The only sticking point is that all xpath-aware tools work on
in-memory data structures, and none of them creates persistent data
structures that can be queried directly with xpath at any time (except
for a legendary tool from GMD, which I may yet get my hands
on). Around me the first 60 MByte XML files with production data for
everyday business are already arriving, and there is no upper limit in
sight; at the Fraunhofer institute, just across the street, they are
already talking about a terabyte-sized XML data pool, in a year at the
latest (and we are supposed to build it, hm).
While I wait for that, I pass the time by using tDOM to build a
"virtual" domain into Brent Welch's Tcl httpd that offers direct
access to XML files (without any further intermediate layer), a kind
of simple-minded XML viewer.
In doing so I ran into a problem.
This XML document

    <root>
      <a>a</a>
      <b>b</b>
      <c>
        <cc>cc</cc>
      </c>
      <d>d</d>
    </root>

(or probably any other XML document with more than one child directly
below the root element), when processed by this script

    set fd [open demo.xml]
    set xmlstring [read $fd]
    close $fd
    set xmldoc [dom parse $xmlstring]
    set root [$xmldoc documentElement]
    $root childNodes
    puts [domNode2 selectNodes following::*]

simply yields nothing for me, instead of all the tags after a. This
stylesheet

    <?xml version='1.0' encoding='ISO-8859-1'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    version="1.0">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:apply-templates select="/root/a"/>
      </xsl:template>
      <xsl:template match="/root/a">
        <xsl:for-each select="following::*">
          <xsl:value-of select="name()"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

for example gives me (at least with xt and saxon, and meanwhile also
with xalan, which refused at first) the expected "bcccd".
[One evening later] I have made progress in the meantime. The two axes
following and preceding are simply silently skipped in xpathEvalStep
and therefore always yield just an empty nodeset. (Doing this so
silently can well confuse the user; after all, "id()" for example at
least returns a "not yet implemented".)
[By the way: right at the start, when the tDOM world was new, round and
exciting for me, I already looked at the code. I found the tidy xpath
code in particular very elegant, also with its numerous defines. When
I then started to really dig around in the code, however, I also
quickly started cursing: I found it harder to follow the data flow,
had to look for type definitions in completely different places, and
first had to find my way around. By now I find the approach elegant
again; the learning curve for getting into the code is probably
steeper, but not particularly long, and afterwards everything goes
considerably more easily, also thanks to the DBG system.]
While digging around I then stumbled right away over a "trivial" bug
that is at the same time a small show-stopper. Throughout the domxpath
code you use "preceeding", and in xpath expressions too only this
spelling is expected. The xpath rec 1.0 of 16 November 1999, however,
lists preceding and preceding-sibling as AxisName. (And that is so
trivial to fix that I can spare myself the patch.)
As one tries things out when trying to understand, I also stumbled
over

    puts [domNode2 selectNodes //following::*]

(applied, for example, to the XML data above), which gives me a
segfault.
Now all of this simply faster and shorter :) And a few more small
things, also more or less trivial ones such as silent conversions
from double to int, int to pointer, etc. What is your status? You
wrote about a few bug fixes in xpath since the last release; please
bring me up to date on where you are. If you have already coded the
two missing axes: wonderful; otherwise I will probably sit down and do
it "quickly".
It took quite some persistent trial and error (my know-how regarding
the MS development tools on the win platform is simply limited), but I
now _have_ a tdom04.dll that can simply be "load"ed into a Scriptics
tclsh8.3 binary and apparently just works. Only a few changes to the
code were necessary: one strcasecmp changed to strcmp (I have not yet
looked for a Windows counterpart) and some defines that make compiling
and loading possible. I still want to clean up the Makefile a bit
(despite the unavoidable fat disclaimer in it: I understand nothing
about it), and I did not want to simply attach the .dll file to a mail
to a hotmail.com account; I just don't know at the moment whether they
tolerate attachments of 180 k. A short request is enough and I will
send you the thing (and, if life leaves me unshaken, not much later
the Makefile).
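Regarding the strcasecmp change: the Microsoft C runtime provides _stricmp as a case-insensitive string comparison, so a small portability shim could keep the case-insensitive behavior on Windows instead of falling back to strcmp. The sketch below is only an assumption about how this might be done; the macro name tdom_strcasecmp and the helper same_name_nocase are made up for this example and are not part of tDOM.

```c
#include <assert.h>
#include <string.h>

#ifdef _WIN32
/* The MS C runtime has no strcasecmp; _stricmp compares case-insensitively. */
#  define tdom_strcasecmp _stricmp      /* hypothetical macro name */
#else
#  include <strings.h>                  /* POSIX declares strcasecmp here */
#  define tdom_strcasecmp strcasecmp
#endif

/* Returns 1 when both strings are equal ignoring letter case, else 0. */
static int same_name_nocase(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return tdom_strcasecmp(a, b) == 0;
}
```

With such a shim the rest of the code keeps a single spelling of the call, and only the macro definition differs per platform.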
A few days ago XML::Parser 2.28 was released. The decisive new feature
is that Clark Cooper has patched the extensions for DTD event handlers
into expat. I took this extended expat and put it underneath tclxml
(well, 80% of the Cooper handlers are already available under
Tcl). Even though I do not consider the interface (and the
implementation) 100% successful in every respect (the
AttlistDeclHandler, for example, is called once per attribute when
several attributes are defined in one ATTLIST declaration), the idea
of sharing an extended expat with the Perl people has a certain
appeal: a larger user base, more developers, and the Tclers really do
have something to offer too with your C xpath code.
>> > Now my idea regarding DTD/Schema is to extend Expat further,
>> > so that the DTD information is kept in C data structures and
>> > can _officially_ be accessed _from outside_. DTD elements are
>> > already parsed completely internally. The data is just not kept
>> > anywhere. Scriptics has already done something in this area and I
>> > have started to extend it, in order to learn more about
>> > attributes.
>> > ...
>> > Help, even just in the form of
>> > design suggestions, is of course very welcome.
>> > However, in the last two or three weeks I have fixed a few more
>> > small XPath bugs. Perhaps this will lead to
>> > tDOM-0.4 (final) with DTD extensions.
(There it is, the sentence about the smaller xpath fixes.) Overall,
DTDs are on the decline (I hope, at least); in the future, metadata
about XML data will be kept as XML data (I hope). But that is still
some months away. You are right that the data structures must be
accessible from outside.
>>A long-term goal is validation; to start somewhere, I began to build
>>in DTD support. For this I extended TclExpatInfo with a flag
>>"validate" and the data structure dtdInfo (plus corresponding
>
> Hmm, so dtdInfo lives outside of Expat, in TclExpat!?
Yes, I realized that quickly and therefore got stuck early on; my
thoughts developed in a different direction, which I also described at
the time.
> AttlistDecl is the existing Expat/Scriptics AttlistDecl callback,
> isn't it? It only delivers attribute names but no detailed
> information such as enumerations? [..]
That is correct. Cooper's extensions at least deliver all ATTLIST
information.
>>What occupies me much more is the question of whether I can have a
>>Tcl XML parser command which I can tell, while it parses the XML data
>>fed to it, to also validate against a DTD, or, while parsing, to
>>build on the side a DOM tree that can afterwards be used, for
>>example, via the tDOM interface, or also to validate against an XML
>>Schema, and which nevertheless additionally runs my Tcl scripts on
>>the corresponding events. Vaguely something like the model by which
>>the apache httpd allows optionally loaded modules to register
>>handlers. For this, tclexpat would have to define all conceivable
>>handlers and install a registration mechanism which, on a
>>corresponding configuration request, allows installing additional
>>handlers for the basic events provided by expat (which are then, when
>>the corresponding event occurs, called one after another and fed with
>>the same data).
>
> Your considerations speak in favor of building the DTD data
> structures into Expat and also implementing the later validation code
> independently of TclExpat/tDOM. It is simply
> an additional service of the XML parser. Nevertheless,
> it should be possible to expose the schema info (DTD is a subset)
> to the outside through a DOM-like interface, to make it possible to
> write meta tools/XML-DTD compilers.
The idea of simply registering several handlers for one event has
still not quite let go of me. In principle: the single handler that
can be defined with expat for an event works through an array of
function pointers and calls each handler function with the data
delivered by expat. In code:
    /*
     *----------------------------------------------------------------------
     *
     * GexpatElementStartHandler --
     *
     *      Called by gexpat for each start tag.
     *
     * Results:
     *      None.
     *
     * Side Effects:
     *      Every function registered for this event is called with its
     *      own user data and the handler specific data.
     *
     *----------------------------------------------------------------------
     */
    static void
    GexpatElementStartHandler(userData, name, atts)
        void *userData;
        const char *name;
        const char **atts;
    {
        GexpatInfo *gexpat = (GexpatInfo *) userData;
        HandlerSet *handlerset;

        GexpatDispatchPCDATA(gexpat);
        if (gexpat->status == TCL_CONTINUE) {
            /*
             * We're currently skipping elements looking for the
             * close of the continued element.
             */
            gexpat->continueCount++;
            return;
        }
        handlerset = gexpat->nextHandlerSet;
        while (handlerset != NULL) {
            if (handlerset->startElementHandler != NULL) {
                /* Each handler is called with its own private data. */
                (*(handlerset->startElementHandler)) (handlerset->privateData,
                                                      name, atts);
            }
            handlerset = handlerset->nextHandlerSet;
        }
    }
mit einem ganz banalen
/*----------------------------------------------------------------------
 * Every event "attribute" has the complete set of event handler
 * functions (but need not use every handler slot; unused slots are
 * perfectly OK without additional effort).
 *---------------------------------------------------------------------*/
typedef struct handlerSet
{
    void *privateData;
    struct handlerSet *nextHandlerSet;
    XML_StartElementHandler startElementHandler;
    XML_EndElementHandler endElementHandler;
    XML_StartNamespaceDeclHandler startNamespaceDeclHandler;
    XML_EndNamespaceDeclHandler endNamespaceDeclHandler;
    XML_CharacterDataHandler characterDataHandler;
    XML_ProcessingInstructionHandler processingInstructionHandler;
    XML_DefaultHandler defaultHandler;
    XML_UnparsedEntityDeclHandler unparsedEntityDeclHandler;
    XML_NotationDeclHandler notationDeclHandler;
    XML_ExternalEntityRefHandler externalEntityRefHandler;
    XML_UnknownEncodingHandler unknownEncodingHandler;
    XML_CommentHandler commentHandler;
    XML_NotStandaloneHandler notStandaloneHandler;
    XML_StartCdataSectionHandler startCdataSectionHandler;
    XML_ElementDeclHandler elementDeclHandler;
    XML_AttlistDeclHandler attlistDeclHandler;
    XML_StartDoctypeDeclHandler startDoctypeDeclHandler;
    XML_EndDoctypeDeclHandler endDoctypeDeclHandler;
} HandlerSet;
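A minimal, self-contained sketch of how such handler sets could be
registered, removed and dispatched. It is only an illustration under
assumptions: ParserInfo, firstHandlerSet, AddHandlerSet,
RemoveHandlerSet, DispatchStartElement and CountStartTags are
hypothetical names, the HandlerSet is reduced to a single slot, and
StartHandler is a simplified stand-in for expat's
XML_StartElementHandler so that the example compiles without expat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for expat's XML_StartElementHandler. */
typedef void (*StartHandler)(void *userData, const char *name,
                             const char **atts);

/* Hypothetical, reduced HandlerSet: one event slot only. */
typedef struct HandlerSet {
    void *privateData;
    struct HandlerSet *nextHandlerSet;
    StartHandler startElementHandler;
} HandlerSet;

/* Hypothetical parser state holding the head of the list. */
typedef struct {
    HandlerSet *firstHandlerSet;
} ParserInfo;

/* Append a new handler set; returns it, or NULL if malloc fails. */
HandlerSet *
AddHandlerSet(ParserInfo *info, StartHandler start, void *privateData)
{
    HandlerSet **link;
    HandlerSet *set;

    assert(info != NULL);
    set = malloc(sizeof(*set));
    if (set == NULL) {
        return NULL;
    }
    set->privateData = privateData;
    set->nextHandlerSet = NULL;
    set->startElementHandler = start;

    /* Find the terminating NULL link and hook the new set in there. */
    link = &info->firstHandlerSet;
    while (*link != NULL) {
        link = &(*link)->nextHandlerSet;
    }
    *link = set;
    return set;
}

/* Unlink and free a handler set; returns 1 if found, 0 otherwise. */
int
RemoveHandlerSet(ParserInfo *info, HandlerSet *set)
{
    HandlerSet **link;

    assert(info != NULL);
    for (link = &info->firstHandlerSet; *link != NULL;
         link = &(*link)->nextHandlerSet) {
        if (*link == set) {
            *link = set->nextHandlerSet;
            free(set);
            return 1;
        }
    }
    return 0;
}

/* Call every registered start handler with its own private data. */
void
DispatchStartElement(ParserInfo *info, const char *name, const char **atts)
{
    HandlerSet *set;

    assert(info != NULL);
    for (set = info->firstHandlerSet; set != NULL;
         set = set->nextHandlerSet) {
        if (set->startElementHandler != NULL) {
            set->startElementHandler(set->privateData, name, atts);
        }
    }
}

/* Example handler: counts start tags in the int passed as private data. */
void
CountStartTags(void *userData, const char *name, const char **atts)
{
    (void) name;
    (void) atts;
    ++*(int *) userData;
}
```

Walking a pointer to the link rather than the node keeps insertion and
removal free of special cases for the head of the list. Unused slots
stay NULL and are skipped during dispatch.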
I already pushed the not inconsiderable editing work fairly far weeks
ago; the actual code (registering handlers, deleting them, etc.) is
still missing. A main problem for me is finding a sensible interface
for this, in principle, single additional Tcl object [expat parser].
With your tDOM in this model, for example, the handlers registered by
tDOM would not so much produce an effect visible to the user (they
work, so to speak, silently in the background). Parser objects would
simply get a side effect that can be switched on and off: after they
have parsed XML data, there is a DOM tree of that data. So, depending
on properties that can be switched on and off, the parser method
"parse" returns a varying number of values of varying significance.
But somehow this architecture is already rather Tower of Babel-like.
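The switchable side effect can be sketched roughly as follows. This is
a toy under loudly stated assumptions: DomBuilder, ParseResult,
DomStartElement and ParseNames are hypothetical names, a node counter
stands in for a real DOM tree, and a list of element names stands in
for the events expat would deliver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private data of a DOM-building handler set. The node
 * counter merely stands in for a real DOM tree. */
typedef struct {
    int enabled;        /* the switch: 1 = build a tree, 0 = stay idle */
    int nodeCount;
} DomBuilder;

/* Hypothetical result of the parser method "parse". */
typedef struct {
    int status;         /* 0 = ok */
    int hasTree;        /* 1 only if the DOM side effect was enabled */
    int nodeCount;
} ParseResult;

/* Start-tag handler of the DOM handler set; works silently. */
void
DomStartElement(void *privateData, const char *name)
{
    DomBuilder *dom = (DomBuilder *) privateData;

    (void) name;
    if (dom->enabled) {
        dom->nodeCount++;
    }
}

/* Stand-in for parsing: one start-tag event per element name. */
ParseResult
ParseNames(DomBuilder *dom, const char **names, size_t count)
{
    ParseResult result = { 0, 0, 0 };
    size_t i;

    assert(dom != NULL);
    dom->nodeCount = 0;
    for (i = 0; i < count; i++) {
        DomStartElement(dom, names[i]);
    }
    if (dom->enabled) {
        result.hasTree = 1;
        result.nodeCount = dom->nodeCount;
    }
    return result;
}
```

The same call thus yields a result with or without a tree, depending
only on the switch, which is exactly the interface question raised
above.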
Are you still alive after so many lines?
Regards
rolf