aalto-xml-interest · Aalto XML Parser (stax)

Group Information

  • Members: 17
  • Category: XML
  • Founded: Feb 2, 2008
  • Language: English
Messages

#27 From: "lfs_neves" <lfs_neves@...>
Date: Mon Feb 2, 2009 4:56 pm
Subject: WS testing (was Re: Re: Interesting Aalto reference, linux+jibx+aalto...)
lfs_neves
 
--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
<tsaloranta@...> wrote:

> Given above, it might be quite easy to implement json-based web
> service, where data binding is done using Jackson instead of JAXB. I
> would expect such a service to be still faster than Jibx
>
> What do you think?

It sounds cool, I was thinking something similar but using Google
Protocol Buffers, I might as well test json.
It would make a nice comparison.

I will try to make something soon.

Regards.

--
Luis Neves

#28 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Feb 2, 2009 5:53 pm
Subject: Re: WS testing (was Re: Re: Interesting Aalto reference, linux+jibx+aalto...)
cowtowncoder
 
On Mon, Feb 2, 2009 at 8:56 AM, lfs_neves <lfs_neves@...> wrote:
> --- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
>
> <tsaloranta@...> wrote:
>
>> Given above, it might be quite easy to implement json-based web
>> service, where data binding is done using Jackson instead of JAXB. I
>> would expect such a service to be still faster than Jibx
>>
>> What do you think?
>
> It sounds cool, I was thinking something similar but using Google
> Protocol Buffers, I might as well test json.
> It would make a nice comparison.

Cool! I have tested PB earlier, and it's quite easy. The main
challenge was that it's a bit of apples & oranges, given how tightly
coupled PB is. Messages are not self-contained (without a schema you
have little idea what the data is about, since integer codes are used for
message types), and you can't really bind data to other objects;
AFAIK you must use the objects PB generates.
That's ok as long as the test framework doesn't have problems with it --
in my case it was a bit problematic, but I was able to try it out by
refactoring code. Or you can wrap PB objects with beans, although
that's akin to writing a data binding lib of your own. :-)

But it would be very interesting to see how different formats & libs compare!
So let me know how things work.

-+ Tatu +-

#29 From: Tatu Saloranta <tsaloranta@...>
Date: Thu Feb 5, 2009 5:58 am
Subject: Version 0.9.4 released: now Aalto is a COMPLETE stax 1.0 implementation!
cowtowncoder
 
After finishing the namespace-repairing mode for stream writers,
implementing coalescing mode, and ensuring that both pass 100% with
existing staxtest and stax2test unit test suites, it is time for one
of the last pre-1.0 releases.

At this point, the main thing that remains to be wrapped up is the
non-blocking parser, which is mostly functional but lacks the following:

(a) API extension/alternative over Stax, since Stax does not cover
non-blocking cases
   * constructing non-blocking parsers
   * feeding content (can't use an input stream or reader, since they are blocking)
(b) Implementation of bootstrapping (auto-detection of encoding,
parsing of the xml declaration).

and of course some documentation regarding the non-blocking API.

But for blocking use cases (all existing Stax, stax2 use cases), Aalto
is getting rather ready for production use!

-+ Tatu +-

#30 From: "lfs_neves" <lfs_neves@...>
Date: Sat Feb 7, 2009 11:07 am
Subject: Re: Version 0.9.4 released: now Aalto is a COMPLETE stax 1.0 implementation!
lfs_neves
 
--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
<tsaloranta@...> wrote:
>
> After finishing the namespace-repairing mode for stream writers,
> implementing coalescing mode, and ensuring that both pass 100% with
> existing staxtest and stax2test unit test suites, it is time for one
> of last pre-1.0 releases.

I'm getting a 404:
/hatchery/aalto/0.9.4/aalto-gpl-0.9.4.jar was not found on this server.

--
Luis Neves

#31 From: Tatu Saloranta <tsaloranta@...>
Date: Sat Feb 7, 2009 5:21 pm
Subject: Re: Re: Version 0.9.4 released: now Aalto is a COMPLETE stax 1.0 implementation!
cowtowncoder
 
My bad -- jars were copied one directory too high. Should work now,

-+ Tatu +-

On Sat, Feb 7, 2009 at 3:07 AM, lfs_neves <lfs_neves@...> wrote:
> --- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
>
> <tsaloranta@...> wrote:
>>
>> After finishing the namespace-repairing mode for stream writers,
>> implementing coalescing mode, and ensuring that both pass 100% with
>> existing staxtest and stax2test unit test suites, it is time for one
>> of last pre-1.0 releases.
>
> I'm getting a 404:
> /hatchery/aalto/0.9.4/aalto-gpl-0.9.4.jar was not found on this server.
>
> --
> Luis Neves
>
>

#32 From: Tatu Saloranta <tsaloranta@...>
Date: Sat Feb 7, 2009 5:25 pm
Subject: Quick note: Aalto rocks Xalan, Saxon (via SAX)
cowtowncoder
 
Quick note: I am doing performance testing, to see how fast Aalto
works when used as a replacement SAX parser for Xalan and Saxon.
This is easy to do: Aalto has JAXP factory and parser implementations
under "org.codehaus.wool.sax"; just construct a SAXParserFactoryImpl,
and go from there.
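
A minimal sketch of the JAXP/SAX usage described above. It uses the JDK's default `SAXParserFactory.newInstance()` so that it runs without extra jars; constructing Aalto's `SAXParserFactoryImpl` in its place is the assumed substitution and is not verified here. The class name `SaxElementCounter` and the sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxElementCounter {
    /** Parses the document with a SAX parser from the given factory and counts start tags. */
    static int countElements(SAXParserFactory factory, String xml) {
        final int[] count = {0};
        try {
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        count[0]++;
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // JDK default factory. Assumption: with Aalto on the classpath, one would
        // construct its SAXParserFactoryImpl here instead and pass it in unchanged.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println(countElements(factory, "<root><a/><b><c/></b></root>"));
    }
}
```

Because the handler code only sees the standard SAX interfaces, swapping the factory should not require any other change.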

Initial results are very encouraging: compared to Xerces (2.9.1),
Woodstox is a bit faster, and Aalto is faster than Woodstox by a similar margin.
Obviously there's more overhead in xslt processing than in just xml
parsing, but I think a 30-40% performance boost from a simple jar change
sounds pretty good.

I hope to publish these (and other) results in the near future, but
thought I'd give a quick preview at this point; I think the results
themselves are sound, I just need to polish the presentation.
Plus, it should be easy to reproduce my findings too.

-+ Tatu +-

#33 From: "lfs_neves" <lfs_neves@...>
Date: Sun Feb 8, 2009 6:51 pm
Subject: WS testing (was Re: Re: Interesting Aalto reference, linux+jibx+aalto...)
lfs_neves
 
--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
<tsaloranta@...> wrote:
>
> Hi Luis! One idea occurred to me today: I don't know how easy it would
> be to do, but looking at wstest home page, it might be doable,
> depending on how tightly coupled it is with soap and/or xml.

...

> Given above, it might be quite easy to implement json-based web
> service, where data binding is done using Jackson instead of JAXB. I
> would expect such a service to be still faster than Jibx

I've just posted my test results with JSON as an alternative
serialization mechanism using the Jackson Processor... yes it is fast:

http://technotes.blogs.sapo.pt/1708.html


I had a small issue in the process of porting the tests to JSON:
Jackson serialized 0.0f as "0.0" but was unable to deserialize it
back; it errors out with the message:
"java.lang.Float from String value '0.0': overflow/underflow, value
can not be represented as a 32-bit float"

Other than that it was painless. You've written another great parser!

Regards

--
Luis Neves

#34 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Feb 9, 2009 4:47 am
Subject: Re: WS testing (was Re: Re: Interesting Aalto reference, linux+jibx+aalto...)
cowtowncoder
 
On Sun, Feb 8, 2009 at 10:51 AM, lfs_neves <lfs_neves@...> wrote:
> --- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta
> <tsaloranta@...> wrote:
>>
>> Hi Luis! One idea occurred to me today: I don't know how easy it would
>> be to do, but looking at wstest home page, it might be doable,
>> depending on how tightly coupled it is with soap and/or xml.
>
> ...
>
>> Given above, it might be quite easy to implement json-based web
>> service, where data binding is done using Jackson instead of JAXB. I
>> would expect such a service to be still faster than Jibx
>
> I've just posted my test results with JSON as an alternative
> serialization mechanism using the Jackson Processor... yes it is fast:
>
> http://technotes.blogs.sapo.pt/1708.html

Great!

> I've had a small issue in the process of porting the tests to JSON,
> Jackson serialized 0.0f as "0.0" but was unable to deserialize it
> back, it errors out with the message:
> "java.lang.Float from String value '0.0': overflow/underflow, value
> can not be represented as a 32-bit float"

Ah, thanks, that sounds like a bug -- I think it may be due to my
misunderstanding some of the constants in the Float/Double classes (MIN_VALUE
is the smallest positive value, an epsilon, and not the negative number
with the highest absolute value). I'll definitely need to fix this.
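
The pitfall mentioned above can be illustrated with a small sketch. This is a guess at the kind of range check that would reject "0.0", not Jackson's actual code; the class and method names are hypothetical.

```java
class FloatRangeCheck {
    /** Buggy check: treats Float.MIN_VALUE as if it were the most negative float. */
    static boolean inRangeBuggy(double d) {
        return d >= Float.MIN_VALUE && d <= Float.MAX_VALUE;
    }

    /** Corrected check: the most negative finite float is -Float.MAX_VALUE. */
    static boolean inRangeFixed(double d) {
        return d >= -Float.MAX_VALUE && d <= Float.MAX_VALUE;
    }

    public static void main(String[] args) {
        double parsed = Double.parseDouble("0.0");
        // Float.MIN_VALUE is the smallest positive nonzero float, so it is > 0.
        System.out.println(Float.MIN_VALUE > 0);
        // Zero (and every negative number) fails the buggy check.
        System.out.println(inRangeBuggy(parsed));
        System.out.println(inRangeFixed(parsed));
    }
}
```

With the buggy variant, zero and all negative values are reported as out of range, which would match the "overflow/underflow" message quoted above.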

> Other than that it was painless. You've written another great parser!

Thank you.

-+ Tatu +-

#35 From: Tatu Saloranta <tsaloranta@...>
Date: Wed Mar 11, 2009 6:22 am
Subject: Aalto performance compared to Thrift, Protocol Buffers; using someone else's tests
cowtowncoder
 
Ok, here's some more interesting benchmark data. Let's start with a
measurement done by someone not associated with the Aalto project:

http://www.eishay.com/2008/11/protobuf-with-option-optimize-for-speed.html

Doesn't look too good for Stax? Well, I thought I'd figure out what's
going on. It turns out that:

(a) The Stax implementation used is the reference implementation (yuck)
(b) For each single serialization/deserialization, a new
XMLInput/OutputFactory is created via factory.newInstance(). OUCH!

Fixing these obvious flaws, starting by switching to Woodstox, improves
reading speed by 8x and writing by 10x. That brings the stax-based
solution to about 40% of the binary formats' speed for reading, and almost
100% of their speed for writing (binary formats tend to be relatively
faster to read than to write).
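
The fix for flaw (b) amounts to creating the factory once and reusing it for every document. A minimal sketch using only the standard `javax.xml.stream` API follows; the class name `ReusedFactory` is made up, and whichever Stax implementation is on the classpath (the JDK's by default) gets picked up.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ReusedFactory {
    // Created once: the implementation lookup done by newInstance() is the
    // costly part, so it should not be repeated per document.
    private static final XMLInputFactory INPUT_FACTORY = XMLInputFactory.newInstance();

    /** Counts start elements, creating only a cheap per-document reader. */
    static int countStartElements(String xml) {
        try {
            XMLStreamReader r = INPUT_FACTORY.createXMLStreamReader(new StringReader(xml));
            int n = 0;
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT) {
                    n++;
                }
            }
            r.close();
            return n;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int total = 0;
        for (int i = 0; i < 1000; i++) {
            total += countStartElements("<m><a/><b/></m>");
        }
        System.out.println(total);
    }
}
```

This sketch is single-threaded; whether a shared factory may be used from several threads depends on the implementation and should be checked against its documentation.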

But plug in Aalto and results (numbers are milliseconds) are:
---
using Aalto as Stax impl:

warming up...
Starting
  ,Object create, Serialization, Deserialization, Serialized Size
thrift, 1304.13260, 23069.41900, 24145.53400, 314
protobuf, 2081.43830, 26319.83200, 15060.01900, 217
java, 973.97880, 75996.27200, 260578.72200, 845
scala, 655.33490, 118616.22600, 548926.90300, 1473
stax, 1003.50770, 17027.86700, 27728.39200, 406

---

And it turns out that for this (real world, I think) use case, Aalto

(a) is a bit faster at writing data than either Thrift or Protocol
Buffers (17 ms vs 23 ms vs 26 ms)
(b) is a bit slower at reading data (27 vs 24 vs 15)
(c) -> end-to-end, all 3 are about as fast (44 vs 47 vs 41)

(and this despite the fact that message size ratios are 400:300:200)
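
The end-to-end figures in (c) can be rechecked from the table. The sketch below sums the serialization and deserialization columns and scales by 1000, following the message's own reading of 17027.867 as "17"; the class name is made up. Summing before rounding gives roughly 44.8, 47.2 and 41.4, so the ordering matches the rounded values quoted above.

```java
class EndToEndTotals {
    /** Sums the two table columns and scales by 1000, as the message does. */
    static double total(double serialize, double deserialize) {
        return (serialize + deserialize) / 1000.0;
    }

    public static void main(String[] args) {
        System.out.printf("stax/Aalto: %.1f%n", total(17027.867, 27728.392));
        System.out.printf("thrift:     %.1f%n", total(23069.419, 24145.534));
        System.out.printf("protobuf:   %.1f%n", total(26319.832, 15060.019));
    }
}
```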

So, it appears that Aalto is pretty efficient at what it does. I mean,
Protocol Buffers is supposed to be, what, 10 - 100x faster than xml.
So Aalto must be 10x - 100x faster than the format it deals with. Not too
shabby!

-+ Tatu +-

#36 From: Tatu Saloranta <tsaloranta@...>
Date: Sat Mar 21, 2009 6:00 am
Subject: Minor release, 0.9.5, renamed packages -> com.fasterxml.aalto
cowtowncoder
 
Quick note: I released 0.9.5, which has only one externally visible
change (in addition to some internal cleanup):
all code is now under package "com.fasterxml.aalto" (instead of older
'org.codehaus.wool', which was a leftover).
This does not necessarily require any changes to app code (since
the services file uses the new factory class names), unless the application
refers to the implementation classes directly, or uses dependency
injection to define the implementation.
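
The difference between the two ways of obtaining a factory can be sketched as follows. The class name `com.fasterxml.aalto.stax.InputFactoryImpl` is an assumption about where the Stax input factory lives under the new package; the helper falls back to the standard services lookup when that class is not on the classpath, so the sketch runs with the JDK alone.

```java
import javax.xml.stream.XMLInputFactory;

class FactorySelection {
    // Assumed class name for Aalto's Stax input factory under the new package.
    static final String AALTO_FACTORY = "com.fasterxml.aalto.stax.InputFactoryImpl";

    /** Direct reference by class name; falls back to the standard lookup if absent. */
    static XMLInputFactory create(String className) {
        try {
            return (XMLInputFactory) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Implementation jar not available: use whatever the services
            // lookup finds, which is unaffected by a package rename.
            return XMLInputFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        System.out.println(create(AALTO_FACTORY).getClass().getName());
    }
}
```

Code that hard-codes the old class name is the case that breaks on a rename; code that only calls `XMLInputFactory.newInstance()` is not affected.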

In the (hopefully) near future, I will move the Aalto download pages to reside
under http://fasterxml.com as well, but for now they are still
available from cowtowncoder.com.

-+ Tatu +-

ps. I will try to get this:
  http://www.eishay.com/2009/03/more-on-benchmarking-java-serialization.html
  updated to also include Aalto (Woodstox and Jackson are included now)
since it should showcase Aalto performance. Esp. if I would also add
Fast Infoset...

#37 From: Tatu Saloranta <tsaloranta@...>
Date: Tue Mar 31, 2009 6:51 pm
Subject: Performance comparison, external
cowtowncoder
 
(cc:ing to woodstox-users/dev, since it is related to Woodstox too...
as well as Aalto)

Apologies for posting this again, but I think that it's good to check out:

http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

since there have been some changes, more things compared and so on.

So while it's not 100% clear which is the fastest choice (Protocol
Buffers, Thrift or JSON with Jackson), it's safe to say that
performance differences between these candidates are not huge. And
that's sort of amazing, considering how many things one has to give up
when using non-self-descriptive binary formats, without getting
order-of-magnitude faster performance. :)

Of course all the usual disclaimers apply: benchmarks are always
unfair, not relevant to all use cases and so on.
But at least here the code is open source, the methodology is simple and
repeatable, and the implementations chosen are best-of-breed for their data
formats.

-+ Tatu +-

#38 From: "plantfern" <fern@...>
Date: Tue Oct 13, 2009 11:30 pm
Subject: Status of Project??
plantfern
 
Hi, I believe there is some interest in having an XML parser that supports NIO (
http://stackoverflow.com/questions/1045544/stax-parsing-from-java-nio-channel ).
And it looks like Aalto might be the only option at the moment, but it looks to
have stalled.

What is the current status of the project?  Any interest by developers?

#39 From: Tatu Saloranta <tsaloranta@...>
Date: Wed Oct 14, 2009 12:34 am
Subject: Re: Status of Project??
cowtowncoder
 
Interesting -- hadn't seen that entry.
Current status is "waiting for interested users". :-)

Meaning that the current functionality (core Stax 1.0 implementation, plus most of the Stax2 extensions) is stable, usable and complete.
But the next steps to take would be big (DTD support, hooking up the Stax2 validation API), except for the fairly simple things needed to complete the async API.

So... if there is interest in the NIO part, we would be interested in working with others in this area.
One immediate thing to work on is just the API (how to feed content into the parser, nothing fancy, and how to check whether the current state is an acceptable end state for parsing); the second is hooking up the rest to do the feeding.
Finally, there is one missing piece wrt parsing: handling of the xml prolog. That is not a huge undertaking, but needs to be completed for real use (for now, one just has to strip out the xml declaration to test async functionality).

Put another way: the project is not dead, I have just been busy with other projects (mostly the Jackson json parser).
Also, while adoption has been limited, there is at least one product now shipping with Aalto, so it might be time to start "selling" Aalto a bit more.

-+ Tatu +-

On Tue, Oct 13, 2009 at 4:30 PM, plantfern <fern@...> wrote:
>
> Hi, I believe there is some interest in having an XML parser that supports NIO ( http://stackoverflow.com/questions/1045544/stax-parsing-from-java-nio-channel ). And it looks like Aalto might be the only option at the moment, but it looks to have stalled.
>
> What is the current status of the project? Any interest by developers?



#40 From: Tatu Saloranta <tsaloranta@...>
Date: Thu Oct 15, 2009 5:43 am
Subject: Anyone interested in helping with non-blocking/async parsing use cases?
cowtowncoder
 
Hi there! It has been a while since there's been significant progress
with Aalto -- mostly it's just because of other competing things going
on, but part of it has been due to:

(a) The core blocking (traditional) parser being feature complete, up to
complete Stax 1.0 compliance, as well as a Stax2 Typed Access API
implementation (there's still DTD handling and the Stax2 Validation
API to add, but those are bigger undertakings)
(b) Apparent lack of interest for non-blocking parsing

But during past week I have had multiple contacts from developers who
would be interested in finding a non-blocking XML parser. Since Aalto
is almost there, I would be interested in completing minor missing
pieces.
To do that, what I could really use is a simple use case in which to
plug in such a component: ideally, a library, app or framework that is
accessing data using NIO (directly or via something like Netty),
something I could actually test with Aalto in non-blocking mode.
While I could write a toy test app, that does not seem right -- it's
better to handle a real use case.

So... anyone with something I could use? Or willing to
collaborate on getting something like this done?

-+ Tatu +-

#41 From: "plantfern" <fern@...>
Date: Thu Oct 15, 2009 2:00 pm
Subject: Re: Anyone interested in helping with non-blocking/async parsing use cases?
plantfern
 
It looks like the main motivator that I find is XMPP processing, since it is
based on an XML stream.  It's like an endless document, with one root element (
<stream:stream> ), and lots of elements underneath that root element that carry
the communications.

So no DTD/Schema validation is required.  Because of the limitless nature, we
also need a way to emit DOM DocumentFragment(s) for each element, but not create
one huge document.
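
A rough sketch of the "one unit per child of the root" idea, using plain (blocking) Stax from the JDK. It only tracks element depth to find where each direct child of the root ends; building a DOM DocumentFragment at that point, and doing all of this in a non-blocking way, is left out. The class name, the sample stanzas and the use of a complete, closed document are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StanzaBoundaries {
    /** Returns the local names of the root's direct children, in completion order. */
    static List<String> completedChildren(String xml) {
        List<String> done = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    // Depth 2 means a direct child of the root just closed:
                    // this is where one fragment could be handed off.
                    if (depth == 2) {
                        done.add(r.getLocalName());
                    }
                    depth--;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return done;
    }

    public static void main(String[] args) {
        String xml = "<stream:stream xmlns:stream='http://etherx.jabber.org/streams'"
                + " xmlns='jabber:client'>"
                + "<message><body>hi</body></message><presence/><iq type='get'/>"
                + "</stream:stream>";
        System.out.println(completedChildren(xml));
    }
}
```

Since nothing is retained after a child closes, memory use stays bounded by the size of a single child element rather than the whole stream.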

Currently the XMPP server I would like to enhance is Vysper, based on the NIO
framework Mina.  They do some basic home-brewed XML parsing; moving to a
standard parser might be beneficial, though speed would be important too.

http://mina.apache.org/vysper/

What do you think?  I was just going to create an enhanced version of a SAX
parser for Mina.



--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta <tsaloranta@...>
wrote:
>
> Hi there! It has been a while since there's been significant progress
> with Aalto -- mostly it's just because of other competing things going
> on, but part of it has been due to:
>
> (a) Core blocking (traditional) parser being feature complete, up to
> complete Stax 1.0 compliancy, as well as Stax2 Typed Access API
> implementation (there's still DTD handling to add, Stax2 Validation
> API, but those are bigger undertakings)
> (b) Apparent lack of interest for non-blocking parsing
>
> But during past week I have had multiple contacts from developers who
> would be interested in finding a non-blocking XML parser. Since Aalto
> is almost there, I would be interested in completing minor missing
> pieces.
> To do that, what I really could use is a simple use case where to
> plug-in such a component: ideally, a library, app or framework that is
> accessing data using NIO (directly or via something like Netty). To
> have something I could actually test with Aalto in non-blocking mode.
> While I could write a toy test app that does not seem right -- it's
> better to handle a real use case.
>
> So... anyone with something I could use? Or willing to take to
> collaborate on getting something like this done?
>
> -+ Tatu +-
>

#42 From: Tatu Saloranta <tsaloranta@...>
Date: Fri Oct 16, 2009 12:25 am
Subject: Re: Re: Anyone interested in helping with non-blocking/async parsing use cases?
cowtowncoder
 
On Thu, Oct 15, 2009 at 7:00 AM, plantfern <fern@...> wrote:
>
>
>
> It looks like the main motivator that I find is XMPP processing. Since this is
> based on an XML stream. It's like an endless document, one root element (
> <stream:stream> ), and lots of elements underneath that root element that carry
> the communications.

Makes sense as far as use cases go.

> So no DTD/Schema validation is required. Because of the limitless nature, we
> also need a way to emit DOM DocumentFragment(s) for each element, but not create
> one huge document.

Ok.

> Currently the XMPP server I would like to enhance is Vysper based on the NIO
> framework Mina. They do some basic home-brewed XML parsing, but moving to a
> standard one might be beneficial, but speed would be of importance too.

Yeah -- and Aalto is very heavily optimized for speed; much of parser
code is shared between blocking and non-blocking parts (and rest was
branched fairly recently).

> http://mina.apache.org/vysper/
>
> What do you think? I was just going to create an enhanced version of a SAX
> parser for Mina.

Let me have a look, sounds interesting so far. Writing XML parsers is
not a trivial task, although doing it for a specific use case of course
helps. But to get the non-blocking part right, it gets quite tricky to
handle anything from character entities to decoding UTF-8 multi-byte
characters. Aalto does implement SAX too, as well as Stax; writing a
SAX parser is slightly easier than Stax, but fundamentally needing to
have "block at any given byte" ability is the trickiest thing.
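
One of the difficulties named above, a UTF-8 multi-byte character split across two network chunks, can be shown in isolation. The sketch below uses the JDK's `CharsetDecoder` and keeps the undecoded tail of each chunk for the next call. It is only an illustration of the problem, not how Aalto decodes input; the class name is made up.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class ChunkedUtf8 {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    // Bytes of an incomplete trailing sequence left over from the previous chunk.
    private ByteBuffer pending = ByteBuffer.allocate(0);

    /** Decodes what it can from the new chunk; an incomplete tail is kept for later. */
    String feed(byte[] chunk) {
        ByteBuffer in = ByteBuffer.allocate(pending.remaining() + chunk.length);
        in.put(pending);
        in.put(chunk);
        in.flip();
        // UTF-8 never produces more chars than it has input bytes.
        CharBuffer out = CharBuffer.allocate(in.remaining());
        // endOfInput = false: an incomplete sequence is left unread, not reported.
        CoderResult result = decoder.decode(in, out, false);
        if (result.isError()) {
            throw new IllegalArgumentException("malformed UTF-8 input");
        }
        pending = ByteBuffer.allocate(in.remaining());
        pending.put(in);
        pending.flip();
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        ChunkedUtf8 d = new ChunkedUtf8();
        // U+00E9 is encoded as the two bytes 0xC3 0xA9, split here across chunks.
        System.out.println(d.feed(new byte[] {'a', (byte) 0xC3}));
        System.out.println(d.feed(new byte[] {(byte) 0xA9, 'b'}));
    }
}
```

A non-blocking XML parser has to do the equivalent of this for every construct, not just character decoding, which is where the extra complexity comes from.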

-+ Tatu +-

#43 From: "plantfern" <fern@...>
Date: Fri Oct 16, 2009 12:37 am
Subject: Re: Anyone interested in helping with non-blocking/async parsing use cases?
plantfern
 
The only sticking point that the people at Mina-Vysper mailing list brought up
is the licensing.  For Mina-Vysper to be able to use Aalto, the license needs to
be Apache compatible.. which I guess GPL and/or commercial licenses are not :(

What do you think of that?

--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta <tsaloranta@...>
wrote:
>
> On Thu, Oct 15, 2009 at 7:00 AM, plantfern <fern@...> wrote:
> >
> >
> >
> > It looks like the main motivator that I find is XMPP processing. Since this
> > is based on an XML stream. It's like an endless document, one root element (
> > <stream:stream> ), and lots of elements underneath that root element that carry
> > the communications.
>
> Makes sense as far as use cases go.
>
> > So no DTD/Schema validation is required. Because of the limitless nature, we
> > also need a way to emit DOM DocumentFragment(s) for each element, but not create
> > one huge document.
>
> Ok.
>
> > Currently the XMPP server I would like to enhance is Vysper based on the NIO
> > framework Mina. They do some basic home-brewed XML parsing, but moving to a
> > standard one might be beneficial, but speed would be of importance too.
>
> Yeah -- and Aalto is very heavily optimized for speed; much of parser
> code is shared between blocking and non-blocking parts (and rest was
> branched fairly recently).
>
> > http://mina.apache.org/vysper/
> >
> > What do you think? I was just going to create an enhanced version of a SAX
> > parser for Mina.
>
> Let me have a look, sounds interesting so far. Writing XML parsers is
> not trivial task, although doing it for specific use case of course
> helps. But to get non-blocking part right, it get quite tricky to
> handle anything from characters entities to decoding UTF-8 multi-byte
> characters. Aalto does implement SAX too, as well as Stax; writing a
> SAX parser is slightly easier than Stax, but fundamentally needing to
> have "block at any given byte" ability is the trickiest thing.
>
> -+ Tatu +-
>

#44 From: Fernando Padilla <fern@...>
Date: Fri Oct 16, 2009 12:37 am
Subject: Re: Re: Anyone interested in helping with non-blocking/async parsing use cases?
plantfern
 
test



#45 From: Tatu Saloranta <tsaloranta@...>
Date: Fri Oct 16, 2009 6:43 am
Subject: Re: Re: Anyone interested in helping with non-blocking/async parsing use cases?
cowtowncoder
 
On Thu, Oct 15, 2009 at 5:37 PM, plantfern <fern@...> wrote:
>
> The only sticking point that the people at Mina-Vysper mailing list brought up
> is the licensing. For Mina-Vysper to be able to use Aalto, the license needs to
> be Apache compatible.. which I guess GPL and/or commercial licenses are not :(
>
> What do you think of that?

That could be problematic, yes, knowing the state of affairs between
GPL and Apache camps. :-)

I'll have to think a bit about that: GPL happens to be a reasonable
match with commercial licensing (to divide usage into free and
non-free cases), but it has its downsides too. So it may be time to
revisit the licensing question.

-+ Tatu +-

#46 From: "plantfern" <fern@...>
Date: Fri Oct 16, 2009 6:19 pm
Subject: Re: Anyone interested in helping with non-blocking/async parsing use cases?
plantfern
 
Well, if there is serious discussion about changing the license you can try
joining the Mina mailing list, this might be a pretty good thing they might get
excited about. :)


--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta <tsaloranta@...>
wrote:
>
> On Thu, Oct 15, 2009 at 5:37 PM, plantfern <fern@...> wrote:
> >
> > The only sticking point that the people at Mina-Vysper mailing list brought
> > up is the licensing. For Mina-Vysper to be able to use Aalto, the license needs
> > to be Apache compatible.. which I guess GPL and/or commercial licenses are not
> > :(
> >
> > What do you think of that?
>
> That could be problematic, yes, knowing the state of affairs between
> GPL and Apache camps. :-)
>
> I'll have to think a bit about that: GPL happens to be reasonable
> match with commercial licensing (to divide usage into free and
> non-free cases), but it has its downside too. So it may be time to
> revise licensing question.
>
> -+ Tatu +-
>

#47 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Oct 19, 2009 6:09 pm
Subject: Re: Re: Anyone interested in helping with non-blocking/async parsing use cases?
cowtowncoder
 
On Fri, Oct 16, 2009 at 11:19 AM, plantfern <fern@...> wrote:
>
> Well, if there is serious discussion about changing the license you can try
> joining the Mina mailing list, this might be a
> pretty good thing they might get excited about. :)

That's a bit of a chicken-and-egg problem. :)
(and impedance between selling a solution vs. having people with a
problem find you)

But I could definitely join the list. How is Vysper related to Mina
(or is it)? I do remember Mina; it has been around for a while.

-+ Tatu +-

#48 From: "pnehrers2" <pnehrer@...>
Date: Sat Oct 24, 2009 2:51 am
Subject: Re: Anyone interested in helping with non-blocking/async parsing use cases?
pnehrers2
 
I hereby declare my interest in an asynchronous xml parser solution :-)

I'm looking for a way to efficiently process xml input in a Netty-based server.
Right now, I have to spin another thread and let it block on reading an input
stream while feeding it with bytes every time I get a new buffer-ful from the
socket... quite ugly.
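
The workaround described above (a separate thread that blocks on an input stream while another thread feeds it bytes) can be sketched with a JDK pipe. This is only a guess at the general shape of such code, with made-up names; the chunks stand in for buffers arriving from a socket.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class PipedParserWorkaround {
    /** Feeds chunks through a pipe to a blocking parser thread; returns the element count. */
    static int parseChunks(String... chunks) {
        AtomicInteger elements = new AtomicInteger();
        try {
            PipedOutputStream feed = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(feed);
            Thread parserThread = new Thread(() -> {
                try {
                    // Blocks inside next() whenever the pipe is empty.
                    XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
                    while (r.hasNext()) {
                        if (r.next() == XMLStreamConstants.START_ELEMENT) {
                            elements.incrementAndGet();
                        }
                    }
                } catch (Exception e) {
                    elements.set(-1); // signal a parse failure to the caller
                }
            });
            parserThread.start();
            for (String chunk : chunks) {
                feed.write(chunk.getBytes(StandardCharsets.UTF_8));
                feed.flush();
            }
            feed.close(); // end of input: lets the parser see EOF
            parserThread.join();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return elements.get();
    }

    public static void main(String[] args) {
        System.out.println(parseChunks("<a><b", "/><c>", "</c></a>"));
    }
}
```

The cost is one parked thread per connection, which is exactly what a non-blocking parser would avoid.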

The problem with most parsers I investigated is that they are not written in a
way that would allow me to interrupt parsing (and essentially go back to the
last well-defined state) when no more bytes are available. This is basically how
Netty does packet defragmenting -- if you don't have enough bytes to move to the
next state, you stay in the current state and retry when you get more bytes.
There's probably a more efficient way -- like keeping a set of possible states
you can be in at any given byte, but that sounds more complicated.

If I had a StAX parser that would fail next() when it can't complete parsing
with currently available bytes and then let me retry the same next() again
later, I'd be all set :-)

Anyway, I'm not sure how I could help, but I'm interested (also, an
Apache-compatible license wouldn't hurt ;-))

--Peter

#49 From: Tatu Saloranta <tsaloranta@...>
Date: Sat Oct 24, 2009 3:28 am
Subject: Re: Re: Anyone interested in helping with non-blocking/async parsing use cases?
cowtowncoder
 
On Fri, Oct 23, 2009 at 7:51 PM, pnehrers2 <pnehrer@...> wrote:
>
> I hereby declare my interest in an asynchronous xml parser solution :-)

Great! It seems that there are a couple of other developers seriously
interested, so I think I better get back on track with development. :)

> I'm looking for a way to efficiently process xml input in a Netty-based
> server. Right now, I have to spin another thread and let it block on reading an
> input stream while feeding it with bytes every time I get a new buffer-ful from
> the socket... quite ugly.

Yup.

> The problem with most parsers I investigated is that they are not written in a
> way that would allow me to interrupt parsing (and essentially go back to the
> last well-defined state) when no more bytes are available.
> This is basically how Netty does packet defragmenting -- if you don't have
> enough bytes to move to the next state, you stay in the current state and retry
> when you get more bytes. There's probably a more
> efficient way -- like keeping a set of possible states you can be in at any
> given byte, but that sounds more complicated.

Yes indeed. Aalto's non-blocking mode is implemented to allow exactly
this: I have tested it by passing exactly one byte at a time,
ensuring it doesn't get confused. Implementing such parsing is quite a bit
harder, but once you are done, it's rather neat; and speed for regular
bigger chunks is not much worse than with regular blocking IO (why
potentially slower? because of the extra bookkeeping needed to retain state
and to be able to bail out at any point).

> If I had a StAX parser that would fail next() when it can't complete parsing
> with currently available bytes and then let me retry the same next() again
> later, I'd be all set :-)

The way I am thinking of doing this would be to return something like
XMLStreamConstants.NOT_YET_AVAILABLE, so that should work.
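
To make the idea concrete, here is a toy scanner whose `next()` returns a "not yet available" marker instead of blocking, and can simply be called again after more input has been fed. It is not Aalto's API or implementation: the class, the `Event` enum and the `feed`/`next`/`token` methods are all hypothetical, it works on chars rather than bytes, and it uses the simple "stay in the same state and retry" approach instead of a byte-level state machine. It ignores attributes containing '>', comments, CDATA and entities.

```java
class IncrementalTagScanner {
    enum Event { START_TAG, END_TAG, TEXT, NOT_YET_AVAILABLE }

    private final StringBuilder buffer = new StringBuilder();
    private String lastToken = "";

    /** Appends newly arrived input; never blocks. */
    void feed(String chunk) {
        buffer.append(chunk);
    }

    /** Returns the next complete token's type, or NOT_YET_AVAILABLE without consuming input. */
    Event next() {
        if (buffer.length() == 0) {
            return Event.NOT_YET_AVAILABLE;
        }
        boolean tag = buffer.charAt(0) == '<';
        // A tag is complete at '>', a text run is complete when the next '<' shows up.
        int end = tag ? buffer.indexOf(">") : buffer.indexOf("<");
        if (end < 0) {
            return Event.NOT_YET_AVAILABLE; // incomplete: stay in the same state
        }
        int stop = tag ? end + 1 : end;
        lastToken = buffer.substring(0, stop);
        buffer.delete(0, stop);
        if (!tag) {
            return Event.TEXT;
        }
        return lastToken.startsWith("</") ? Event.END_TAG : Event.START_TAG;
    }

    /** Raw text of the token returned by the last successful next(). */
    String token() {
        return lastToken;
    }

    public static void main(String[] args) {
        IncrementalTagScanner s = new IncrementalTagScanner();
        s.feed("<a>hel");
        System.out.println(s.next() + " " + s.token());
        System.out.println(s.next());
        s.feed("lo</");
        System.out.println(s.next() + " " + s.token());
        System.out.println(s.next());
        s.feed("a>");
        System.out.println(s.next() + " " + s.token());
    }
}
```

The caller's loop then becomes "call `next()` until it reports that nothing is available, return to the event loop, feed the next buffer, repeat", with no thread parked on a read.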

In addition, Aalto will implement the SAX interface as well, building on the Stax core.

> Anyway, I'm not sure how I could help, but I'm interested (also, an
> Apache-compatible license wouldn't hurt ;-))

If and when we get things going, the license issue will be resolved.
Initially licensing should only matter for anyone who has to
distribute Aalto artifacts -- GPL does not bind end users. But we
realize that there are concerns regarding GPL.
There is just the question of how to make it possible for FasterXML to
offer a compelling business case for companies to use a commercial
license.

As to helping, what I could really use are just app/server
skeletons in which to plug the parser. I can prototype the API, publish it,
and let others play with it. I am just not good at writing "toy apps" --
right now I can work with the blocking version, so I generally do that.
But with a 'real' use case (something someone else has written :) ) it
is easier for me to get started on cleaning up the non-blocking part.

Does this make sense?

-+ Tatu +-

#50 From: Tatu Saloranta <tsaloranta@...>
Date: Sun Oct 25, 2009 5:35 pm
Subject: Aalto now used in Sedna XQJ API
cowtowncoder
 
Just thought I'd share one Aalto adoption with others: as per
[http://www.cfoster.net/news/sedna-xqj-beta1.xml], Sedna XQJ API
(XQuery processing lib) now uses Aalto as the fast xml parser to speed
up operation.
This should help Aalto project with getting feedback on production
use, experiences and so on.

Also: if anyone else is using or will be using Aalto for production
deployments, please let us know -- or if there's something blocking
you from doing that (as in, you'd be interested but can't for some
reason we could help with), let us know. You can contact developers
either via this list, or indirectly via "info@..." (which
includes developers but won't be cc:ed to the list).

Thanks!

-+ Tatu +-

#51 From: "martin.grotzke" <martin.grotzke@...>
Date: Mon Dec 7, 2009 8:24 am
Subject: How to disable element name verification
martin.grotzke
 
Hi,

I'm trying to write a java object binding based on stax/aalto, this is aimed for
a serialization strategy for the memcached-session-manager
(http://code.google.com/p/memcached-session-manager/).

For serialization I use reflection, some properties are serialized as
attributes, others (e.g. complex attributes) are serialized as elements.

Right now I have an issue with element names containing a "$": for those I get
this error:

javax.xml.stream.XMLStreamException: Invalid name character '$' (code 36)) in name ("this$0"), index #4
    at com.fasterxml.aalto.out.XmlWriter.throwOutputError(XmlWriter.java:470)
    at com.fasterxml.aalto.out.XmlWriter.reportNwfName(XmlWriter.java:381)
    at com.fasterxml.aalto.out.ByteXmlWriter.verifyNameComponent(ByteXmlWriter.java:259)
    at com.fasterxml.aalto.out.ByteXmlWriter.constructName(ByteXmlWriter.java:179)
    at com.fasterxml.aalto.out.WNameTable.findSymbol(WNameTable.java:323)
    at com.fasterxml.aalto.out.StreamWriterBase.writeStartElement(StreamWriterBase.java:676)
    at de.javakaffee.web.msm.serializer.aalto.XMLBinding$OutputElement.add(XMLBinding.java:269)

Is there a way to disable this name verification, without breaking anything?
AFAICS I have no need for this check, as I'm the only one (respectively the
deserialization part of the code) who reads the created xml.

Thx && cheers,
Martin

#52 From: "martin.grotzke" <martin.grotzke@...>
Date: Mon Dec 7, 2009 8:11 pm
Subject: Where are the sources?
martin.grotzke
 
Hi,

I just wanted to see if I might add an option to disable element name
verification by myself, unfortunately I couldn't find a link to the sources of
aalto anywhere.

Am I missing anything?

Thx && cheers,
Martin

#53 From: "Martin" <martin.grotzke@...>
Date: Mon Dec 7, 2009 8:30 pm
Subject: Re: Where are the sources?
martin.grotzke
 
I just found it in the fasterxml.com wiki pages:

http://wiki.fasterxml.com/AaltoDownload

Cheers,
Martin


--- In aalto-xml-interest@yahoogroups.com, "martin.grotzke" <martin.grotzke@...>
wrote:
>
> Hi,
>
> I just wanted to see if I might add an option to disable element name
> verification by myself, unfortunately I couldn't find a link to the sources of
> aalto anywhere.
>
> Am I missing anything?
>
> Thx && cheers,
> Martin
>

#54 From: Tatu Saloranta <tsaloranta@...>
Date: Tue Dec 8, 2009 4:12 am
Subject: Re: How to disable element name verification
cowtowncoder
 
On Mon, Dec 7, 2009 at 12:24 AM, martin.grotzke <martin.grotzke@...> wrote:
> Hi,
>
> I'm trying to write a java object binding based on stax/aalto, this is
> aimed for a serialization strategy for the memcached-session-manager
> (http://code.google.com/p/memcached-session-manager/).
>
> For serialization I use reflection, some properties are serialized as
> attributes, others (e.g. complex attributes) are serialized as elements.
>
> Right now I have an issue with element names containing a "$": for those I
> get this error:
>
> javax.xml.stream.XMLStreamException: Invalid name character '$' (code 36))
> in name ("this$0"), index #4

I would have to check the XML specification to be 100% certain, but I think $ is not a valid XML name character. So the resulting output would not be well-formed (unparsable with XML parsers).
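
For reference, the XML 1.0 specification does exclude '$' (code 36) from both the NameStartChar and NameChar productions. A rough standalone check could look like the sketch below; the class and method names are made up for illustration and this is not Aalto's verification code. The ranges are those of XML 1.0 Fifth Edition.

```java
/** XML 1.0 (Fifth Edition) name checks. Class and method names are illustrative. */
class XmlNameCheck {
    /** NameStartChar production. */
    static boolean isNameStartChar(int c) {
        return c == ':' || c == '_'
            || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    /** NameChar production: NameStartChar plus a few extra characters. */
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.'
            || (c >= '0' && c <= '9') || c == 0xB7
            || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    /**
     * Returns the char index of the first invalid character, or -1 if the
     * name is valid. An empty name is reported as invalid at index 0.
     */
    static int firstInvalidIndex(String name) {
        if (name.isEmpty()) {
            return 0;
        }
        for (int i = 0; i < name.length(); ) {
            int c = name.codePointAt(i);
            boolean ok = (i == 0) ? isNameStartChar(c) : isNameChar(c);
            if (!ok) {
                return i;
            }
            i += Character.charCount(c);
        }
        return -1;
    }
}
```

XmlNameCheck.firstInvalidIndex("this$0") returns 4, which matches the "index #4" in the exception message, while "this_0" passes. Note that ':' is allowed by the Name production even though namespace-aware writers treat it as the prefix separator.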
 

> Is there a way to disable this name verification, without breaking anything? AFAICS I have no need for this check, as I'm the only one (respectively the deserialization part of the code) who reads the created xml.

It would probably be quite easy to add a feature to disable checks on writing. With Aalto I took approach of only implementing things that are requested, so amount of configurability is limited at this point.

So, I could easily add this for writers. But the follow-up question is probably whether you'd also want to parse such names with Aalto?
If so, it's slightly trickier; mostly because of symbol table handling (if such names are accepted, they'd go in symbol table, meaning that symbol table can not be shared with parsers that do not accept invalid names). It can be implemented, just needs to be carefully planned.
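
An alternative that needs no change to either the writer checks or the parser's symbol table (not something proposed in this thread, just one possible workaround) would be to escape such names in the binding layer before they reach the writer, and unescape them after reading. A minimal sketch follows, assuming a "_xHHHH_" escape form and a deliberately conservative rule where only ASCII letters and non-leading digits pass through unescaped; the class name and scheme are illustrative.

```java
/** Reversible escaping of Java field names into valid XML names. Illustrative only. */
class NameEscaper {
    /** Escapes every char except ASCII letters and non-leading digits as _xHHHH_. */
    static String encode(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (i > 0 && c >= '0' && c <= '9');
            if (plain) {
                sb.append(c);
            } else {
                // '_' itself is escaped too, so every '_' in the output starts an escape
                sb.append(String.format("_x%04X_", (int) c));
            }
        }
        return sb.toString();
    }

    /** Reverses encode(); assumes the input was produced by encode(). */
    static String decode(String encoded) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c == '_') {
                // 7-char escape: '_', 'x', four hex digits, '_'
                sb.append((char) Integer.parseInt(encoded.substring(i + 2, i + 6), 16));
                i += 7;
            } else {
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }
}
```

With this scheme NameEscaper.encode("this$0") gives "this_x0024_0", which any conforming XML parser accepts, and decode() restores "this$0". The cost is a slightly longer document and an extra pass over each name.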

-+ Tatu +-
 

#55 From: Tatu Saloranta <tsaloranta@...>
Date: Tue Dec 8, 2009 4:23 am
Subject: Re: Re: Where are the sources?
cowtowncoder
 
On Mon, Dec 7, 2009 at 12:30 PM, Martin <martin.grotzke@...> wrote:
 

> I just found it in the fasterxml.com wiki pages:
>
> http://wiki.fasterxml.com/AaltoDownload


Yes, I was just about to send the link; glad you found it. I am hoping to add a bit more content at fasterxml.com, and of course to finally get started with the finalize-async-api task.

As to the feature to disable checks: I would suggest adding it as an on/off feature (validate xml content or such; similar to what Woodstox does to control validation of textual content character validity).
That'd go in WriterConfig (F_VALIDATE_XML_NAME_CHARACTERS or such).
And in the implementation, probably just add a check where the exception would be thrown.
Let me know if you need more help; it shouldn't be too hard to find.
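
Roughly, the shape of such an on/off feature could be something like the sketch below. This is a standalone illustration, not Aalto's actual WriterConfig or writer code: the two classes are made up, only the flag name comes from the suggestion above, the name check is simplified to ASCII, and it throws IllegalArgumentException where the real writer reports an XMLStreamException.

```java
/** Hypothetical stand-in for a writer config holding on/off features as bit flags. */
class SketchWriterConfig {
    // Flag name taken from the suggestion above; the bit value is arbitrary.
    static final int F_VALIDATE_XML_NAME_CHARACTERS = 0x0001;

    private int flags = F_VALIDATE_XML_NAME_CHARACTERS; // validation on by default

    void setFeature(int flag, boolean state) {
        flags = state ? (flags | flag) : (flags & ~flag);
    }

    boolean hasFeature(int flag) {
        return (flags & flag) != 0;
    }
}

/** Hypothetical writer that consults the flag right where the error would be raised. */
class SketchNameWriter {
    private final SketchWriterConfig config;
    private final StringBuilder out = new StringBuilder();

    SketchNameWriter(SketchWriterConfig config) {
        this.config = config;
    }

    void writeStartElement(String name) {
        if (config.hasFeature(SketchWriterConfig.F_VALIDATE_XML_NAME_CHARACTERS)) {
            verifyName(name);
        }
        out.append('<').append(name).append('>');
    }

    /** Simplified ASCII-only name check; a real check covers the full XML ranges. */
    private static void verifyName(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || c == '_' || c == ':'
                    || (i > 0 && ((c >= '0' && c <= '9') || c == '-' || c == '.'));
            if (!ok) {
                throw new IllegalArgumentException("Invalid name character '" + c
                        + "' (code " + (int) c + ") in name (\"" + name + "\"), index #" + i);
            }
        }
    }

    String getOutput() {
        return out.toString();
    }
}
```

With the default config, writeStartElement("this$0") throws; after setFeature(F_VALIDATE_XML_NAME_CHARACTERS, false) the same call writes "<this$0>". As noted elsewhere in the thread, output produced this way is not well-formed XML, so the reading side needs a matching change.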

-+ Tatu +-
 

#56 From: "Martin" <martin.grotzke@...>
Date: Tue Dec 8, 2009 8:02 am
Subject: Re: How to disable element name verification
martin.grotzke
 
I already tried to just disable the name verification, unfortunately on
deserialization it fails because of the symbol table you're mentioning.

What I'd need is just support for the "$" (AFAICS ATM :)).

The source in this part is not as intuitive as I'd expect, lots of really low
level stuff - probably that's what makes for such good performance :)

Do you have a pointer on how to introduce the $ as a valid char also during
reading?

Thx && cheers,
Martin


--- In aalto-xml-interest@yahoogroups.com, Tatu Saloranta <tsaloranta@...>
wrote:
>
> On Mon, Dec 7, 2009 at 12:24 AM, martin.grotzke <
> martin.grotzke@...> wrote:
>
> >
> >
> > Hi,
> >
> > I'm trying to write a java object binding based on stax/aalto, this is
> > aimed for a serialization strategy for the memcached-session-manager (
> > http://code.google.com/p/memcached-session-manager/).
> >
> > For serialization I use reflection, some properties are serialized as
> > attributes, others (e.g. complex attributes) are serialized as elements.
> >
> > Right now I have an issue with element names containing a "$": for those I
> > get this error:
> >
> > javax.xml.stream.XMLStreamException: Invalid name character '$' (code 36))
> > in name ("this$0"), index #4
> >
> > I would have to verify xml  specification to be 100% certain, but I think $
> is not a valid xml name character. So resulting output would be
> non-well-formed (unparsable with xml parsers).
>
>
> > Is there a way to disable this name verification, without breaking
> > anything? AFAICS I have no need for this check, as I'm the only one
> > (respectively the deserialization part of the code) who reads the created
> > xml.
> >
> > It would probably be quite easy to add a feature to disable checks on
> writing. With Aalto I took approach of only implementing things that are
> requested, so amount of configurability is limited at this point.
>
> So, I could easily add this for writers. But the follow-up question is
> probably whether you'd also want to parse such names with Aalto?
> If so, it's slightly trickier; mostly because of symbol table handling (if
> such names are accepted, they'd go in symbol table, meaning that symbol
> table can not be shared with parsers that do not accept invalid names). It
> can be implemented, just needs to be carefully planned.
>
> -+ Tatu +-
>
