json · JSON JavaScript Object Notation

Group Information

  • Members: 590
  • Category: Data Formats
  • Founded: Jul 19, 2005
  • Language: English

Messages 1560 - 1589 of 1953

#1560 From: "Douglas Crockford" <douglas@...>
Date: Fri Dec 24, 2010 7:20 pm
Subject: Re: package org.json on Github
douglascrock...
 
--- In json@yahoogroups.com, John Cowan <cowan@...> wrote:
>
> Douglas Crockford scripsit:
> > The reference implementation for Java is now available from Github.
> >
> > https://github.com/douglascrockford/JSON-java
>
> I'm having some trouble deciphering your XML class.  Can you write
> down in English the rules it uses to convert JSON to XML and vice versa?


What specifically are you having trouble with?

#1561 From: "Tony" <anthonyrpelosi@...>
Date: Thu Jan 6, 2011 9:22 pm
Subject: WYSIWYG tool for generating JSON and XML?
anthonyrpelosi
 
Does anyone know of a good WYSIWYG tool for creating a hierarchy of
objects/elements and fields/attributes that exports to both XML and JSON? I
downloaded XMLSpy (PC only) but my VMWare is on the fritz right now so I haven't
tested that out yet.

#1562 From: Patrick Maupin <pmaupin@...>
Date: Thu Jan 6, 2011 9:48 pm
Subject: Re: WYSIWYG tool for generating JSON and XML?
patmaupin
 
Not really a WYSIWYG tool, but I have a different textual representation of
JSON called RSON.  It's extremely easy to work with in a text editor.

The manual is here:
http://code.google.com/p/rson/downloads/detail?name=rson_0_06_manual.pdf

An example file using this is here:
http://code.google.com/p/rst2pdf/source/browse/trunk/rst2pdf/styles/default.style?r=2417

It is extremely easy to parse this and then output JSON -- basically a
2-line Python program once the libraries are installed.

There is also an RSON superset -> XML example here:
http://code.google.com/p/rson/source/browse/trunk/py2x/tools/testxml.py?r=103


Regards,
Pat

#1563 From: "Douglas Crockford" <douglas@...>
Date: Mon Jan 24, 2011 6:48 pm
Subject: Java JSON
douglascrock...
 
The JSON.org page currently lists 21 packages for Java. Will the Java community
ever converge on one as some other language communities are doing?

#1564 From: John Cowan <cowan@...>
Date: Mon Jan 24, 2011 7:05 pm
Subject: Re: Java JSON
johnwcowan
 
Douglas Crockford scripsit:

> The JSON.org page currently lists 21 packages for Java. Will the Java
> community ever converge on one as some other language communities
> are doing?

I would say that that could happen in only a few ways:

1) If someone shepherded a particular library through the JCP, a clunky and
bureaucratic process whose future is much doubted

2) If you as JSON BDFL pushed one of them.

Also, google-gson is not just a plain JSON library, and jsonix is actually
a JavaScript, not a Java library.  There are probably other such errors.

--
John Cowan   cowan@...  http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed.  Still other languages are simultaneously
overdescribed and underdescribed.  Welsh pertains to the third category.
         --Alan King

#1565 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Jan 24, 2011 9:18 pm
Subject: Re: Java JSON
cowtowncoder
 
On Mon, Jan 24, 2011 at 10:48 AM, Douglas Crockford
<douglas@...> wrote:
> The JSON.org page currently lists 21 packages for Java. Will the Java
> community ever converge on one as some other language communities are doing?

Probably not to the degree that there'd be just one. From the list, I
think maybe half a dozen are widely used, for whatever that's worth.

I think there is some actual convergence, especially in cases where
new projects choose from existing libraries. In the JAX-RS (Jersey,
RESTEasy) space, for example, the number of libraries that are supported
out of the box is quite low, and adoption tends to follow a similar
trajectory: start with a library that emulates XML processing (Jettison)
or with the reference implementation, then move on to libraries that do
full data binding, like Jackson or json-lib.

I think authors of frameworks have more time and interest in doing due
diligence to figure out the best components to use, which then narrows
the candidates to those that offer the best set of features and support.
And over time, users of frameworks seem to gravitate towards those
libraries even when developing stand-alone systems.

One thing that irritates me is not so much the number of alternatives but
the fact that most new candidates make bold claims yet seem to offer
very little that is better, or even different (in a positive sense), from
existing choices.

-+ Tatu +-

#1566 From: Dennis Gearon <gearond@...>
Date: Mon Jan 24, 2011 10:10 pm
Subject: Re: Java JSON
gearond...
 
If there is anything that bugs ME, it is incomplete documentation and 'just
read the code' type of libraries.

  Dennis Gearon

#1567 From: John Cowan <cowan@...>
Date: Mon Jan 24, 2011 10:12 pm
Subject: Re: Java JSON
johnwcowan
 
Dennis Gearon scripsit:

> If there is anything that bugs ME, it is incomplete documentation and 'just
> read the code' type of libraries.

+1000

--
One Word to write them all,             John Cowan <cowan@...>
   One Access to find them,              http://www.ccil.org/~cowan
One Excel to count them all,
   And thus to Windows bind them.                --Mike Champion

#1568 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Jan 24, 2011 11:39 pm
Subject: Re: Java JSON
cowtowncoder
 
On Mon, Jan 24, 2011 at 2:10 PM, Dennis Gearon <gearond@...> wrote:
> If there is anything that bugs ME, it is incomplete documentation and 'just
> read the code' type of libraries.

True. That also contributes to the feeling of "but how is this different".
Maybe there are strengths, but if finding those requires reading
sources, yeah, that's a lot to ask.

And it's also easier to accumulate documentation when projects grow;
having a small number of leading libs helps the snowball effect.

-+ Tatu +-

#1569 From: Dennis Gearon <gearond@...>
Date: Mon Jan 24, 2011 11:59 pm
Subject: Re: Java JSON
gearond...
 
I'm working on documenting my own API (JSON, JSON-RPCish), and it's a lot of
work. I just keep reminding myself how I'd like to read it.



  Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.

#1570 From: "codeWarrior" <gpatnude@...>
Date: Wed Jan 26, 2011 3:13 pm
Subject: Re: Java JSON
gregpatnude
 
--- In json@yahoogroups.com, "Douglas Crockford" <douglas@...> wrote:
>
> The JSON.org page currently lists 21 packages for Java. Will the Java
> community ever converge on one as some other language communities are doing?
>

Hopefully not... software is software... there's always more than one way to
skin the proverbial cat...

"there can be only one" is reminiscent of MicroSloth... As a developer, I
require options...

#1571 From: jonathan wallace <ninja9578@...>
Date: Thu Feb 3, 2011 1:54 pm
Subject: libjson 7.0 new features
ninja9578
 
Hello all,

I just wanted to mention that a while ago I changed the name from libJSON to
libjson.  Would you mind reflecting that on json.org?

I added a ton of new stuff in this upgrade, most significantly streaming
ability.  I got a bunch of requests for the ability to take a stream (like from
the internet) and parse it as it comes in.  Since it may be partial JSON, or
multiple JSON objects at a time, the stream will check each time something gets
added to it, and call a callback with the new node each time one is completed.
This should make life a lot easier for those streaming JSON from web sources.
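
To illustrate the general idea (this is not libjson's API, which is a C/C++
library), here is a minimal Python sketch of the same pattern: text arrives in
arbitrary chunks, and a callback fires each time a complete value is available.
The class name JSONStream and its feed() method are hypothetical; only the
standard library's json.JSONDecoder.raw_decode is a real call.

```python
import json


class JSONStream:
    """Buffers incoming text and fires a callback per complete JSON value."""

    def __init__(self, on_value):
        self._decoder = json.JSONDecoder()
        self._buffer = ""
        self._on_value = on_value

    def feed(self, chunk):
        """Append a chunk, then emit every value that is now complete."""
        self._buffer += chunk
        while True:
            # raw_decode does not skip leading whitespace, so strip it first.
            self._buffer = self._buffer.lstrip()
            if not self._buffer:
                return
            try:
                value, end = self._decoder.raw_decode(self._buffer)
            except json.JSONDecodeError:
                # Assumption: the value is merely incomplete; wait for more.
                return
            self._on_value(value)
            self._buffer = self._buffer[end:]


received = []
stream = JSONStream(received.append)
for chunk in ['{"a": 1', '}{"b"', ': [1, 2]} ', '{"c": null}']:
    stream.feed(chunk)
print(received)
```

This sketch re-parses the buffer from the start on every chunk and cannot tell
malformed input from incomplete input, so it is only suitable for objects and
arrays of modest size; a real streaming parser keeps its state between chunks.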

I also added the option to turn off all libjson extensions, such as comments,
hexadecimal support, and so on, so that only 100% compliant JSON is considered
valid.

I exposed an interface to libjson's base64 encoder and decoder since a few
people asked if they could use it.

There is a new makefile with lots more options, including an install script.
(Thanks to Bernhard Fluehmann)

There were several other changes too; you can see them all in the changelog in
the documentation if you wish.
http://sourceforge.net/projects/libjson/

Jon

#1572 From: Tatu Saloranta <tsaloranta@...>
Date: Thu Feb 3, 2011 7:12 pm
Subject: Re: libjson 7.0 new features
cowtowncoder
 
Interesting. One thing I noticed from the project page is that there
are big claims on performance, but it seems to lack links to actual
measurements. Could you add links, so that it is possible to see actual
performance numbers, figure out the relative importance of performance,
and so on?

I have noticed that at least half of all JSON projects claim to be
faster than anyone else, so measurements could also clear up the
situation and keep everyone honest.

-+ Tatu +-

#1573 From: David Graham <david.malcom.graham@...>
Date: Thu Feb 3, 2011 7:36 pm
Subject: Re: libjson 7.0 new features
dgraham...
 
If you'd prefer a really slow JSON parser, check out
https://github.com/dgraham/json-stream.  It's about 20x slower than the Ruby
json gem, but it's quite handy if you need to parse a single huge JSON
document in constant memory space.

Enjoy!

David

#1574 From: Gregg Irwin <gregg.irwin@...>
Date: Thu Feb 3, 2011 8:00 pm
Subject: Re[2]: libjson 7.0 new features
greggirwin143
 
TS> I have noticed that at least half of all JSON projects claim to be
TS> faster than anyone else, so measurements could also clear up the
TS> situation and keep everyone honest.

A standard JSON benchmark perhaps?

--Gregg

#1575 From: Tatu Saloranta <tsaloranta@...>
Date: Fri Feb 4, 2011 12:03 am
Subject: Re: Re[2]: libjson 7.0 new features
cowtowncoder
 
On Thu, Feb 3, 2011 at 12:00 PM, Gregg Irwin <gregg.irwin@...> wrote:
> TS> I have noticed that at least half of all JSON projects claim to be
> TS> faster than anyone else, so measurements could also clear up the
> TS> situation and keep everyone honest.
>
> A standard JSON benchmark perhaps?

Yes, one would be very useful. I know there are a couple for Java
(general-purpose ones that can also use JSON, and JSON-specific ones), but
I haven't seen many for other platforms, or any comparing between platforms.

-+ Tatu +-

#1576 From: jonathan wallace <ninja9578@...>
Date: Mon Feb 7, 2011 2:11 pm
Subject: Re: Re[2]: libjson 7.0 new features
ninja9578
 
If there is a standard JSON benchmark that would be nice.


I've only compared it to wxJSON, cJSON, and a few others.

"People have always been impressed by the power of our example, not the example
of our power." - William Jefferson Clinton

#1577 From: Tatu Saloranta <tsaloranta@...>
Date: Mon Feb 7, 2011 10:09 pm
Subject: Re: Re[2]: libjson 7.0 new features
cowtowncoder
 
On Mon, Feb 7, 2011 at 6:11 AM, jonathan wallace <ninja9578@...> wrote:
> If there is a standard JSON benchmark that would be nice.
>
>
> I've only compared it to wxJSON, cJSON, and a few others.

I suspect many users would like to see comparisons. Maybe blog about
it (and include test code), or send a link if already published?

I don't doubt at all that there are differences, given differing
skill & experience levels of implementors (and the common "simplest
must be fastest" fallacy wrt performance). It's just hard to find out
real numbers when project home pages do not show measurements, just
state results.

-+ Tatu +-

#1578 From: Jonathan Wallace <ninja9578@...>
Date: Tue Feb 8, 2011 6:05 pm
Subject: Re: Re[2]: libjson 7.0 new features
ninja9578
 
Well, there is a benchmark included in the source, but it's mostly for my own
purposes of comparing upgrade versions against the previous version.  I think
I'll come up with a set of common JSON tasks and implement them in a few
libraries, or ask the library maintainers to do it to be sure it's done right.
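
As a rough sketch of what such a shared harness might look like (in Python
rather than C, and not the benchmark shipped with libjson), each library could
be plugged in as a (loads, dumps) pair and timed on the same document. The
names make_document and benchmark, and the shape of the test document, are
assumptions made for illustration; only the standard json module is registered.

```python
import json
import timeit


def make_document(n_items):
    """Build a synthetic document; the shape is an arbitrary assumption."""
    return {
        "items": [
            {"id": i, "name": f"item-{i}", "tags": ["a", "b"], "score": i / 3}
            for i in range(n_items)
        ]
    }


def benchmark(candidates, document, repeat=5, number=20):
    """Return best-of-`repeat` seconds per parse+serialize for each library."""
    text = json.dumps(document)
    results = {}
    for name, (loads, dumps) in candidates.items():
        # Check correctness first, so a fast but wrong parser is not rewarded.
        if loads(text) != document:
            raise ValueError(f"{name} did not round-trip the document")
        timer = timeit.Timer(lambda: dumps(loads(text)))
        results[name] = min(timer.repeat(repeat=repeat, number=number)) / number
    return results


# Other libraries would be added here as (loads, dumps) pairs.
candidates = {"stdlib json": (json.loads, json.dumps)}
results = benchmark(candidates, make_document(1000))
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per round trip")
```

Publishing the harness and the input documents alongside the numbers is what
lets other authors reproduce or dispute the results.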


#1579 From: Tatu Saloranta <tsaloranta@...>
Date: Tue Feb 8, 2011 7:20 pm
Subject: Re: Re[2]: libjson 7.0 new features
cowtowncoder
 
On Tue, Feb 8, 2011 at 10:05 AM, Jonathan Wallace <ninja9578@...> wrote:
> Well, there is a benchmark included in the source, but it's mostly for my own
> purposes of comparing upgrade versions against the previous version.  I think
> I'll come up with a set of common JSON tasks and implement them in a few
> libraries, or ask the library maintainers to do it to be sure it's done right.

That would be very useful! I know it might be a bit of work, given
differing APIs and all, but then again, it should be beneficial for
authors of other packages to work on being able to run similar tests.
So it should be possible to get things bootstrapped. This is how
"jvm-serializers" (https://github.com/eishay/jvm-serializers) for Java
serialization libraries started, and it seems to work quite well.

-+ Tatu +-

#1580 From: "johne_ganz" <johne_ganz@...>
Date: Sun Feb 13, 2011 1:04 am
Subject: JSONKit
johne_ganz
 
Since this is a group dedicated to JSON, I just thought I'd post a note about
yet another Objective-C serializer / deserializer: JSONKit, available at
https://github.com/johnezang/JSONKit

Also, I suspect this message will reach the right people, but perhaps someone on
the list could forward it to the webmaster for json.org so that JSONKit can be
added to the list of Objective-C implementations?  Many thanks in advance.

#1581 From: "mehdigholam@..." <mehdigholam@...>
Date: Sun Feb 20, 2011 9:39 am
Subject: smallest fastest polymorphic json serializer for .net
mehdigholam...
 
Hello all,

Follow the link for my .net implementation.

http://www.codeproject.com/KB/IP/fastJSON.aspx

Cheers

#1582 From: "johne_ganz" <john.engelhart@...>
Date: Thu Feb 24, 2011 1:22 am
Subject: JSON and the Unicode Standard
johne_ganz
 
In RFC 4627, Section 3 Encoding, it states:

"JSON text SHALL be encoded in Unicode.  The default encoding is UTF-8."

Unicode is defined as: The Unicode Consortium, "The Unicode Standard
Version 4.0", 2003, <http://www.unicode.org/versions/Unicode4.1.0/>.

Is it safe to assume that RFC 4627 implies "The minimum Unicode Standard
is version 4.0", or does it mean "The Unicode Standard as defined in
version 4.0, and ONLY version 4.0" (i.e., later versions of the Unicode
standard are non-RFC 4627 conforming)?  The standard is silent on this
point, but I believe a "best practices" interpretation is "The minimum
Unicode Standard is version 4.0" with the implicit assumption that the
Unicode Standard is strongly motivated to preserve backwards
compatibility.  Is this the accepted interpretation?

Furthermore, I interpret the quoted RFC 4627 section to imply:

Where RFC 4627 is in conflict with the Unicode Standard, the Unicode
Standard interpretation shall be the one used unless explicitly and
unambiguously superseded by RFC 4627. Otherwise, by referencing the
Unicode Standard, the Unicode Standard is incorporated into RFC 4627 as
part of the requirements for JSON.

In other words, JSON is built on top of Unicode.  When defining JSON,
the author(s) of RFC 4627 were aware of conflicts between what they were
defining (JSON) and the Unicode Standard (at the time, v4.0), and have
explicitly called out any exceptions that JSON requires.

Assuming this is a valid interpretation, this places a number of
requirements on a JSON implementation that are non-obvious by just
reading RFC 4627.  For example, from Unicode Standard (note: I'm using
the latest version at the time of this writing, 6.0), Chapter 3
Conformance, Section 3.4 Characters and Encoding
(http://www.unicode.org/versions/Unicode6.0.0/ch03.pdf):

C2  A process shall not interpret a noncharacter code point as an
abstract character. [ed: this is C5 in v4.0. The text appears to be
identical.]

D14  Noncharacter: A code point that is permanently reserved for
internal use and that should never be interchanged. Noncharacters
consist of the values U+nFFFE and U+nFFFF (where n is from 0 to 10_16
[ed: base 16]) and the values U+FDD0..U+FDEF. [ed: this is D7b in v4.0.
The text appears to be identical.]

Unicode Standard (6.0), Chapter 2 General Structure, Section 2.13
Special Characters and Noncharacters - Special Noncharacter Code Points:

The Unicode Standard contains a number of code points that are
intentionally not used to represent assigned characters. These code
points are known as noncharacters. They are permanently reserved for
internal use and should never be used for open interchange of Unicode
text. For more information on noncharacters, see Section 16.7,
Noncharacters. [ed: have not compared this to v4.0]

Unicode Standard (6.0), Chapter 16 Special Areas and Format Characters,
Section 16.7 Noncharacters:

Applications are free to use any of these noncharacter code points
internally but should never attempt to exchange them. If a noncharacter
is received in open interchange, an application is not required to
interpret it in any way. It is good practice, however, to recognize it
as a noncharacter and to take appropriate action, such as replacing it
with U+FFFD replacement character, to indicate the problem in the text.
It is not recommended to simply delete noncharacter code points from
such text, because of the potential security issues caused by deleting
uninterpreted characters. [ed: have not compared this to v4.0]

---------

This means strings like "\ufffe", "\ufdd0", "\ud83f\udfff" are
"noncharacters", and a plain reading of the standard clearly implies
that they are in some way "invalid" (I quote the term because the Unicode
standard has a lot to say about how to deal with this).  While the
examples given are the \u escaped variety, it should be obvious that the
(same) code points U+FFFE, U+FDD0, U+1FFFF encoded in their UTF-*
representation are also "invalid".  In UTF-8, these would be <EF BF BE>,
<EF B7 90>, <F0 9F BF BF>.
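
The noncharacter definition (D14) is small enough to check mechanically. The
following Python sketch tests a code point against that definition and prints
the UTF-8 bytes of the three code points discussed here; the function names
is_noncharacter and utf8_hex are made up for this illustration.

```python
def is_noncharacter(code_point):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF on every plane."""
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # The low 16 bits are FFFE or FFFF on any of the 17 planes.
    return code_point <= 0x10FFFF and (code_point & 0xFFFE) == 0xFFFE


def utf8_hex(code_point):
    """UTF-8 bytes of a code point as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in chr(code_point).encode("utf-8"))


for cp in (0xFFFE, 0xFDD0, 0x1FFFF):
    print(f"U+{cp:04X}", is_noncharacter(cp), utf8_hex(cp))
# U+FFFE True EF BF BE
# U+FDD0 True EF B7 90
# U+1FFFF True F0 9F BF BF
```

Counting over the whole code space gives 66 noncharacters: 32 in the range
U+FDD0..U+FDEF plus two at the end of each of the 17 planes.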

Unicode Standard (6.0), Chapter 3 Conformance, Section 3.9 Unicode
Encoding Forms (http://www.unicode.org/versions/Unicode6.0.0/ch03.pdf)
covers a lot of these details.  In particular, the section "Best
Practices for Using U+FFFD" gives details on using the special U+FFFD
replacement character to replace the "invalid" Unicode.  For example,
the \u escape sequence of '\ud83f\udfff' in a quoted string would be
replaced with a single U+FFFD.
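
A minimal Python sketch of the replacement policy described above, layered on
the standard json module: parse first, then substitute U+FFFD for every
noncharacter in every string. The helper names are hypothetical, and this is
one possible policy rather than anything RFC 4627 mandates.

```python
import json

REPLACEMENT = "\uFFFD"


def is_noncharacter(code_point):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF on every plane."""
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def replace_noncharacters(value):
    """Recursively replace noncharacters in all strings (keys included)."""
    if isinstance(value, str):
        return "".join(
            REPLACEMENT if is_noncharacter(ord(c)) else c for c in value
        )
    if isinstance(value, list):
        return [replace_noncharacters(v) for v in value]
    if isinstance(value, dict):
        return {
            replace_noncharacters(k): replace_noncharacters(v)
            for k, v in value.items()
        }
    return value


# json.loads combines the surrogate pair \ud83f\udfff into U+1FFFF, so the
# whole escape sequence becomes a single replacement character.
parsed = replace_noncharacters(json.loads(r'["\ud83f\udfff", "ok\ufdd0"]'))
print([ascii(s) for s in parsed])
```

Replacing rather than deleting keeps the length and position of the damage
visible to whatever consumes the string next.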

It is important to note that there have been extensive changes to
section 3.9 between 4.0 and 6.0.  Some of these are due to various
security issues (http://www.unicode.org/faq/security.html).

Is the above the "generally agreed on" interpretation of how things
should be done?

Is it safe to say that a "strictly RFC 4627 conforming JSON
implementation" MUST also be "strictly Unicode Standard conforming" (at
least in terms of Chapter / Section 3 of the Unicode Standard,
"Conformance")?

Is there an opinion on whether or not JSON that is used for interchange
SHOULD NOT or MUST NOT contain "noncharacters"?  That is to say that a
JSON generator should/must not create JSON with noncharacters, and a
parser should/must either reject as invalid or replace such
noncharacters with U+FFFD?  There's technically a difference between
JSON used for interchange and JSON not used for interchange since the
Unicode Standard allows an implementation to use the noncharacters as
"internal, private" code points, but those characters should not be
present in the Unicode that (for some reasonable definition of) "leaves
the implementation".  Personally, I don't think such a distinction
should be made for JSON, or is really even meaningful, and all JSON
should/must be treated as "interchange".

The Unicode Standard, and in particular later versions of the standard,
for all practical purposes make it a requirement that "characters MUST
NOT be deleted".  One course of action is to simply not accept a string
and report an error, and another is to replace a bad or malformed
character with U+FFFD.  There are some very compelling security related
reasons for doing this.  Is there an opinion that an RFC 4627 JSON
implementation "MUST NOT arbitrarily delete characters" as well?  (This
is a somewhat complicated issue, see
http://www.unicode.org/faq/security.html and
http://www.unicode.org/reports/tr36/ for more info, in particular UTR#36
- Section 3 "Non-Visual Security Issues").



#1583 From: "Douglas Crockford" <douglas@...>
Date: Thu Feb 24, 2011 2:06 am
Subject: Re: JSON and the Unicode Standard
douglascrock...
 
A receiver can do what it chooses to with the character codes it receives. If it
wants to delete them or reject them, that is its business. But a JSON channel
should not interfere with or bias the communication. It should faithfully
deliver what the sender sent, provided that it conforms to the JSON grammar.

If the sender wants to send characters that some consortium considers indecent,
and if the receiver wants to receive them, then that is their business.

#1584 From: "Douglas Crockford" <douglas@...>
Date: Thu Feb 24, 2011 2:18 am
Subject: Re: JSON and the Unicode Standard
douglascrock...
 
When the informational RFC insists on Unicode, it is in the sense that the
encoding is not EBCDIC nor Big5 nor anything other than Unicode.

#1585 From: "johne_ganz" <john.engelhart@...>
Date: Fri Feb 25, 2011 7:44 pm
Subject: Re: JSON and the Unicode Standard
johne_ganz
 
--- In json@yahoogroups.com, "Douglas Crockford" <douglas@...> wrote:
>
> A receiver can do what it chooses to with the character codes it receives. If
> it wants to delete them or reject them, that is its business.

Not if it's Unicode.  It is a common misconception that "Unicode" is just a set
of code points, like say ASCII or EBCDIC.  It is not.  With ASCII, you can
"delete" characters from a stream and it's still ASCII.  Code points in Unicode
have semantics, and deleting them can alter the meaning of the string in
surprising and unexpected ways.

Previous versions of the Unicode standard used to have a clause that permitted
the deleting of code points from a string (though it was not recommended, ala
SHOULD NOT).  Later versions of the Unicode standard do not permit this, and it
has been verboten to do so for some time.

Some of the most compelling reasons why deleting characters is forbidden are
covered by the section "Non-Visual Security Issues":
http://www.unicode.org/reports/tr36/#Canonical_Represenation .

There is no language in RFC 4627 that I can find that supports your
interpretation.  There is an awful lot of language in the Unicode standard that
unambiguously says that you cannot "delete" characters from a Unicode "string".
There are also compelling arguments in TR#36 for why arbitrarily deleting
characters is a huge mistake.
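
A small Python illustration of the kind of problem meant here, under made-up
assumptions: a naive filter (looks_safe, hypothetical) inspects the text before
a later stage silently deletes noncharacters. Deletion joins the pieces into
exactly the string the filter was supposed to block, whereas replacement with
U+FFFD does not.

```python
def is_noncharacter(code_point):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF on every plane."""
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def looks_safe(text):
    """A deliberately naive filter: reject anything containing '<script'."""
    return "<script" not in text.lower()


def delete_noncharacters(text):
    """The discouraged policy: silently drop noncharacters."""
    return "".join(c for c in text if not is_noncharacter(ord(c)))


def replace_noncharacters(text):
    """The recommended policy: substitute U+FFFD for each noncharacter."""
    return "".join("\uFFFD" if is_noncharacter(ord(c)) else c for c in text)


payload = "<scr\uFFFEipt>alert(1)</scr\uFFFEipt>"

print(looks_safe(payload))                         # True: the filter passes it
print(looks_safe(delete_noncharacters(payload)))   # False: deletion made a tag
print(looks_safe(replace_noncharacters(payload)))  # True: still broken apart
```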

> But a JSON channel should not interfere with or bias the communication.

This statement is contrary to what you just said.  If it is deleting characters,
it is obviously interfering and biasing the communication.

A standard, and its interpretation, should strive to be unambiguous.  An
interpretation that boils down to "An implementation MAY interfere with or bias
the communication, but an implementation SHOULD NOT interfere with or bias the
communication" is meaningless and non-sensical.

> It should faithfully deliver what the sender sent, provided that it conforms
> to the JSON grammar.

Which must be encoded as Unicode.  Again, Unicode IS NOT, and MUST NOT be
treated as a stream of Unicode code points.  That's not Unicode.  I freely admit
that this is a belief that I once had.  However, after a few years of dealing
with low level Unicode string processing (where Unicode means "The Unicode
Standard"), I no longer hold this view.  It's much more complicated and much
more nuanced than people realize.

> If the sender wants to send characters that some consortium considers
> indecent, and if the receiver wants to receive them, then that is their
> business.

I don't have a problem with this.  I would have a problem with such a setup
claiming to be "strictly RFC 4627 conforming" (or using some language implying
4627 conformance).

My specific point is this:  I strongly believe that RFC 4627 requires Unicode,
and by implication, processing said Unicode in a Unicode Standard conforming
way.  Therefore, in order to claim "RFC 4627 conformance", one must also process
and handle the JSON in a way that is "Unicode Standard conforming".

You don't HAVE to do this, obviously.. but then you can no longer claim RFC 4627
conformance.

If I may make a suggestion:  perhaps an informal "JSON Best Practices" document
could be started that catalogs and records these types of things.  The document
would be totally non-normative, but would be a fantastic resource for those who
need to implement JSON parsers and generators.  It would also help ensure that
implementations converge on something that ensures they will interoperate more
reliably.  Since it would be non-normative, it wouldn't have any "requirements"
weight to it, but I can tell you such a document would have been a big help to
me.

#1586 From: "Douglas Crockford" <douglas@...>
Date: Fri Feb 25, 2011 8:00 pm
Subject: Re: JSON and the Unicode Standard
douglascrock...
 
--- In json@yahoogroups.com, "johne_ganz" <john.engelhart@...> wrote:
>
> --- In json@yahoogroups.com, "Douglas Crockford" <douglas@> wrote:
> >
> > A receiver can do what it chooses to with the character codes it receives.
> > If it wants to delete them or reject them, that is its business.
>
> Not if it's Unicode.  It is a common misconception that "Unicode" is just a
> set of code points, like say ASCII or EBCDIC.

For JSON's purpose, Unicode is just a set of code points. It gives some, such as
{ and }, special meaning. But in strings, everything should simply be passed
through.
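
A small sketch of what "passed through" means at the code point level, assuming Python's standard `json` module as the channel. The sample strings are illustrative only; the check is that encoding and then decoding returns exactly the same sequence of code points, with no normalization or deletion along the way.

```python
import json

# Illustrative samples: ANGSTROM SIGN, A + COMBINING RING ABOVE,
# precomposed A-ring, and a string containing ZERO WIDTH JOINER.
samples = ["\u212b", "A\u030a", "\u00c5", "a\u200db"]


def round_trip(s, ensure_ascii):
    """Serialize a string to JSON text and parse it back."""
    return json.loads(json.dumps(s, ensure_ascii=ensure_ascii))


# Python compares str values code point by code point, so equality here
# means the channel neither normalized nor dropped anything.
unchanged = all(
    round_trip(s, flag) == s for s in samples for flag in (True, False)
)
print(unchanged)
```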

> Previous versions of the Unicode standard used to have a clause that permitted
> the deleting of code points from a string (though it was not recommended, ala
> SHOULD NOT).  Later versions of the Unicode standard do not permit this, and it
> has been verboten to do so for some time.
>
> Some of the most compelling reasons why deleting characters is forbidden is
> covered by the section "Non-Visual Security Issues":
> http://www.unicode.org/reports/tr36/#Canonical_Represenation .
>
> There is no language in RFC 4627 that I can find that supports your
> interpretation.  There is an awful lot of language in the Unicode standard that
> unambiguously says that you can not "delete" characters from a Unicode "string".
> There is also compelling arguments in TR#36 for why arbitrarily deleting
> characters is a huge mistake.
>
> > But a JSON channel should not interfere with or bias the communication.
>
> This statement is contrary to what you just said.  If it is deleting
> characters, it is obviously interfering and biasing the communication.

By receiver I mean the program that ultimately receives the message. It can
interpret it and process it or damage it or ignore it as it will. What it does
with the data is none of my business. The JSON channel itself must do none of
those things.

> A standard, and its interpretation, should strive to be unambiguous.  An
> interpretation that boils down to "An implementation MAY interfere with or bias
> the communication, but an implementation SHOULD NOT interfere with or bias the
> communication" is meaningless and non-sensical.
>
> > It should faithfully deliver what the sender sent, provided that it conforms
> > to the JSON grammar.
>
> Which must be encoded as Unicode.  Again, Unicode IS NOT, and MUST NOT be
> treated as a stream of Unicode code points.  That's not Unicode.  I freely admit
> that this is a belief that I once had.  However, after a few years of dealing
> with low level Unicode string processing (where Unicode means "The Unicode
> Standard"), I no longer hold this view.  It's much more complicated and much
> more nuanced than people realize.
>
> > If the sender wants to send characters that some consortium considers
> > indecent, and if the receiver wants to receive them, then that is their
> > business.
>
> I don't have a problem with this.  I would have a problem with such a set up
> claiming "strictly RFC 4627 conforming" (or some language implying 4627
> conformance).
>
> My specific point is this:  I strongly believe that RFC 4627 requires Unicode,
> and by implication, processing said Unicode in a Unicode Standard conforming
> way.  Therefore, in order to claim "RFC 4627 conformance", one must also process
> and handle the JSON in a way that is also "Unicode Standard conforming" as well.
>
> You don't HAVE to do this, obviously.. but then you can no longer claim RFC
> 4627 conformance.
>
> If I may make a suggestion:  perhaps an informal "JSON Best Practices"
> document be started that catalogs and records these types of things.  The
> document would be totally non-normative, but would be a fantastic resource for
> this who need to implement JSON parsers and generators.  It would also help
> ensure that implementations converge on something that ensures they will
> interoperate more reliably.  Since it would be non-normative, it wouldn't have
> any "requirements" weight to it, but I can tell you such a document would have
> been a big help to me.


Tell you what. If you ever encounter a real problem, we will deal with that.

#1587 From: "johne_ganz" <john.engelhart@...>
Date: Fri Feb 25, 2011 11:09 pm
Subject: Re: JSON and the Unicode Standard
johne_ganz
 
--- In json@yahoogroups.com, "Douglas Crockford" <douglas@...> wrote:
>
> --- In json@yahoogroups.com, "johne_ganz" <john.engelhart@> wrote:
> >
> > --- In json@yahoogroups.com, "Douglas Crockford" <douglas@> wrote:
> > >
> > > A receiver can do what it chooses to with the character codes it receives.
> > > If it wants to delete them or reject them, that is its business.
> >
> > Not if it's Unicode.  It is a common misconception that "Unicode" is just a
> > set of code points, like say ASCII or EBCDIC.
>
> For JSON's purpose, Unicode is just a set of code points.

Not according to RFC 4627 it isn't.  Section 3, Encoding, "JSON text SHALL be
encoded in Unicode.", where SHALL is interpreted via RFC 2119 (i.e., SHALL is
synonymous with MUST).

I appreciate that your interpretation may have been your original intent, but
the scope of the language in the standard is far, far greater than "JSON text
SHALL be interpreted as a stream of disjoint Unicode code points.", which is
what you are arguing that the standard means.

Unless you can make a compelling argument with language from the RFC 4627
standard, the standard clearly and plainly says that the JSON text is encoded in
Unicode.  This means that the text must conform to the Unicode standard, and
its rules for processing and handling text MUST (via the use of SHALL in RFC
4627) be followed.


> By receiver I mean the program that ultimately receives the message. It can
> interpret it and process it or damage it or ignore it as it will. What it does
> with the data is none of my business. The JSON channel itself must do none of
> those things.

Surely you realize that in practice, this is not the way that things are done. 
All of the JSON libraries are effectively "part of the JSON channel".

There is a clear demarcation point where a piece of text has ceased to be JSON
and has (usually) become an instantiated data structure in the host language.

How and what the "language" does with the data is not relevant to RFC 4627.  The
"language" may manipulate the JSON data, examining keys, manipulating them in
any way it chooses.  But at this point, it very clearly has ceased to be "JSON".

Every JSON implementation that is in the form of a library for a host language
that I'm aware of could be interpreted to be "the program that ultimately
receives the message".  The libraries parse the JSON and transliterate it into
a form usable by the host language.  What the host language, or a program
written by someone to enumerate or manipulate the data structure that was
instantiated from the original JSON, does with it is obviously outside the
scope of RFC 4627.

My pedantic point is: A JSON implementation, in the form of a library that
provides bindings between a host language and JSON (of which there are many),
MUST NOT arbitrarily delete characters in the original JSON.  Furthermore, any
such implementation MUST interpret the original JSON text in accordance with the
Unicode Standard.  Just like RFC 4627 gives a grammar and rules for how to
interpret JSON, the Unicode Standard has rules for how to interpret text encoded
as Unicode.  Unicode is not just a simple set of code points.

Another issue is normalization.  In particular, the way normalization is handled
for the "key" portion of an "object" (i.e., {"key": "value"}) can dramatically
alter the meaning and contents of the object.  For example:

{
"\u212b": "one",
"\u0041\u030a": "two",
"\u00c5": "three"
}

Are these three keys distinct?  Should there be a requirement that they MUST be
handled and interpreted such that they are distinct?  Does that requirement
extend past the "channel" demarcation point (i.e., not a JSON library or
communication channel used to interchange the JSON between two hosts) to the
"host language"?

In case it is not obvious, under the rules of Unicode NFC (Normalization Form
C), all three of the keys above will become "\u00c5" after NFC processing.
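
This can be checked with a short sketch, assuming Python's standard `json` and `unicodedata` modules (Python's parser is used here only as one example of a parser that does not normalize). The three keys stay distinct after parsing and collapse to a single key only once NFC is applied.

```python
import json
import unicodedata

# The object from the example above. The raw string keeps the \u escapes
# intact so that the JSON parser, not Python, decodes them.
text = r'{"\u212b": "one", "\u0041\u030a": "two", "\u00c5": "three"}'

parsed = json.loads(text)
distinct_before = len(parsed)

# Apply NFC to every key and collect the results in a set.
normalized_keys = {unicodedata.normalize("NFC", key) for key in parsed}

print(distinct_before)                 # number of keys as parsed
print(normalized_keys == {"\u00c5"})   # all keys are U+00C5 after NFC
```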

A first order approximation would seem to suggest that a JSON implementation
"should" use the precomposed form for keys, and that for objects containing
non-precomposed keys that duplicate other keys once converted to their
precomposed form, the behavior is undefined.
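
One way an implementation could surface that undefined case instead of silently picking a winner is sketched below, assuming Python's standard `json` module and its `object_pairs_hook` parameter. The function name `reject_nfc_collisions` is hypothetical, and raising an error is only one possible policy. Note that keys that are literally identical also collide under this check.

```python
import json
import unicodedata


def reject_nfc_collisions(pairs):
    """Build a dict, raising if two keys are equal after NFC normalization."""
    seen = {}      # NFC form of a key -> the original key that produced it
    result = {}
    for key, value in pairs:
        folded = unicodedata.normalize("NFC", key)
        if folded in seen:
            raise ValueError(
                f"keys {seen[folded]!r} and {key!r} collide under NFC"
            )
        seen[folded] = key
        result[key] = value   # keys are stored as sent, not normalized
    return result


text = r'{"\u212b": "one", "\u0041\u030a": "two", "\u00c5": "three"}'
try:
    json.loads(text, object_pairs_hook=reject_nfc_collisions)
    collided = False
except ValueError:
    collided = True

print(collided)
```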

Again, this is another point where the use of Unicode introduces an awful lot of
non-obvious dependencies.  The Unicode standard has a lot to say about what it
means for two strings to "compare equal", and since JSON specifies what is
essentially a key/value hash table, it is critically important to define what
"equal" means for a key.  If the keys were ASCII or Binary, this would probably
be a non-issue, but it's a pretty big one when you're dealing with Unicode.

> Tell you what. If you ever encounter a real problem, we will deal with that.

This is a rather snarky comment, and to be blunt, unprofessional and unfair.

Every point I've raised here is something that an implementor of a JSON library
will likely encounter.  As an implementor of such a library (for Objective-C),
everything I've raised here is something that took an enormous amount of time
and consideration.

In my case, I've had to deal with the subtle nuances of what happens to a
Unicode string when I parse it and then hand that parsed string off to another
library to instantiate a string object.  I have no control over how this
external library (a combination of Foundation and Core Foundation) deals with or
interprets various aspects of the Unicode Standard.  For the sake of argument,
if this external library automatically precomposes all strings it instantiates,
and I have to use those instantiated strings as the keys in an NSDictionary (the
equivalent of a JSON object), I've got some problems.
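
The hypothetical can be simulated in a few lines of Python, purely as an illustration. The function `precomposing_string` is a stand-in for a string constructor that applies NFC on creation; it makes no claim about what Foundation or Core Foundation actually do. With such a constructor, three entries in the JSON object silently become one dictionary entry.

```python
import json
import unicodedata


def precomposing_string(s):
    """Hypothetical string constructor that always precomposes (NFC)."""
    return unicodedata.normalize("NFC", s)


text = r'{"\u212b": "one", "\u0041\u030a": "two", "\u00c5": "three"}'
parsed = json.loads(text)

# Re-key the object through the precomposing constructor, as a library
# binding would when building the host-language dictionary.
collapsed = {}
for key, value in parsed.items():
    collapsed[precomposing_string(key)] = value   # later keys overwrite earlier

print(len(parsed), len(collapsed))
```

Two of the three values are lost without any error, and which one survives depends only on the order in which the keys were inserted.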

Your snarky comment ignores the real world complexities that one faces when
attempting to create a "RFC 4627 compliant" JSON implementation, at least if one
is trying to do so "the right way" as opposed to a quick hack JSON
implementation.

For someone who is creating a JSON library or some other form of a JSON
implementation, the corner cases are usually far more important than the
obvious, common case.

#1588 From: Tatu Saloranta <tsaloranta@...>
Date: Fri Feb 25, 2011 11:19 pm
Subject: Re: Re: JSON and the Unicode Standard
cowtowncoder
 
On Fri, Feb 25, 2011 at 3:09 PM, johne_ganz <john.engelhart@...> wrote:
>
>
> --- In json@yahoogroups.com, "Douglas Crockford" <douglas@...> wrote:
>>
>> --- In json@yahoogroups.com, "johne_ganz" <john.engelhart@> wrote:
>> >
>> > --- In json@yahoogroups.com, "Douglas Crockford" <douglas@> wrote:
>> > >
>> > > A receiver can do what it chooses to with the character codes it
>> > > receives. If it wants to delete them or reject them, that is its business.
>> >
>> > Not if it's Unicode.  It is a common misconception that "Unicode" is just a
>> > set of code points, like say ASCII or EBCDIC.
>>
>> For JSON's purpose, Unicode is just a set of code points.
>
> Not according to RFC 4627 it isn't.  Section 3, Encoding, "JSON text SHALL be
> encoded in Unicode.", where SHALL is interpreted via RFC 2119 (i.e., SHALL is
> synonymous with MUST).

Do you have an ACTUAL problem worth discussing, or is this just from a
purity standpoint?

-+ Tatu +-

#1589 From: Tatu Saloranta <tsaloranta@...>
Date: Fri Feb 25, 2011 11:45 pm
Subject: Re: Re: JSON and the Unicode Standard
cowtowncoder
 
On Fri, Feb 25, 2011 at 3:09 PM, johne_ganz <john.engelhart@...> wrote:
...

> Unicode is not just a simple set of code points.

This is a true statement, although the more practical question seems to
be what the relationship of JSON to the Unicode specification actually
is.
I think your suggestions for clarifying some parts do make sense,
although it may be hard to reconcile the basic differences between full
Unicode support and the goal of simplicity for JSON.

>
> Another issue is normalization.  In particular, the way normalization is
> handled for the "key" portion of an "object" (i.e., {"key": "value"}) can
> dramatically alter the meaning and contents of the object.  For example:
>
> {
> "\u212b": "one",
> "\u0041\u030a": "two",
> "\u00c5": "three"
> }
>
> Are these three keys distinct?  Should there be a requirement that they MUST
> be handled and interpreted such that they are distinct?  Does that requirement
> extend past the "channel" demarcation point (i.e., not a JSON library or
> communication channel used to interchange the JSON between two hosts) to the
> "host language"?
>
> In case it is not obvious, under the rules of Unicode NFC (Normalization Form
> C), all three of the keys above will become "\u00c5" after NFC processing.

For what it is worth, I have not seen a single JSON parser that would
do such normalization; and the only XML parser I recall even attempting
proper Unicode code point normalization was XOM. This is not an
argument against proper handling, but rather an observation regarding
how much of a practical issue it seems to be.
Nor have I seen feature requests to support normalization during the
time I have spent maintaining XML and JSON parser/generator
implementations. (XOM implements it because its author is very ambitious
with respect to supporting standards; it is a very respectable achievement.)
Do others have different experiences?

So to me it seems that most likely high-level clarifications regarding
normalization aspects would be:

(a) Whether to do normalization or not is up to implementation
(normalization is left out of scope, on purpose), or
(b) Say that with JSON no normalization is to be done (which would be
more at odds with the Unicode spec)

Why? Just because I see very little chance of anything more ambitious
having an effect on implementations (beyond the small number that are willing
to tackle such complexity). While it would seem wrong to punt on the
issue, there is the practical question of whether a full solution would
matter.
My guess is that about the last thing implementers would want is a
mandate to support full Unicode 4.0 (and above) normalization rules.
It would just mean that there would be the specification in one
corner, and implementations in the other, practically none of which would be
compliant.

...
> Your snarky comment ignores the real world complexities that one faces when
> attempting to create a "RFC 4627 compliant" JSON implementation, at least if one
> is trying to do so "the right way" as opposed to a quick hack JSON
> implementation.

For better or worse, most JSON implementations fall into the quick hack
category, which is just to say that the chances of getting a significant
number of implementations to do much more than decode code points
correctly are vanishingly small. Even getting them to do basic
decoding is quite a challenge in itself.

> For someone who is creating a JSON library or some other form of a JSON
> implementation, the corner cases are usually far more important than the
> obvious, common case.

True.

I think your suggestions of how this could be clarified make sense.

-+ Tatu +-
