Check the IPTC website (www.iptc.org). DTDs and XML Schema files are available
there for both NewsML 1 and NewsML-G2, along with a lot of great resources for
working with these standards.
There is also a NewsML-G2 Yahoo group:
http://tech.groups.yahoo.com/group/newsml-g2/
Jayson Lorenzen
Senior Software Engineer
____________________________
B U S I N E S S W I R E
A Berkshire Hathaway Company
+1.415.986.4422, ext. 766
+1.415.956.2609 (fax)
www.BusinessWire.com
Business Wire/San Francisco
44 Montgomery St. 39th Floor
San Francisco, CA 94104
>>> "just4u_vijay" <just4u_vijay@...> 09/26/08 12:30 AM >>>
Well, this was a fault in my understanding...
What I need is the DTD/XML Schema Definition (XSD) for NewsML-G2 2.0.
Thanks
--- In newsml@yahoogroups.com, "odesk_sbullo" <sbullo@...> wrote:
>
> I found this: http://www.pdfzone.com/c/a/Utilities/Converting-PDFs-to-XML/
>
> It may be helpful?
>
> Susanne
>
> --- In newsml@yahoogroups.com, vijay sharma <just4u_vijay@> wrote:
> >
> > Help Needed!
> >
> > I have been asked to create NewsML for my newspaper's ARTICLE.
> >
> > They have told me to get the LATEST STANDARD DOCUMENT for NewsML.
> >
> > I have the PDF with me; what I need is the STANDARD XML for NewsML.
> >
> > Can anyone help me get this document?
> >
> > Appreciate your help!
> >
> > Thanks
> >
>
Ah! Sorry. I don't think there's anything out there that converts
from PDF to NewsML XML *specifically* but I could be wrong.
Susanne
--- On Fri, 9/26/08, vijay sharma <just4u_vijay@...> wrote:
From: vijay sharma <just4u_vijay@...>
Subject: [newsml] Any Sample/example of NewsML
To: newsml@yahoogroups.com
Date: Friday, September 26, 2008, 1:12 AM
Hello Folks,
I am looking for a sample file for NewsML.
If you are using NewsML, please send me a full-fledged sample copy.
Well, this was a fault in my understanding...
What I need is the DTD/XML Schema Definition (XSD) for NewsML-G2 2.0.
Thanks
I found this: http://www.pdfzone.com/c/a/Utilities/Converting-PDFs-to-XML/
It may be helpful?
Susanne
--- In newsml@yahoogroups.com, vijay sharma <just4u_vijay@...> wrote:
>
> Help Needed!
>
> I have been asked to create NewsML for my newspaper's ARTICLE.
>
> They have told me to get the LATEST STANDARD DOCUMENT for NewsML.
>
> I have the PDF with me; what I need is the STANDARD XML for NewsML.
>
> Can anyone help me get this document?
>
> Appreciate your help!
>
> Thanks
>
Hi Jason,
To answer your questions:
1 & 2. Yes, to display news items (or collections of news items) on web
pages, we usually use XSLT to transform NewsML 1 into XHTML (or HTML).
We even provide AFP customers (on demand) with a simple XSLT toolkit to do so.
As an alternative, we also provide a simple Java app that stores the content of
news items in an RDBMS (so HTML can then be generated via ASP/JSP/PHP pages).
But most NewsML users develop their own receiver application.
Be aware that real-life NewsML implementations vary from provider to
provider.
3. The best guide is without question the NewsML guidelines at:
www.newsml.org/IPTC/NewsML/1.2/documentation/NewsML_1.2-doc-Guidelines_1.00.pdf
but you could also read my old "NewsML for Dummies", still at:
http://xml.coverpages.org/NewsMLForDummies.pdf
Re applications, Reuters created a Java toolkit on SourceForge a long time ago
for their NewsML flavor.
Laurent Le Meur
AFP
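As a rough illustration of the NewsML 1 to XHTML translation discussed in this reply, here is a minimal sketch in Python rather than XSLT. The sample document is hypothetical and heavily simplified (no namespaces, plain <p> elements inside DataContent); as noted above, real feeds vary from provider to provider, so the element lookups would need adjusting for any actual feed.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified NewsML 1 item (an assumption for this sketch).
SAMPLE = """<NewsML>
  <NewsItem>
    <NewsComponent>
      <NewsLines>
        <HeadLine>Markets open higher</HeadLine>
        <DateLine>PARIS, May 30</DateLine>
      </NewsLines>
      <ContentItem>
        <DataContent>
          <p>Shares rose at the open.</p>
          <p>Banks led the gains.</p>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>"""


def newsml_to_xhtml(newsml_text: str) -> str:
    """Render the first NewsItem of a NewsML 1 document as an XHTML fragment."""
    root = ET.fromstring(newsml_text)
    item = root.find("NewsItem")
    if item is None:
        raise ValueError("no NewsItem found")
    # NewsComponents can nest, so search descendants instead of fixed paths.
    headline = item.findtext(".//NewsLines/HeadLine", default="")
    dateline = item.findtext(".//NewsLines/DateLine", default="")
    paragraphs = [
        "".join(p.itertext()).strip()
        for content in item.iter("DataContent")
        for p in content.iter("p")
    ]
    parts = ['<div class="news-item">', f"  <h1>{escape(headline)}</h1>"]
    if dateline:
        parts.append(f'  <p class="dateline">{escape(dateline)}</p>')
    parts.extend(f"  <p>{escape(text)}</p>" for text in paragraphs)
    parts.append("</div>")
    return "\n".join(parts)


XHTML = newsml_to_xhtml(SAMPLE)
```

The same mapping (headline to a heading, dateline and body paragraphs to <p> elements) is what an XSLT stylesheet would express as templates; the Python form is only meant to show the entry point.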
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> j_lappa
> Sent: Friday, 30 May 2008 21:19
> To: newsml@yahoogroups.com
> Subject: [newsml] NewsML Feeds and XHTML Translation
>
> Hello,
>
> I am a web designer for a financial reporting company. Recently we
> started receiving NewsML formatted feeds from our media provider. I
> have looked into NewsML and am very impressed with its rich feature
> set. However, I am at a loss as how to translate the xml based
> structure into xhtml. I have a few questions if any of you can provide
> me with some insight:
>
> 1. How does one translate a NewsML feed (into XHTML)?
> 2. Are any of you using XSLT technology in order to translate the
> feeds?
> 3. What are some good resources for NewsML developers?
>
> I am trying to find an entry point to produce a given feed into a file
> suited for a web browser. If you have any information that can help me
> please respond.
>
> Thank you,
>
> Jason Lappa
> Sr. Web Designer
> SNL Financial
>
>
>
> ------------------------------------
>
> Find more on NewsML at http://www.newsml.org
>
> Any member of this IPTC moderated Yahoo group must comply with the
> Intellectual Property Policy of the IPTC, available at
> http://www.iptc.org/goto/ipp. Any posting is assumed to be submitted
> under the conditions of this IPTC IP Policy.
> Yahoo! Groups Links
>
>
>
This e-mail, and any file transmitted with it, is confidential and intended
solely for the use of the individual or entity to whom it is addressed. If you
have received this email in error, please contact the sender and delete the
email from your system. If you are not the named addressee you should not
disseminate, distribute or copy this email.
For more information on Agence France-Presse, please visit our web site at
http://www.afp.com
Hello,
I am a web designer for a financial reporting company. Recently we
started receiving NewsML formatted feeds from our media provider. I
have looked into NewsML and am very impressed with its rich feature
set. However, I am at a loss as how to translate the xml based
structure into xhtml. I have a few questions if any of you can provide
me with some insight:
1. How does one translate a NewsML feed (into XHTML)?
2. Are any of you using XSLT technology in order to translate the feeds?
3. What are some good resources for NewsML developers?
I am trying to find an entry point to produce a given feed into a file
suited for a web browser. If you have any information that can help me
please respond.
Thank you,
Jason Lappa
Sr. Web Designer
SNL Financial
From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf
Of Jayson Lorenzen
> Thanks Laurent, I agree it would be good to have in the
> Guidelines. It would be interesting to see how others
> are doing this and I hope this thread continues a bit
> (hint hint to all lurkers).
PA runs on a 24-hour news cycle. Our wire services are driven using
NITF, so we use du-key/@key and @version to indicate the threading of
our stories, with series/@series-part to indicate how pieces of the
story join together. Our du-key/@key is derived from the slugline.
In NewsML terms, we map du-key/@key onto NewsItemId, so stories can
grow and have write-throughs whilst retaining the same NewsItemId. We
spot parent and child stories, and use AssociatedWith to link up from
the child to the parent, and NewsItemRef in a
NewsComponent[/Role/@FormalName='Supporting'] hierarchy to link down
from parents to children.
Unfortunately, what we don't have is a mechanism for carrying stories
across multiple days. Having not had the discussion with Editorial about
long-running stories and whether they really do 'replace' each other
across days, I'm not sure if we'd keep re-using the same NewsItemId and
DateId and just keep increasing RevisionId. If not, I think we'd create
a new DateId/NewsItemId pair and use DerivedFrom to point to yesterday's
story.
Paul
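The du-key to NewsItemId mapping described above might look roughly like the following sketch. It is not PA's actual code: the sample NITF fragment, the provider and date values, and in particular the choice to copy du-key/@version into RevisionId are all assumptions made for illustration. PublicIdentifier is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical NITF fragment; only the du-key element matters here.
NITF_SAMPLE = """<nitf>
  <head>
    <docdata>
      <du-key key="POLITICS-Budget" version="3"/>
    </docdata>
  </head>
</nitf>"""


def identification_from_nitf(nitf_text: str, provider_id: str, date_id: str) -> ET.Element:
    """Build a NewsML 1 Identification element from an NITF du-key."""
    root = ET.fromstring(nitf_text)
    du_key = root.find("./head/docdata/du-key")
    if du_key is None or not du_key.get("key"):
        raise ValueError("NITF document has no du-key/@key")
    # Assumption: du-key/@version is used directly as the revision number.
    version = int(du_key.get("version", "1"))

    ident = ET.Element("Identification")
    news_id = ET.SubElement(ident, "NewsIdentifier")
    ET.SubElement(news_id, "ProviderId").text = provider_id
    ET.SubElement(news_id, "DateId").text = date_id
    # The same key on every write-through keeps NewsItemId stable.
    ET.SubElement(news_id, "NewsItemId").text = du_key.get("key")
    revision = ET.SubElement(
        news_id,
        "RevisionId",
        PreviousRevision=str(max(version - 1, 0)),
        Update="N",
    )
    revision.text = str(version)
    return ident


IDENT = identification_from_nitf(NITF_SAMPLE, "example.com", "20080502")
```

Because the key is derived from the slugline, every write-through of a story produces the same NewsItemId with a higher RevisionId, which is the behaviour the message describes within a single day.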
This e-mail is from the PA Group. For more information, see www.thepagroup.com.
This e-mail may contain confidential information.
Only the addressee is permitted to read, copy, distribute or otherwise use this
email or any attachments.
If you have received it in error, please contact the sender immediately.
Any opinion expressed in this e-mail is personal to the sender and may not
reflect the opinion of the PA Group.
Any e-mail reply to this address may be subject to interception or monitoring
for operational reasons or for lawful business practices.
Thanks Laurent, I agree it would be good to have in the Guidelines. It would be
interesting to see how others are doing this and I hope this thread continues a
bit (hint hint to all lurkers).
Jayson Lorenzen
>>> laurent.lemeur@... 05/02/08 10:09 PM >>>
>
> [Laurent Le Meur: Currently most wires use a slug term for this
> purpose (ex. OLY2008)]
>
> Could you please expand on that?
Doing it the old way: choosing a keyword and inserting it in the slug for all
related stories. Most news agencies must still accommodate old formats and
practices.
Re NewsML 1 solutions: AssociatedWith (and DerivedFrom) relate a news item to
another news item.
If you want to associate a news item with an entity (an event, a recurring
topic...) there is a solution in NewsML 1:
TopicOccurrence = An indication that a particular topic occurs within the
content of a NewsComponent. The optional HowPresent attribute indicates the
nature of that topic's occurrence. The value of the Topic attribute must consist
of a # character followed by the value of the Duid attribute of a Topic in the
current document.
It is a little complex to use (you need a local Topic structure), but I guess
you'll find some information in the "NewsML 1.2 guidelines", available on the
IPTC Web site.
By the way, it would be good to have an agreed-upon NewsML 1 solution for this,
inserted in the guidelines.
Laurent Le Meur
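The TopicOccurrence rule quoted above (the Topic attribute is a # followed by the Duid of a Topic in the same document) can be followed mechanically. The sketch below resolves each TopicOccurrence to its local Topic; the sample document, the Duid value, and the HowPresent value "Principal" are hypothetical, and the Topic structure is simplified compared with what the NewsML 1.2 guidelines describe.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document with one local Topic and one reference to it.
SAMPLE = """<NewsML>
  <TopicSet>
    <Topic Duid="topic.oly2008">
      <TopicType FormalName="Event"/>
      <FormalName>OLY2008</FormalName>
    </Topic>
  </TopicSet>
  <NewsItem>
    <NewsComponent>
      <DescriptiveMetadata>
        <TopicOccurrence Topic="#topic.oly2008" HowPresent="Principal"/>
      </DescriptiveMetadata>
    </NewsComponent>
  </NewsItem>
</NewsML>"""


def resolve_topic_occurrences(newsml_text: str) -> list:
    """Return (topic FormalName, HowPresent) for each resolvable TopicOccurrence."""
    root = ET.fromstring(newsml_text)
    # Index every local Topic by its Duid.
    topics_by_duid = {
        topic.get("Duid"): topic for topic in root.iter("Topic") if topic.get("Duid")
    }
    resolved = []
    for occurrence in root.iter("TopicOccurrence"):
        reference = occurrence.get("Topic", "")
        if not reference.startswith("#"):
            continue  # only local "#duid" references are handled in this sketch
        topic = topics_by_duid.get(reference[1:])
        if topic is not None:
            name = topic.findtext("FormalName", default="")
            resolved.append((name, occurrence.get("HowPresent")))
    return resolved


OCCURRENCES = resolve_topic_occurrences(SAMPLE)
```

A receiver could then treat all news items whose resolved topic name matches (here OLY2008) as belonging to the same thread.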
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> masood_a
> Sent: Friday, 2 May 2008 20:50
> To: newsml@yahoogroups.com
> Subject: [newsml] Re: Identifying story threads from NewsML
>
> Thanks for all the responses. These are very helpful.
>
> [Laurent Le Meur: Note NewsML-G2 defines a property named <instanceOf>
> specifically for this purpose ("thread" or "fixture" like "market
> opening").]
>
> [Misha: When using the NewsML-G2 Subject property to tie together News
> Items about an event one would place an Event identifier in the
> Subject property.]
>
> We are using NewsML 1.2 right now, so we may not be able to make use
> of the NewsML-G2 features. Though it appears that that may be a
> standard way of creating the association between different stories of
> a "thread". Also if all wires use this feature, it may make it easier
> for news processing systems.
>
> [Laurent Le Meur: Currently most wires use a slug term for this
> purpose (ex. OLY2008)]
>
> Could you please expand on that?
>
> [Jayson Lorenzen: /NewsML/NewsItem/NewsManagement/AssociatedWith:
> Create a document that identifies the thread and ties other documents
> together...]
>
> [Takahiro Fujiwara: /NewsML/NewsItem/NewsManagement/DerivedFrom: You
> can describe details in FormalName attribute]
>
> These are interesting suggestions.
>
> We would ideally like to identify an element that is set by the wires
> themselves or can be deduced on our end with a reasonable degree of
> reliability. Using the AssociatedWith element and creating a "meta"
> document would imply that this identifying document will be stored in
> the news storage system.
>
> Also it appears Reuters uses the
> /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
> element to identify/set the
> "thread" in NewsML.
>
> thanks,
> -Masood
>
>
Hi Jon,
NewsML isn't a classification system. It's a markup language.
Regards,
Misha
-----Original Message-----
From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On
Behalf Of Jon Garfunkel
Sent: 02 May 2008 22:00
To: newsml-g2@yahoogroups.com
Cc: newsml@yahoogroups.com
Subject: [newsml-g2] Google News talk
I'm at the NewsTools conference in Sunnyvale, at Yahoo, listening to Dan
Meredith, a product manager at Google News. He has soured on classification
systems because he calls them (1) unstable, (2) unreliable.
I was going to ask him whether they've considered NewsML.
I'm at http://www.newstools.org/ and http://twitter.com/newstools2008
Jon
Jon Garfunkel
Boston, Mass.
http://civilities.net/
This email was sent to you by Thomson Reuters, the global news and information
company.
Any views expressed in this message are those of the individual sender, except
where the sender specifically states them to be the views of Thomson Reuters.
Thanks for all the responses. These are very helpful.
[Laurent Le Meur: Note NewsML-G2 defines a property named <instanceOf>
specifically for this purpose ("thread" or "fixture" like "market
opening").]
[Misha: When using the NewsML-G2 Subject property to tie together News
Items about an event one would place an Event identifier in the
Subject property.]
We are using NewsML 1.2 right now, so we may not be able to make use
of the NewsML-G2 features. Though it appears that that may be a
standard way of creating the association between different stories of
a "thread". Also if all wires use this feature, it may make it easier
for news processing systems.
[Laurent Le Meur: Currently most wires use a slug term for this
purpose (ex. OLY2008)]
Could you please expand on that?
[Jayson Lorenzen: /NewsML/NewsItem/NewsManagement/AssociatedWith:
Create a document that identifies the thread and ties other documents
together...]
[Takahiro Fujiwara: /NewsML/NewsItem/NewsManagement/DerivedFrom: You
can describe details in FormalName attribute]
These are interesting suggestions.
We would ideally like to identify an element that is set by the wires
themselves or can be deduced on our end with a reasonable degree of
reliability. Using the AssociatedWith element and creating a "meta"
document would imply that this identifying document will be stored in
the news storage system.
Also it appears Reuters uses the
/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
element to identify/set the
"thread" in NewsML.
thanks,
-Masood
Hello J, masood,
How about /NewsML/NewsItem/NewsManagement/DerivedFrom?
You can describe details in the FormalName attribute.
BTW, this mailing list is for NewsML 1, so I wish we would not be so quick
to say that NewsML 1 does not support something.
If people are choosing between the NewsML 1 and NewsML-G2 standards, then I
can agree with recommending NewsML-G2.
However, if they are already using NewsML 1 and have a question about how to
implement a function, then recommending NewsML-G2 is not a solution for
them. Thanks in advance.
Regards,
========================================================================
Takahiro Fujiwara, EAST Co., Ltd. Japan
Tel: +81 90 7262 9883 Fax: +81 48 298 1723
[RENEWAL!Schedule] http://tinyurl.com/22sar2
[Share my Google Calendar] http://tinyurl.com/yo4upf
------------------------------------------------------------------------
ViceChair of NewsML1-WorkingParty
IPTC (International Press Telecommunication Council)
LiaisonOfficer of SteeringCommittee & XML Evangelist
XML Consortium Japan
Certification Creators of Crossmedia-Expert
JAGAT (Japan Association of Graphic Arts Technology)
========================================================================
Please consider your environmental responsibility before printing
this e-mail on paper.
iLiad@...
-----Original Message-----
From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf
Of Jayson Lorenzen
Sent: Friday, May 02, 2008 10:55 PM
To: masood_a@...; newsml@yahoogroups.com
Subject: Re: [newsml] Identifying story threads from NewsML
Hello masood, could you not use the
/NewsML/NewsItem/NewsManagement/AssociatedWith element? Create a
document that identifies the thread and ties other documents together,
with its own unique id of course. Then use the URN for this document in
the AssociatedWith element in subsequent documents in the thread.
j
Jayson Lorenzen
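A receiver-side sketch of the AssociatedWith approach suggested above: each story carries the URN of a hypothetical "thread" document in AssociatedWith/@NewsItem, and the receiver groups stories by that URN. The URNs, the helper make_item, and the simplified document layout are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def make_item(public_id: str, thread_urn: str) -> str:
    """Build a tiny, hypothetical NewsML 1 document pointing at a thread document."""
    return f"""<NewsML><NewsItem>
  <Identification><NewsIdentifier>
    <PublicIdentifier>{public_id}</PublicIdentifier>
  </NewsIdentifier></Identification>
  <NewsManagement><AssociatedWith NewsItem="{thread_urn}"/></NewsManagement>
</NewsItem></NewsML>"""


def group_by_thread(newsml_documents) -> dict:
    """Map each AssociatedWith URN to the PublicIdentifiers of the stories citing it."""
    threads = defaultdict(list)
    for text in newsml_documents:
        root = ET.fromstring(text)
        for item in root.iter("NewsItem"):
            public_id = item.findtext(
                "Identification/NewsIdentifier/PublicIdentifier", default=""
            )
            for association in item.findall("NewsManagement/AssociatedWith"):
                urn = association.get("NewsItem")
                if urn:
                    threads[urn].append(public_id)
    return dict(threads)


# Hypothetical identifiers, not taken from any real feed.
THREAD = "urn:newsml:example.com:20080501:plane-crash-thread:1"
STORY_A = "urn:newsml:example.com:20080501:story-a:1"
STORY_B = "urn:newsml:example.com:20080502:story-b:1"
GROUPS = group_by_thread([make_item(STORY_A, THREAD), make_item(STORY_B, THREAD)])
```

Note that this only works if the thread document itself is delivered or otherwise known to the receiver, which is the storage concern raised elsewhere in this thread.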
>>> masood_a@... 05/01/08 11:08 AM >>>
Hi-
We are trying to identify a field in NewsML that can be used to
identify a set of related stories consistently. This set of stories
can be about, let's say, Iraq, or an event that occurs on a single
day, like "Plane Crash". We would like to always identify the latest
news article about a particular "thread". A thread may last for a
short time, like a day, or longer, say months.
There are a few options available to us right now. I would like input
on the various options.
I am including a list of the fields (from the NewsML 1.2 spec) that
we are looking at below. From all the options below, it appears that
the field NameLabel may make more sense as this is in the category of
items that are "Informal Identifiers", and are not expected to be
using any markup.
It would be good to know the mechanism being used by major wires as
that can help us narrow down our options. I am including the
description of the elements from the NewsML spec for your convenience.
1. /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
[The SlugLine element provides a string of text, possibly embellished
by hyperlinks and/or formatting, used to display a NewsItem's slug
line. (Note that the meaning of the term "slug line", and the uses to
which it is put, are a matter for individual providers to define
within their own workflow and business practice.)]
2. /NewsML/NewsItem/Identification/NameLabel
[The NameLabel element contains a string used by human users as a name
to help identify a NewsItem. Its form is determined by the provider.
It might be identical to the textual content of the SlugLine element,
for example, but even if this is so, the system should not process the
NameLabel as a slugline. Nothing can be assumed about the nature of
the string within NameLabel beyond the fact that it can help to
identify the NewsItem to humans.]
In addition NameLabel is in the category of items that are identified
as "InformalIdentifiers". [In addition to the formal identification
mechanisms described above, NewsML provides a series of Label elements
that can be used by human users to identify NewsItems. As far as the
NewsML system is concerned, these are arbitrary strings, and cannot be
relied upon to provide a robust identification mechanism. Their sole
purpose is to provide a convenient way for humans to identify a
particular NewsItem in informal exchanges and communications, or as
part of a user interface.]
3. /NewsML/NewsItem/Identification/NewsIdentifier/NewsItemId
[The NewsItemId is an identifier for the NewsItem. The combination of
NewsItemId and DateId must be unique among NewsItems that emanate from
the same provider. Within these constraints, the NewsItemId can take
any form the provider wishes. It may take the form of a name for the
NewsItem that will be meaningful to humans, but this is not a
requirement
thanks,
-Masood
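To compare the three candidate fields listed above on a real feed, a small extraction script can help. The sketch below pulls SlugLine, NameLabel, and NewsItemId from the first NewsItem using the paths from the message; the sample document and its values are hypothetical, and real feeds may nest NewsComponents differently.

```python
import xml.etree.ElementTree as ET

# Paths relative to NewsItem, as listed in the message above.
CANDIDATE_PATHS = {
    "SlugLine": "NewsComponent/NewsComponent/NewsLines/SlugLine",
    "NameLabel": "Identification/NameLabel",
    "NewsItemId": "Identification/NewsIdentifier/NewsItemId",
}

# Hypothetical sample document for illustration.
SAMPLE = """<NewsML>
  <NewsItem>
    <Identification>
      <NewsIdentifier>
        <NewsItemId>plane-crash-0042</NewsItemId>
      </NewsIdentifier>
      <NameLabel>Plane Crash</NameLabel>
    </Identification>
    <NewsComponent>
      <NewsComponent>
        <NewsLines>
          <SlugLine>CRASH-PLANE</SlugLine>
        </NewsLines>
      </NewsComponent>
    </NewsComponent>
  </NewsItem>
</NewsML>"""


def candidate_thread_fields(newsml_text: str) -> dict:
    """Return the text of each candidate field, or None where it is absent."""
    root = ET.fromstring(newsml_text)
    item = root.find("NewsItem")
    if item is None:
        raise ValueError("no NewsItem found")
    fields = {}
    for name, path in CANDIDATE_PATHS.items():
        element = item.find(path)
        # itertext() also collects text inside any inline markup in a SlugLine.
        fields[name] = "".join(element.itertext()).strip() if element is not None else None
    return fields


FIELDS = candidate_thread_fields(SAMPLE)
```

Running this over a day of items from each provider shows quickly which of the three fields, if any, stays constant across the stories of one thread.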
When using the NewsML-G2 Subject property to tie together News Items
about an event one would place an Event identifier in the Subject
property.
Misha
-----Original Message-----
From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
Laurent LE MEUR
Sent: 02 May 2008 13:18
To: newsml@yahoogroups.com
Subject: RE: [newsml] Identifying story threads from NewsML
It is true that NewsML 1 did not tackle this topic.
* NewsItemId must not be used for this purpose, as it is part of the unique
identification of the news item.
* NameLabel would not be my favorite choice, as it aims at identifying a single
news item (and not a "thread").
* Currently most wires use a slug term for this purpose (ex. OLY2008).
Note NewsML-G2 defines a property named <instanceOf> specifically for this
purpose ("thread" or "fixture" like "market opening").
Note also that if the "thread" is an entity (an event, a location) that is the
"subject" of a set of news items, you can use the Newsml-G2 <subject> property
to identify this entity.
Best regards
Laurent Le Meur
AFP
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> masood_a
> Sent: Thursday, 1 May 2008 20:08
> To: newsml@yahoogroups.com
> Subject: [newsml] Identifying story threads from NewsML
>
> Hi-
>
> We are trying to identify a field in NewsML that can be used to
> identify a set of related stories consistently. This set of stories
> can be about, let's say, Iraq, or an event that occurs on a single
> day, like "Plane Crash". We would like to always identify the latest
> news article about a particular "thread". A thread may last for a
> short time, like a day, or longer, say months.
>
> There are a few options available to us right now. I would like input
> on the various options.
>
> I am including a list of the fields (from the NewsML 1.2 spec) that
> we are looking at below. From all the options below, it appears that
> the field NameLabel may make more sense as this is in the category of
> items that are "Informal Identifiers", and are not expected to be
> using any markup.
>
> It would be good to know the mechanism being used by major wires as
> that can help us narrow down our options. I am including the
> description of the elements from the NewsML spec for your convenience.
>
> 1. /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
>
> [The SlugLine element provides a string of text, possibly embellished
> by hyperlinks and/or formatting, used to display a NewsItem's slug
> line. (Note that the meaning of the term "slug line", and the uses to
> which it is put, are a matter for individual providers to define
> within their own workflow and business practice.)]
>
> 2. /NewsML/NewsItem/Identification/NameLabel
>
> [The NameLabel element contains a string used by human users as a name
> to help identify a NewsItem. Its form is determined by the provider.
> It might be identical to the textual content of the SlugLine element,
> for example, but even if this is so, the system should not process the
> NameLabel as a slugline. Nothing can be assumed about the nature of
> the string within NameLabel beyond the fact that it can help to
> identify the NewsItem to humans.]
>
> In addition NameLabel is in the category of items that are identified
> as "InformalIdentifiers". [In addition to the formal identification
> mechanisms described above, NewsML provides a series of Label elements
> that can be used by human users to identify NewsItems. As far as the
> NewsML system is concerned, these are arbitrary strings, and cannot be
> relied upon to provide a robust identification mechanism. Their sole
> purpose is to provide a convenient way for humans to identify a
> particular NewsItem in informal exchanges and communications, or as
> part of a user interface.]
>
> 3. /NewsML/NewsItem/Identification/NewsIdentifier/NewsItemId
>
> [The NewsItemId is an identifier for the NewsItem. The combination of
> NewsItemId and DateId must be unique among NewsItems that emanate from
> the same provider. Within these constraints, the NewsItemId can take
> any form the provider wishes. It may take the form of a name for the
> NewsItem that will be meaningful to humans, but this is not a
> requirement.]
>
> thanks,
> -Masood
This email was sent to you by Thomson Reuters, the global news and information
company.
Any views expressed in this message are those of the individual sender, except
where the sender specifically states them to be the views of Thomson Reuters.
It is true that NewsML 1 did not tackle this topic.
* NewsItemId must not be used for this purpose, as it is part of the unique
identification of the news item.
* NameLabel would not be my favorite choice, as it aims at identifying a single
news item (and not a "thread").
* Currently, most wires use a slug term for this purpose (e.g. OLY2008).
Note that NewsML-G2 defines a property named <instanceOf> specifically for this
purpose (a "thread", or a "fixture" like "market opening").
Note also that if the "thread" is an entity (an event, a location) that is the
"subject" of a set of news items, you can use the NewsML-G2 <subject> property
to identify this entity.
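As a rough illustration of the slug-based approach for NewsML 1.x, here is a minimal Python sketch that groups documents by their SlugLine text and keeps the most recently revised one per slug. It is not taken from any wire's actual practice. It assumes that the whole SlugLine text is the thread key, that the documents use no XML namespace, and that every ThisRevisionCreated value uses the same UTC form (such as 20080501T180000Z) so that plain string comparison orders them correctly. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET


def slug_and_revision(newsml_xml):
    """Return (slug text, revision timestamp) for the first NewsItem of a NewsML 1.x document."""
    item = ET.fromstring(newsml_xml).find("NewsItem")
    if item is None:
        return "", ""
    slug_el = item.find(".//NewsLines/SlugLine")
    # SlugLine may carry inline markup, so collect all nested text and normalise whitespace.
    slug = " ".join("".join(slug_el.itertext()).split()) if slug_el is not None else ""
    # Assumption: timestamps share one UTC format, so they sort correctly as strings.
    revised = item.findtext("NewsManagement/ThisRevisionCreated", default="").strip()
    return slug, revised


def latest_by_slug(documents):
    """Map each slug (used here as the thread key) to its most recently revised document."""
    latest = {}
    for doc in documents:
        slug, revised = slug_and_revision(doc)
        if not slug:
            continue  # no slug, so the item cannot be assigned to a thread
        if slug not in latest or revised > latest[slug][0]:
            latest[slug] = (revised, doc)
    return {slug: doc for slug, (_, doc) in latest.items()}
```

If a provider only puts the thread keyword in part of the slug line, the key extraction in slug_and_revision would need to be adapted to that provider's convention.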
Best regards
Laurent Le Meur
AFP
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> masood_a
> Sent: Thursday, May 1, 2008 20:08
> To: newsml@yahoogroups.com
> Subject: [newsml] Identifying story threads from NewsML
>
> Hi-
>
> We are trying to identify a field in NewsML that can be used to
> identify a set of related stories consistently. This set of stories
> can be about, let's say, Iraq, or an event that is occurring on a single
> day like "Plane Crash". We would like to always identify the latest
> news article about a particular "thread". A thread may last for a
> short time, like a day, or much longer, say months.
>
> There are a few options available to us right now. I would like input
> on the various options.
>
> I am including a list of the fields (from the NewsML 1.2 spec) that
> we are looking at below. From all the options below, it appears that
> the field NameLabel may make more sense as this is in the category of
> items that are "Informal Identifiers", and are not expected to be
> using any markup.
>
> It would be good to know the mechanism being used by major wires as
> that can help us narrow down our options. I am including the
> description of the elements from the NewsML spec for your convenience.
>
> 1. /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
>
> [The SlugLine element provides a string of text, possibly embellished
> by hyperlinks and/or formatting, used to display a NewsItem's slug
> line. (Note that the meaning of the term "slug line", and the uses to
> which it is put, are a matter for individual providers to define
> within their own workflow and business practice.)]
>
> 2. /NewsML/NewsItem/Identification/NameLabel
>
> [The NameLabel element contains a string used by human users as a name
> to help identify a NewsItem. Its form is determined by the provider.
> It might be identical to the textual content of the SlugLine element,
> for example, but even if this is so, the system should not process the
> NameLabel as a slugline. Nothing can be assumed about the nature of
> the string within NameLabel beyond the fact that it can help to
> identify the NewsItem to humans.]
>
> In addition NameLabel is in the category of items that are identified
> as "InformalIdentifiers". [In addition to the formal identification
> mechanisms described above, NewsML provides a series of Label elements
> that can be used by human users to identify NewsItems. As far as the
> NewsML system is concerned, these are arbitrary strings, and cannot be
> relied upon to provide a robust identification mechanism. Their sole
> purpose is to provide a convenient way for humans to identify a
> particular NewsItem in informal exchanges and communications, or as
> part of a user interface.]
>
> 3. /NewsML/NewsItem/Identification/NewsIdentifier/NewsItemId
>
> [The NewsItemId is an identifier for the NewsItem. The combination of
> NewsItemId and DateId must be unique among NewsItems that emanate from
> the same provider. Within these constraints, the NewsItemId can take
> any form the provider wishes. It may take the form of a name for the
> NewsItem that will be meaningful to humans, but this is not a
> requirement.]
>
> thanks,
> -Masood
>
>
> ------------------------------------
>
> Find more on NewsML at http://www.newsml.org
>
> Any member of this IPTC moderated Yahoo group must comply with the
> Intellectual Property Policy of the IPTC, available at
> http://www.iptc.org/goto/ipp. Any posting is assumed to be submitted
> under the conditions of this IPTC IP Policy.
> Yahoo! Groups Links
>
>
>
This e-mail, and any file transmitted with it, is confidential and intended
solely for the use of the individual or entity to whom it is addressed. If you
have received this email in error, please contact the sender and delete the
email from your system. If you are not the named addressee you should not
disseminate, distribute or copy this email.
For more information on Agence France-Presse, please visit our web site at
http://www.afp.com
Hi-
We are trying to identify a field in NewsML that can be used to
identify a set of related stories consistently. This set of stories
can be about, let's say, Iraq, or an event that is occurring on a single
day like "Plane Crash". We would like to always identify the latest
news article about a particular "thread". A thread may last for a
short time, like a day, or much longer, say months.
There are a few options available to us right now. I would like input
on the various options.
I am including a list of the fields (from the NewsML 1.2 spec) that
we are looking at below. From all the options below, it appears that
the field NameLabel may make more sense as this is in the category of
items that are "Informal Identifiers", and are not expected to be
using any markup.
It would be good to know the mechanism being used by major wires as
that can help us narrow down our options. I am including the
description of the elements from the NewsML spec for your convenience.
1. /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
[The SlugLine element provides a string of text, possibly embellished
by hyperlinks and/or formatting, used to display a NewsItem's slug
line. (Note that the meaning of the term "slug line", and the uses to
which it is put, are a matter for individual providers to define
within their own workflow and business practice.)]
2. /NewsML/NewsItem/Identification/NameLabel
[The NameLabel element contains a string used by human users as a name
to help identify a NewsItem. Its form is determined by the provider.
It might be identical to the textual content of the SlugLine element,
for example, but even if this is so, the system should not process the
NameLabel as a slugline. Nothing can be assumed about the nature of
the string within NameLabel beyond the fact that it can help to
identify the NewsItem to humans.]
In addition NameLabel is in the category of items that are identified
as "InformalIdentifiers". [In addition to the formal identification
mechanisms described above, NewsML provides a series of Label elements
that can be used by human users to identify NewsItems. As far as the
NewsML system is concerned, these are arbitrary strings, and cannot be
relied upon to provide a robust identification mechanism. Their sole
purpose is to provide a convenient way for humans to identify a
particular NewsItem in informal exchanges and communications, or as
part of a user interface.]
3. /NewsML/NewsItem/Identification/NewsIdentifier/NewsItemId
[The NewsItemId is an identifier for the NewsItem. The combination of
NewsItemId and DateId must be unique among NewsItems that emanate from
the same provider. Within these constraints, the NewsItemId can take
any form the provider wishes. It may take the form of a name for the
NewsItem that will be meaningful to humans, but this is not a requirement.]
thanks,
-Masood
VTD-XML 2.3 is now released. To download the latest version, please
visit http://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172.
Below is a list of new features and enhancements in this version.
* VTDException is now introduced as the root class for all other
VTD-XML exception classes (per suggestion of Max Rahder).
* Transcoding capability is now added for inter-document cut and
paste. You can cut a chunk of bytes in a UTF-8 encoded document and
paste it into a UTF-16 encoded document and the output document is
still well-formed.
* ISO-8859-10, ISO-8859-11, ISO-8859-12, ISO-8859-13, ISO-8859-14 and
ISO-8859-15 support has now been added.
* Zero-length text nodes are now possible.
* Ability to dump an in-memory copy of text is added.
* Various code cleanups, enhancements and bug fixes.
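To illustrate what transcoding on cut and paste involves, here is a minimal Python sketch of the general idea. It does not use or reproduce the VTD-XML API; the function name and the sample documents are invented for this example, and it assumes the target is little-endian UTF-16 with a byte order mark.

```python
def paste_transcoded(target: bytes, offset: int, fragment: bytes,
                     source_encoding: str = "utf-8",
                     target_encoding: str = "utf-16-le") -> bytes:
    """Insert `fragment` into `target` at byte `offset`, re-encoding it first."""
    # Decode with the source document's encoding, then re-encode for the target,
    # so the pasted bytes match the encoding of the surrounding document.
    transcoded = fragment.decode(source_encoding).encode(target_encoding)
    return target[:offset] + transcoded + target[offset:]


# A UTF-16 target (little-endian, with BOM) and a fragment cut from a UTF-8 document.
target = b"\xff\xfe" + "<a></a>".encode("utf-16-le")
fragment = "<b>caf\u00e9</b>".encode("utf-8")

# Paste the fragment just before the closing tag of <a>.
offset = target.find("</a>".encode("utf-16-le"))
result = paste_transcoded(target, offset, fragment)
print(result.decode("utf-16"))
```

Pasting the UTF-8 bytes unchanged would leave single-byte characters in the middle of a two-byte-per-unit document, which a UTF-16 parser would misread.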
Below are some new articles related to VTD-XML
Index XML documents with VTD-XML
http://xml.sys-con.com/read/453082.htm
Manipulate XML content the Ximple Way
http://www.devx.com/xml/Article/36379
VTD-XML: A new vision of XML
http://www.developer.com/xml/article.php/3714051
VTD-XML: XML Processing for the future
http://www.codeproject.com/KB/cs/vtd-xml_examples.aspx
If you (or someone you know) like the concept of VTD-XML, think that
it can help solve enterprises' XML processing related issues
(particularly those related to SOA), and would like to directly
influence and contribute to the development of the future of the
Internet, please email me (crackeur@...). We are looking for
open source software developers and project management people to take
VTD-XML to the next level.
>
> Would this 1.3 be an XML Schema only version? Will we not have the same
> issue with compatibility?
>
[llm:] Either it would be a schema-only version, or there would be
problems with compatibility. So yes, this may not be the best solution
after all.
Laurent
This is one of the reasons Takahiro and I have been against adding the
namespace, but there is a push within the IPTC to have all of our XSDs include a
namespace, so that is the only way to release one for NewsML 1.x.
And no, there has been no feedback from outside the WP or the initial test
group. As for a conclusion, that may or may not be presented at the Fall
meeting. We are waiting for results from some re-testing with the namespace:
we provided the test group with an XSL to update instances created using the
DTD so that they can be persisted in storage created using the XSD.
We hope to provide the results of this testing at the Fall meeting during the
NewsML 1.x MWP session.
Regarding 1.3, we had talked about it but could not see ourselves getting much
support for it, as the current attention is on releasing the G2 standards.
Would this 1.3 be an XML Schema only version? Will we not have the same issue
with compatibility?
j
>>> laurent.lemeur@... 09/24/07 7:32 AM >>>
Did this question reach a conclusion?
Could the following alternative solution be studied?
Because moving all elements of NewsML 1.x to a namespace practically makes
the implementation non-backward-compatible, the result of the move would be
called NewsML 1.3.
We know that - formally speaking, because it is not backward compatible - it
should be called 2.0, but the ambiguity with NewsML-G2 stops us from choosing
such a version number.
Laurent
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> Jayson Lorenzen
> Sent: Monday, June 4, 2007 20:47
> To: newsml@yahoogroups.com
> Subject: [newsml] NewsML 1.2 XML Schema and namespace
>
> This post is partially in response to messages 2443 and 2444, and partially
> a request for input from the community. We have been working on an XML
> Schema for NewsML 1.2, yet have an issue reported by early testers
> regarding the use of an XML namespace. There was a lengthy
> conversation regarding this issue at our meeting this past week in
> Tokyo, Japan, though it remains unresolved and any input or assistance
> is welcome.
>
> So first, in response to the two recent messages: yes, the namespace,
> should it be adopted, would be:
>
> http://iptc.org/std/NewsML/2003-10-10/
>
> However, there seems to be a compatibility issue when this is used, as
> instances created with the namespace are technically not the same as
> instances created without the namespace. The early testers of the XML
> Schema found that if a database schema is generated using the XML
> Schema with the namespace, instance documents created without the
> namespace cannot be inserted into this database schema without
> errors.
>
> Several options for the release of this XML schema have been
> discussed, including:
>
> - release the schema without a namespace
> - release the schema with a namespace
> - release the schema with a namespace and provide an XSL Template to
> add the namespace to instance documents
>
> We are interested in hearing of your experiences and in receiving any
> input you have regarding this issue.
>
> Thank you
>
> Jayson
Did this question reach a conclusion?
Could the following alternative solution be studied?
Because moving all elements of NewsML 1.x to a namespace practically makes
the implementation non-backward-compatible, the result of the move would be
called NewsML 1.3.
We know that - formally speaking, because it is not backward compatible - it
should be called 2.0, but the ambiguity with NewsML-G2 stops us from choosing
such a version number.
Laurent
> -----Original Message-----
> From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of
> Jayson Lorenzen
> Sent: Monday, June 4, 2007 20:47
> To: newsml@yahoogroups.com
> Subject: [newsml] NewsML 1.2 XML Schema and namespace
>
> This post is partially in response to messages 2443 and 2444, and partially
> a request for input from the community. We have been working on an XML
> Schema for NewsML 1.2, yet have an issue reported by early testers
> regarding the use of an XML namespace. There was a lengthy
> conversation regarding this issue at our meeting this past week in
> Tokyo, Japan, though it remains unresolved and any input or assistance
> is welcome.
>
> So first, in response to the two recent messages: yes, the namespace,
> should it be adopted, would be:
>
> http://iptc.org/std/NewsML/2003-10-10/
>
> However, there seems to be a compatibility issue when this is used, as
> instances created with the namespace are technically not the same as
> instances created without the namespace. The early testers of the XML
> Schema found that if a database schema is generated using the XML
> Schema with the namespace, instance documents created without the
> namespace cannot be inserted into this database schema without
> errors.
>
> Several options for the release of this XML schema have been
> discussed, including:
>
> - release the schema without a namespace
> - release the schema with a namespace
> - release the schema with a namespace and provide an XSL Template to
> add the namespace to instance documents
>
> We are interested in hearing of your experiences and in receiving any
> input you have regarding this issue.
>
> Thank you
>
> Jayson
Dear NewsML-G2 interest group,
The Experimental Phase 1 package for NewsML-G2 is out and can be obtained
from:
http://www.iptc.org/std-dev/NewsML-G2/1.0/DRAFT-NewsML-G2_1.0_EP1.zip
More information is available at: http://www.iptc.org/dev/#newsml
The Experimental Phase closes on 31 August 2007.
Prior to this release, the structures of IPTC News Architecture (NAR) 1.0
were approved; more information and the specs can be obtained from
http://www.iptc.org/NAR
Public discussion about this work should be held on the following Yahoo!
Group:
http://tech.groups.yahoo.com/group/newsml-g2/
Best regards
Laurent Le Meur
IPTC News Architecture WP chair
IPTC NewsML-G2 WG chair
Head of AFP Medialab
We are evaluating the use of either NITF or XHTML for story text markup
within our platforms, both for internal consumption/archiving and for
distributing news data to consumers. What has been the experience of
members here with either markup? For which use cases is one
better than the other?
I would much appreciate your thoughts on this.
Regards
Calvin Epstein
This post is partially in response to messages 2443 and 2444, and partially
a request for input from the community. We have been working on an XML
Schema for NewsML 1.2, yet have an issue reported by early testers
regarding the use of an XML namespace. There was a lengthy
conversation regarding this issue at our meeting this past week in
Tokyo, Japan, though it remains unresolved and any input or assistance
is welcome.
So first, in response to the two recent messages: yes, the namespace,
should it be adopted, would be:
http://iptc.org/std/NewsML/2003-10-10/
However, there seems to be a compatibility issue when this is used, as
instances created with the namespace are technically not the same as
instances created without the namespace. The early testers of the XML
Schema found that if a database schema is generated using the XML
Schema with the namespace, instance documents created without the
namespace cannot be inserted into this database schema without
errors.
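The mismatch can be seen with any namespace-aware parser. The following is a small Python sketch using the standard library's ElementTree; the two sample documents are invented for illustration and are far smaller than real NewsML instances.

```python
import xml.etree.ElementTree as ET

NEWSML_NS = "http://iptc.org/std/NewsML/2003-10-10/"

# The same minimal instance, once without and once with the proposed namespace.
plain = ET.fromstring("<NewsML><NewsItem/></NewsML>")
namespaced = ET.fromstring(f'<NewsML xmlns="{NEWSML_NS}"><NewsItem/></NewsML>')

# A namespace-aware parser treats the element names as different names.
print(plain.tag)
print(namespaced.tag)

# A lookup written for the non-namespaced form does not match the namespaced one.
found_plain = plain.find("NewsItem")
found_unqualified = namespaced.find("NewsItem")
found_qualified = namespaced.find(f"{{{NEWSML_NS}}}NewsItem")
print(found_plain is not None, found_unqualified is not None, found_qualified is not None)
```

Any tool that derives its structures from the namespaced schema will look for the qualified names, which is presumably why the non-namespaced instances were rejected in the testers' database.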
Several options for the release of this XML schema have been
discussed, including:
- release the schema without a namespace
- release the schema with a namespace
- release the schema with a namespace and provide an XSL Template to
add the namespace to instance documents
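For the third option, the conversion itself is mechanical. Here is a minimal sketch of the same idea in Python rather than XSL, using only the standard library; it is an illustration, not the template discussed by the working party. The function name is made up, and this simple version discards any DOCTYPE declaration, comments and processing instructions in the input.

```python
import xml.etree.ElementTree as ET

NEWSML_NS = "http://iptc.org/std/NewsML/2003-10-10/"


def add_namespace(xml_text: str, namespace: str = NEWSML_NS) -> str:
    """Return xml_text with every non-namespaced element moved into `namespace`."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Leave elements that already carry a namespace untouched.
        if isinstance(element.tag, str) and not element.tag.startswith("{"):
            element.tag = f"{{{namespace}}}{element.tag}"
    # Serialise the namespace as the default one instead of a generated prefix.
    # Note that this registration is global to the ElementTree module.
    ET.register_namespace("", namespace)
    return ET.tostring(root, encoding="unicode")


converted = add_namespace('<NewsML><NewsItem Duid="item1"><Identification/></NewsItem></NewsML>')
print(converted)
```

Attributes are left unqualified, which matches the usual XML Schema default for attribute forms; whether that holds for the NewsML 1.2 schema would need to be checked against the schema itself.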
We are interested in hearing of your experiences and in receiving any
input you have regarding this issue.
Thank you
Jayson