jena-dev · Jena Developers
GRDDL - Server returned HTTP response code: 503

Hey All,

I have a problem parsing websites with GRDDL.

I use this code:

Model m = ModelFactory.createDefaultModel();
RDFReader r = m.getReader("GRDDL");
r.setProperty("grddl.rdfa", true);
r.read(m, "some website...");

And get the exception pasted at the end.

Most of the time, when I download the HTML file locally and delete the first
line (the DOCTYPE declaration), it works just fine. But how can I overcome this
when reading directly from the website?

Thanks,
Stijn
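
[Editor's note] The manual workaround described above (drop the DOCTYPE so the parser never asks w3.org for the DTD) can be automated. What follows is only a minimal sketch using the plain JDK: the class name DoctypeStripper and the method stripDoctype are hypothetical, and the regular expression assumes a DOCTYPE without an internal subset (no "[ ... ]" part). It does not show how to hand the result to the GRDDL reader. Note also that once the DTD is gone, named entities such as &nbsp; are no longer declared, which may be why the workaround only helps most of the time.

```java
import java.util.regex.Pattern;

public class DoctypeStripper {
    // Matches a DOCTYPE declaration that has no internal subset, for example
    // <!DOCTYPE html PUBLIC "..." "...">, ignoring case.
    private static final Pattern DOCTYPE =
        Pattern.compile("<!DOCTYPE[^>\\[]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the markup with the first DOCTYPE declaration removed. */
    public static String stripDoctype(String markup) {
        return DOCTYPE.matcher(markup).replaceFirst("");
    }

    public static void main(String[] args) {
        String page =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>t</title>"
            + "</head><body/></html>";
        // Prints the page without its DOCTYPE declaration.
        System.out.println(stripDoctype(page));
    }
}
```

The stripped text could then be saved to a local file and read as before; whether the reader accepts other input forms is not something this sketch assumes.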

----

Exception output:

null
ERROR [main] (RDFDefaultErrorHandler.java:40) - java.io.IOException: Server
returned HTTP response code: 503 for URL:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
ERROR [main] (RDFDefaultErrorHandler.java:40) - java.io.IOException: Server
returned HTTP response code: 503 for URL:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
Exception in thread "main" net.sf.saxon.trans.DynamicError: java.io.IOException:
Server returned HTTP response code: 503 for URL:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:313)
at net.sf.saxon.event.Sender.send(Sender.java:142)
at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:29)
at com.hp.hpl.jena.grddl.impl.GRDDL.initialParse(GRDDL.java:234)
at com.hp.hpl.jena.grddl.impl.GRDDL.go(GRDDL.java:199)
at com.hp.hpl.jena.grddl.GRDDLReader.read(GRDDLReader.java:47)
at sites.ParseSites.getModel(ParseSites.java:32)
at sites.ParseSites.queryWebSite(ParseSites.java:39)
at sites.ParseSites.main(ParseSites.java:20)
Caused by: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:300)
... 8 more
---------
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:300)
at net.sf.saxon.event.Sender.send(Sender.java:142)
at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:29)
at com.hp.hpl.jena.grddl.impl.GRDDL.initialParse(GRDDL.java:234)
at com.hp.hpl.jena.grddl.impl.GRDDL.go(GRDDL.java:199)
at com.hp.hpl.jena.grddl.GRDDLReader.read(GRDDLReader.java:47)
at sites.ParseSites.getModel(ParseSites.java:32)
at sites.ParseSites.queryWebSite(ParseSites.java:39)
at sites.ParseSites.main(ParseSites.java:20)
com.hp.hpl.jena.shared.JenaException: rethrew: net.sf.saxon.trans.DynamicError:
java.io.IOException: Server returned HTTP response code: 503 for URL:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at com.hp.hpl.jena.grddl.impl.GRDDL.initialParse(GRDDL.java:254)
at com.hp.hpl.jena.grddl.impl.GRDDL.go(GRDDL.java:199)
at com.hp.hpl.jena.grddl.GRDDLReader.read(GRDDLReader.java:47)
at sites.ParseSites.getModel(ParseSites.java:32)
at sites.ParseSites.queryWebSite(ParseSites.java:39)
at sites.ParseSites.main(ParseSites.java:20)
Caused by: net.sf.saxon.trans.DynamicError: java.io.IOException: Server returned
HTTP response code: 503 for URL:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:313)
at net.sf.saxon.event.Sender.send(Sender.java:142)
at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:29)
at com.hp.hpl.jena.grddl.impl.GRDDL.initialParse(GRDDL.java:234)
... 5 more
Caused by: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:300)
... 8 more
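
[Editor's note] The trace shows the XML parser itself (XMLEntityManager.startDTDEntity) requesting the DTD named in the DOCTYPE over HTTP, and that request is what comes back as 503. A general SAX technique for avoiding such requests is to install an EntityResolver that answers them locally. The sketch below uses only the JDK's own SAX parser; the class OfflineDtdDemo and the method countElements are hypothetical names, and whether the Jena GRDDL reader lets you plug in a resolver like this is not something this thread establishes.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class OfflineDtdDemo {
    /** Counts start tags; stands in for whatever handler a real pipeline uses. */
    static class ElementCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            count++;
        }
    }

    /** Parses the markup without any network access and returns the element count. */
    public static int countElements(String markup) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // Answer every external entity request (including the DTD named in
            // the DOCTYPE) with empty content instead of opening the URL.
            EntityResolver emptyResolver =
                (publicId, systemId) -> new InputSource(new StringReader(""));
            reader.setEntityResolver(emptyResolver);

            ElementCounter counter = new ElementCounter();
            reader.setContentHandler(counter);
            reader.parse(new InputSource(new StringReader(markup)));
            return counter.count;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String page =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>t</title>"
            + "</head><body/></html>";
        System.out.println(countElements(page));
    }
}
```

Returning empty content is the bluntest choice. Pages that rely on named entities from the XHTML DTD may instead need the resolver to return a locally stored copy of the DTD.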





Wed Dec 9, 2009 12:51 pm

stijn35811

Hi Stijn, I got the same error too and was unable to fix it, nor did I get a response from anybody about it. Can you clarify a point? Did you mean...
sashi kiran
cr_sashikiran
Dec 9, 2009
1:52 pm

Hey Sashikiran, I download the web page itself and modify it. If there is a DOCTYPE tag, it is always the first one. And if you delete that tag, it usually...
stijn35811
Dec 11, 2009
7:41 pm

Thank you, Stijn, for your reply. ... -- Sashikiran Challa, School of Informatics and Computing, Indiana University, Bloomington, IN, USA. "You are what you think you...
sashi kiran
cr_sashikiran
Dec 11, 2009
8:47 pm

Hi, sorry for the delay in responding. I have recently rejoined the Jena team and I am just getting up to speed again. The relevant Javadoc is: ...
Jeremy Carroll
jeremy.carro...
Dec 13, 2009
4:35 am

... I do not have an answer to your question, sorry. But, if it is RDFa you are interested in, you might have a look at: http://github.com/shellac/java-rdfa ...
Paolo Castagna
castagna.lists@...
Dec 9, 2009
2:47 pm