A Small Error When Parsing XML with JDom
While using JDom to parse XML, I ran into the following error:
java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:891)
at cn.edu.ruc.web.wrappers.JDomWrapper.getRootElement(JDomWrapper.java:36)
at cn.edu.ruc.web.wrappers.JDomWrapper.getText(JDomWrapper.java:75)
at cn.edu.ruc.web.WebsiteXMLBuild.main(WebsiteXMLBuild.java:131)
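At first I thought the file name was the problem: that it made JDom mistake the file for an online resource, try to download the XML file from the network, fail, and throw this connection exception. After some experimenting, though, I found that the exception was thrown no matter how I renamed the file.
So I checked the content of the XML file and found, at the very top, the DTD declaration most commonly seen in HTML files, which points to a URL: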
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
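Then it dawned on me: JDom was following this link to fetch the loose.dtd file, and that is what caused the error. This matches the stack trace, where XMLEntityManager.startDTDEntity leads straight into HttpURLConnection.getInputStream.
Solution: remove the declaration above from the XML file, and JDom can then parse the file normally.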
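If editing the XML file is not an option, another approach is to stop the parser from opening the DTD URL at all. The sketch below is only an illustration and not the fix used above: it uses the JDK's built-in DOM parser rather than JDom so that it runs without extra libraries, and the class name SkipExternalDtd, the helper rootName, and the sample document (including the loose.dtd system identifier) are assumptions made for the example. The idea is an org.xml.sax.EntityResolver that answers every external entity request with an empty stream. JDom's SAXBuilder accepts the same interface through setEntityResolver, so the resolver should carry over, although that has not been checked against the setup in this post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class SkipExternalDtd {
    // Answers every external entity request (including the DTD named in
    // the DOCTYPE) with an empty stream, so no HTTP connection is opened.
    static final EntityResolver EMPTY_RESOLVER =
        (publicId, systemId) -> new InputSource(new StringReader(""));

    // Hypothetical helper: parses the given XML text and returns the
    // name of its root element.
    static String rootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // we only need well-formedness
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(EMPTY_RESOLVER);
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): an HTML 4.01 Transitional
        // DOCTYPE followed by well-formed markup.
        String xml =
            "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" "
            + "\"http://www.w3.org/TR/html4/loose.dtd\">\n"
            + "<HTML><BODY>hello</BODY></HTML>";
        System.out.println(rootName(xml)); // prints HTML
    }
}
```

With JDom, the equivalent would be to call setEntityResolver on the SAXBuilder before build. Note that an empty DTD also means any entities the DTD would have defined (for example &nbsp;) are no longer declared, so a document that uses them will still fail to parse.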