Problem description
I have an XML file, retrieved from a URL, that contains some Arabic characters, so I had to encode it in UTF-8 so that it can handle such characters.
XML file:
<Entry>
  <lstItems>
    <item>
      <id>1</id>
      <title>News Test 1</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/news1.jpg</img>
    </item>
    <item>
      <id>2</id>
      <title>كريم</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/news2.jpg</img>
    </item>
    <item>
      <id>3</id>
      <title>News Test 333</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/news3.jpg</img>
    </item>
    <item>
      <id>4</id>
      <title>ربيع</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/cont20.jpg</img>
    </item>
    <item>
      <id>5</id>
      <title>News Test 55555</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/cont21.jpg</img>
    </item>
    <item>
      <id>6</id>
      <title>News Test 666666</title>
      <subtitle>16/7/2012</subtitle>
      <img>joelle.mobi-mind.com/imgs/cont22.jpg</img>
    </item>
  </lstItems>
</Entry>
I retrieve the XML from the URL as a String, as shown below:
public String getXmlFromUrl(String url) {
    try {
        return new AsyncTask<String, Void, String>() {
            @Override
            protected String doInBackground(String... params) {
                //String xml = null;
                try {
                    DefaultHttpClient httpClient = new DefaultHttpClient();
                    HttpGet httpPost = new HttpGet(params[0]);
                    HttpResponse httpResponse = httpClient.execute(httpPost);
                    HttpEntity httpEntity = httpResponse.getEntity();
                    xml = new String(EntityUtils.toString(httpEntity).getBytes(),"UTF-8");
                } catch (Exception e) {
                    e.printStackTrace();
                }
                return xml;
            }
        }.execute(url).get();
    } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (ExecutionException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    return xml;
}
The returned String is then passed to this method to get a Document for later use, as shown below:
public Document getDomElement(String xml){
    Document doc = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        StringReader xmlstring=new StringReader(xml);
        is.setCharacterStream(xmlstring);
        is.setEncoding("UTF-8");
        //Code Stops here !
        doc = db.parse(is);
    } catch (ParserConfigurationException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (SAXException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (IOException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    }
    // return DOM
    return doc;
}
An error occurred with this message:
09-18 07:51:40.441: E/Error:(1210): Unexpected token (position:TEXT @1:4 in java.io.StringReader@4144c240)
So the code crashes at the line marked above, with the following error:
09-18 07:51:40.451: E/AndroidRuntime(1210): java.lang.RuntimeException: Unable to start activity ComponentInfo{com.example.university1/com.example.university1.MainActivity}: java.lang.NullPointerException
Kindly note that the code works fine with ISO encoding.
Recommended answer
You've added a BOM (byte order mark) to your UTF-8 file, which is bad.
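Maybe you edited the file with Notepad. In any case, you should check your editor to make sure it doesn't add a BOM.
As the BOM seems to be inside the text rather than at the start, you also need to remove it by using the delete key around its position (it is invisible in most editors). This may have happened during a file concatenation operation.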
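If editing the file by hand is not an option, the BOM can also be removed in code before parsing. The following is a minimal sketch, not the asker's code: the class name BomSafeXml and its methods stripBom and parse are hypothetical helpers, and it assumes the text has already been decoded as UTF-8, so that the BOM shows up as the single character U+FEFF. It uses only the standard javax.xml.parsers API that the question already relies on.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class BomSafeXml {
    // In a correctly decoded UTF-8 string, the BOM is the character U+FEFF.
    private static final char BOM = '\uFEFF';

    /** Removes every U+FEFF character, whether at the start or inside the text. */
    static String stripBom(String text) {
        return text.replace(String.valueOf(BOM), "");
    }

    /** Strips any BOM characters from the XML text, then parses it into a DOM Document. */
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The input is already a character stream, so no byte encoding needs to be declared.
        return builder.parse(new InputSource(new StringReader(stripBom(xml))));
    }

    public static void main(String[] args) throws Exception {
        // Sample input: a leading BOM followed by a small document with an Arabic title.
        String xml = "\uFEFF<Entry><title>\u0643\u0631\u064A\u0645</title></Entry>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Stripping every occurrence rather than only the first character is a deliberate choice here, since the BOM may sit inside the text after a concatenation. If the response was decoded with the wrong charset, the BOM will not appear as U+FEFF and this sketch will not catch it. In that case the decoding step itself needs to be fixed first.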