Question
I'm parsing some HTML with NSXMLParser, and it hits a parser error any time it encounters an ampersand. I could filter out the ampersands before parsing, but I'd rather parse everything that's there.
It's giving me error 68, NSXMLParserNAMERequiredError: Name is required.
My best guess is that it's a character set issue. I'm a little fuzzy on the world of character sets, so I'm thinking my ignorance is biting me in the ass.
The source HTML uses charset iso-8859-1, so I'm using this code to initialize the parser:
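// Decode the raw bytes as ISO-8859-1 (Latin-1), then re-encode them as UTF-8 for the parser.
NSString *dataString = [[[NSString alloc] initWithData:data encoding:NSISOLatin1StringEncoding] autorelease];
// dataUsingEncoding:allowLossyConversion: already returns an autoreleased object, so it must not be autoreleased again.
NSData *dataEncoded = [dataString dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
NSXMLParser *theParser = [[NSXMLParser alloc] initWithData:dataEncoded];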
Any ideas?
Recommended answer
To the other posters: of course the XML is invalid... it's HTML!
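To make the "invalid XML" point concrete: in XML an `&` has to start a reference such as `&amp;` or `&#38;`, so a bare `&` followed by a space leaves the parser waiting for a name, which fits error 68 and has nothing to do with the character set. If switching parsers is not an option, the pre-filtering idea from the question can at least escape bare ampersands instead of deleting them. The following is only a minimal sketch in plain C (it compiles as-is inside an Objective-C file); `reference_length` and `escape_bare_ampersands` are hypothetical helper names, not part of any Apple API.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: s points at '&'. Returns the length of a complete
 * reference ("&name;", "&#38;", "&#x26;") or 0 if the '&' is bare. */
static size_t reference_length(const char *s)
{
    size_t i = 1;                           /* skip the leading '&' */
    if (s[i] == '#') {
        i++;
        int hex = (s[i] == 'x');            /* XML only allows a lowercase x */
        if (hex) i++;
        size_t first_digit = i;
        while (hex ? isxdigit((unsigned char)s[i])
                   : isdigit((unsigned char)s[i])) i++;
        if (i == first_digit) return 0;     /* "&#" with no digits */
    } else {
        if (!isalpha((unsigned char)s[i])) return 0;
        while (isalnum((unsigned char)s[i])) i++;
    }
    return s[i] == ';' ? i + 1 : 0;         /* must be terminated by ';' */
}

/* Hypothetical helper: returns a malloc'd copy of `in` in which every bare
 * '&' is replaced by "&amp;". The caller frees the result. */
char *escape_bare_ampersands(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(len * 5 + 1);        /* worst case: every byte is '&' */
    if (out == NULL) return NULL;

    char *w = out;
    for (const char *r = in; *r != '\0'; r++) {
        if (*r == '&' && reference_length(r) == 0) {
            memcpy(w, "&amp;", 5);
            w += 5;
        } else {
            *w++ = *r;                      /* existing references pass through */
        }
    }
    *w = '\0';
    return out;
}
```

Note the limits of this workaround: named HTML entities such as `&nbsp;` pass through untouched, and a strict XML parser may still reject them, along with unclosed tags and everything else that makes HTML not XML.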
You probably shouldn't be using NSXMLParser for HTML at all. Use libxml2 instead: it includes an HTML parser that tolerates markup that isn't well-formed XML.
For a closer look at why, check out this article.