Question
I have seen several posts on this, but nothing has worked so far. I am parsing XML fetched from a URL using Nokogiri on Rails 3 with Ruby 1.9.2.
The XML snippet looks like this:
<NewsLineText>
<![CDATA[
Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly creme brulee.
]]>
</NewsLineText>
I am trying to parse this to get the text inside NewsLineText:
r = node.at_xpath('.//newslinetext') if node.at_xpath('.//newslinetext')
s = node.at_xpath('.//newslinetext').text if node.at_xpath('.//newslinetext')
t = node.at_xpath('.//newslinetext').content if node.at_xpath('.//newslinetext')
puts r
puts s.blank? ? 'NOTHING' : s
puts t.blank? ? 'NOTHING' : t
What I get back is:
<newslinetext></newslinetext>
NOTHING
NOTHING
So I know my tag is named and spelled correctly enough to reach the newslinetext element, but the CDATA text never shows up.
What do I need to do with Nokogiri to get this text?
Recommended answer
You're trying to parse XML using Nokogiri's HTML parser. If node were from the XML parser, then r would be nil, since XML is case sensitive and the element is named NewsLineText, not newslinetext. Your r is not nil, so you're using the HTML parser, which is case insensitive (and, as your output shows, does not give you the CDATA content).
Use Nokogiri's XML parser and you will get results like this:
>> r = doc.at_xpath('.//NewsLineText')
=> #<Nokogiri::XML::Element:0x8066ad34 name="NewsLineText" children=[#<Nokogiri::XML::Text:0x8066aac8 "\n ">, #<Nokogiri::XML::CDATA:0x8066a9c4 "\n Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly creme brulee.\n ">, #<Nokogiri::XML::Text:0x8066a8d4 "\n">]>
>> r.text
=> "\n \n Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly creme brulee.\n \n"
You will then be able to get at the CDATA through r.text or r.children.