Problem description
I am running into a problem where an element comes back from my XML file with a single quote in it. This causes xml_parse to break the text into multiple chunks. For example, "Get Wired, You're Hired!" is interpreted as 'Get Wired, You' being one object, the single quote being a second, and 're Hired!' a third.
What I am trying to do is:
while($data = fread($fp, 4096)){
if(!xml_parse($xml_parser, htmlentities($data,ENT_QUOTES), feof($fp))) {
break;
}
}
But that keeps breaking. If I run a str_replace in place of htmlentities, it runs without issue, but it fails with htmlentities.
Any ideas?
Update: As per JimmyJ's response below, I have attempted the following solution with no luck (FYI, a response or two above the linked post updates the code that is linked directly):
function XMLEntities($string)
{
$string = preg_replace('/[^\x09\x0A\x0D\x20-\x7F]/e', '_privateXMLEntities("$0")', $string);
return $string;
}
function _privateXMLEntities($num)
{
$chars = array(
39 => '&#39;',
128 => '&#8364;',
130 => '&#8218;',
131 => '&#402;',
132 => '&#8222;',
133 => '&#8230;',
134 => '&#8224;',
135 => '&#8225;',
136 => '&#710;',
137 => '&#8240;',
138 => '&#352;',
139 => '&#8249;',
140 => '&#338;',
142 => '&#381;',
145 => '&#8216;',
146 => '&#8217;',
147 => '&#8220;',
148 => '&#8221;',
149 => '&#8226;',
150 => '&#8211;',
151 => '&#8212;',
152 => '&#732;',
153 => '&#8482;',
154 => '&#353;',
155 => '&#8250;',
156 => '&#339;',
158 => '&#382;',
159 => '&#376;');
$num = ord($num);
return (($num > 127 && $num < 160) ? $chars[$num] : "&#".$num.";" );
}
if(!xml_parse($xml_parser, XMLEntities($data), feof($fp))) {
break;
}
Update: As per tom's question below, magic quotes is indeed turned off.
Solution: What I ended up doing to solve the problem is the following:
After collecting the data for each individual item/post/etc., I store that data in an array that I use later for output, then clear the local variables used during collection. I added a step that checks whether data is already present, and if it is, I concatenate the new data to the end rather than overwriting it.
So, if I end up with three chunks (as above, let's stick with 'Get Wired, You're Hired!'), I will then go from doing
$x = 'Get Wired, You';
$x = "'";
$x = 're Hired!';
to doing
$x = 'Get Wired, You' . "'" . 're Hired!';
This isn't the optimal solution, but it appears to be working.
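The concatenation step described above can be sketched with xml_set_character_data_handler, which may deliver the text of a single element across several calls. This is a minimal sketch under stated assumptions, not the asker's actual code: the sample document, the <title> element name, and the variables $titles and $current are hypothetical.

```php
<?php
// Hypothetical sample input; the <title> element name is an assumption.
$xml = "<items><item><title>Get Wired, You're Hired!</title></item></items>";

$titles  = [];   // collected text, one entry per <title> element
$current = null; // buffer for the <title> being read; null when outside one

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$current) {
        if ($name === 'title') {
            $current = ''; // start a fresh buffer
        }
    },
    function ($p, $name) use (&$current, &$titles) {
        if ($name === 'title') {
            $titles[] = $current; // element finished: store the full text
            $current = null;
        }
    }
);

xml_set_character_data_handler($parser, function ($p, $data) use (&$current) {
    if ($current !== null) {
        $current .= $data; // append each chunk rather than overwriting
    }
});

// Feed the document in small pieces, as the fread() loop would.
$chunks = str_split($xml, 16);
$last   = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    if (!xml_parse($parser, $chunk, $i === $last)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
}
```

Because the buffer is only flushed in the end-element handler, it does not matter how many pieces the parser splits the text into.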
Recommended answer
Why don't you use something like simplexml_load_file to parse your file easily?
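As a rough illustration of that suggestion, the sketch below uses simplexml_load_string on an inline sample so it is self-contained; simplexml_load_file takes a path and returns the same kind of object. The document structure (<items>, <item>, <title>) and the variable names are assumptions, since the original file's layout is not shown.

```php
<?php
// Hypothetical sample input; the real file's structure is not shown in the question.
$sampleXml = "<items><item><title>Get Wired, You're Hired!</title></item></items>";

// For a file on disk, simplexml_load_file($path) could be used instead.
$doc = simplexml_load_string($sampleXml);
if ($doc === false) {
    throw new RuntimeException('Could not parse XML');
}

$simpleTitles = [];
foreach ($doc->item as $item) {
    // Casting to string yields the element's complete text in one piece.
    $simpleTitles[] = (string) $item->title;
}
```

Here the whole document is parsed at once, so no chunk handling is needed; the trade-off is that the entire file is held in memory.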