Problem description
I am making an add-on for Firefox that loads an HTML page using Ajax (the add-on has its own XUL panel).
So far I have not looked for a way to create a document object, place the Ajax response contents into it, and then use XPath to find what I need.
Instead, I am loading the contents and parsing them as text with a regular expression.
But I have a question: which would be better to use, XPath or a regular expression? Which is faster?
The HTML page would consist of hundreds of elements that contain the same text, and what I basically want to do is count how many such elements there are.
I want my add-on to work as fast as possible, and I do not know the mechanics behind regexp or XPath, so I don't know which is more effective.
I hope that was clear. Thanks.
Whenever you are dealing with XML, use XPath (or XSLT, XQuery, SAX, DOM or any other XML-aware method to go through your data). Never use regular expressions for this task.
Why? XML processing is intricate, and dealing with all its oddities (external/parsed/unparsed entities, DTDs, processing instructions, whitespace handling and collapsing, Unicode normalization, CDATA sections, etc.) makes it very hard to build a reliable regex-based way of getting at your data. The fact that it has taken the industry years to learn how best to parse XML should be reason enough not to try to do it yourself.
To answer your question: when it comes to speed (which should not be your primary concern here), it depends heavily on the implementation of the XPath or regex compiler/processor. Sometimes XPath will be faster (e.g., when using keys, if possible, or compiled XSLT); other times regexes will be faster (if you can use a precompiled regex and your query is simple). But regexes are never easy with HTML/XML, simply because of the problem of matching nested tags (the matching-parentheses problem), which cannot be reliably solved with regexes alone.
If the input is huge, regex will tend to be faster, unless the XPath implementation can do streaming processing (which I believe is not how Firefox does it).
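To make the reliability point concrete, here is a minimal sketch of the regex approach for the counting task in the question. The `li` tag, the text "sold out", and the names `ITEM_PATTERN` and `countWithRegex` are assumptions made up for illustration; the question does not say what the real markup looks like.

```javascript
// Hypothetical pattern: an <li> whose direct text contains "sold out".
// Defined once so it is not rebuilt on every call.
var ITEM_PATTERN = /<li\b[^>]*>[^<]*sold out[^<]*<\/li>/gi;

// Count pattern matches in the raw HTML text.
function countWithRegex(html) {
  var matches = html.match(ITEM_PATTERN);
  return matches ? matches.length : 0;
}

// Simple markup: the regex agrees with what a parser would report.
var clean = '<ul><li>sold out</li><li class="x">sold out today</li><li>in stock</li></ul>';
console.log(countWithRegex(clean)); // 2

// A commented-out item is still counted, although the document has only one such element.
var commented = '<ul><li>sold out</li><!-- <li>sold out</li> --></ul>';
console.log(countWithRegex(commented)); // 2

// Nested markup inside the item defeats the pattern, although the element's text is "sold out".
var nested = '<ul><li><b>sold</b> out</li></ul>';
console.log(countWithRegex(nested)); // 0
```

The pattern can be patched for each of these cases, but every patch makes it more complex and still leaves entities, CDATA sections and deeper nesting uncovered.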
You wrote: "I don't know which is more effective."
The one that brings you quickest to a reliable and stable implementation that's comparatively speedy. Use XPath. It's what's used inside Firefox and other browsers, so it is also available if you need your code to run in a browser.
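As a rough sketch of what the XPath route could look like for the counting task, the function below asks the XPath engine for a count directly instead of iterating over nodes. It assumes `doc` is an already parsed Document (for example from `new DOMParser().parseFromString(html, "text/html")` in a current browser, or from the request's `responseXML`). The names `buildCountExpression` and `countElements`, and the `tagName` and `text` parameters, are hypothetical, and `text` is assumed not to contain a double quote.

```javascript
// Build an XPath 1.0 expression that counts <tagName> elements whose
// whitespace-normalized text contains `text`.
// Assumption: `text` contains no double-quote character.
function buildCountExpression(tagName, text) {
  return 'count(//' + tagName + '[contains(normalize-space(.), "' + text + '")])';
}

// Evaluate the count against a parsed document.
// Assumption: `doc` implements the DOM `evaluate` method.
function countElements(doc, tagName, text) {
  var NUMBER_TYPE = 1; // same value as XPathResult.NUMBER_TYPE
  var result = doc.evaluate(buildCountExpression(tagName, text), doc, null, NUMBER_TYPE, null);
  return result.numberValue;
}
```

In a browser this might be called as `countElements(doc, "li", "sold out")`. Because the document has been parsed first, comments, entities and nested inline markup are handled by the parser rather than by the query.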