How to match a URL with a regular expression in XPath

Question

I need help with XPath. I have the following XML:

   <unaryExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
      <postfixExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
        <leftHandSideExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
          <newExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
            <memberExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
              <primaryExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
                <literal tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
                  <stringLiteral tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
                    <LITERAL tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8"/>
                  </stringLiteral>
                </literal>
              </primaryExpression>
            </memberExpression>
          </newExpression>
        </leftHandSideExpression>
      </postfixExpression>
    </unaryExpression>

I need to find the URL. Currently I do it like this:

//LITERAL[contains(@tokenValue, 'http://')]

How can I use a regular expression such as the following to find the URL instead?

(http://|https://|ftp://)([a-z0-9]{1})((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(\/?)

Recommended answer

If your XPath engine supports XPath 2.0, use fn:matches, which is the regular-expression counterpart of fn:contains. XPath 1.0 has no support for regular expressions.

//LITERAL[fn:matches(@tokenValue, '(http://|https://|ftp://)([a-z0-9]{1})((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(/?)')]

This returns all <LITERAL/> elements whose @tokenValue attribute matches your regular expression.
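
For engines limited to XPath 1.0, one possible workaround is to select the candidate elements with a plain path expression and apply the regular expression in the host language. The sketch below assumes Python with only the standard library (xml.etree.ElementTree and re). The sample document is a trimmed, made-up variant of the XML from the question, and the helper name find_url_literals is hypothetical, not part of any XPath API.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed sample data modelled on the question's XML (the second
# stringLiteral is an invented non-URL entry for contrast).
XML = """
<unaryExpression tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
  <stringLiteral tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8">
    <LITERAL tokenValue="'http://google.com'" tokenLine="1" tokenColumn="8"/>
  </stringLiteral>
  <stringLiteral tokenValue="'hello'" tokenLine="2" tokenColumn="8">
    <LITERAL tokenValue="'hello'" tokenLine="2" tokenColumn="8"/>
  </stringLiteral>
</unaryExpression>
"""

# The pattern from the answer, written as a Python raw string.
URL_RE = re.compile(
    r"(http://|https://|ftp://)([a-z0-9]{1})"
    r"((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})(/?)"
)


def find_url_literals(xml_text):
    """Return LITERAL elements whose tokenValue contains a URL-like substring."""
    root = ET.fromstring(xml_text)
    # ElementTree implements only a small XPath subset, so the regex
    # test runs in Python rather than inside the path expression.
    candidates = root.findall(".//LITERAL")
    return [el for el in candidates if URL_RE.search(el.get("tokenValue", ""))]


matches = find_url_literals(XML)
print([el.get("tokenValue") for el in matches])
```

Here re.search plays the role of fn:matches in that it accepts a match anywhere in the attribute value, so the surrounding quote characters in tokenValue do not need to be part of the pattern.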

There is a problem in your expression: you do not have to (and are not allowed to) escape the / in the last match group. I fixed that in my query. Why are you using the last two match groups anyway?
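
One possible reading of the remark about the trailing groups: an unanchored match succeeds as soon as any substring matches, so an optional group at the very end of the pattern cannot change the yes/no result. The sketch below illustrates this for the final (/?) group using Python's re.search, which is likewise unanchored. That this carries over to a given XPath 2.0 engine is an assumption, and the sample strings and the helper name same_outcome are made up for illustration.

```python
import re

# The answer's pattern without its final optional group.
BASE = (
    r"(http://|https://|ftp://)([a-z0-9]{1})"
    r"((\.[a-z0-9-])|([a-z0-9-]))*\.([a-z]{2,4})"
)
WITH_SLASH = re.compile(BASE + r"(/?)")
WITHOUT_SLASH = re.compile(BASE)

# Invented sample attribute values, quoted like tokenValue in the question.
SAMPLES = [
    "'http://google.com'",
    "'http://google.com/'",
    "'ftp://files.example.org/pub'",
    "'http://localhost'",
    "'hello world'",
]


def same_outcome(samples):
    """True if both patterns agree on match / no match for every sample."""
    return all(
        bool(WITH_SLASH.search(s)) == bool(WITHOUT_SLASH.search(s))
        for s in samples
    )


print(same_outcome(SAMPLES))
```

The group still matters if the matched text itself is extracted, since it decides whether a trailing slash is included in the match.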
