Need a good regex to convert URLs to links but leave existing links alone

Problem description

I have a load of user-submitted content. It is HTML, and may contain URLs. Some of them will be <a>'s already (if the user is good) but sometimes users are lazy and just type www.something.com or at best http://www.something.com.

I can't find a decent regex to capture URLs but ignore ones that are immediately to the right of either a double quote or '>'. Anyone got one?

Solution

Jan Goyvaerts, creator of RegexBuddy, has written a response to Jeff Atwood's blog that addresses the issues Jeff had and provides a nice solution.

\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]

In order to ignore matches that occur right next to a " or >, you could add (?<![">]) to the start of the regex, so you get

(?<![">])\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]

This will match full addresses (http://...) and addresses that start with www. or ftp. - you're out of luck with addresses like ars.userfriendly.org...

08-21 06:58