Regex to match all the text content of an HTML input

Question

I have articles on my website that I would like to have corrected and translated automatically. To do that, I need the content on its own, without the surrounding HTML tags.

The idea is to have a regex that retrieves all the content between the tags (and, if possible, also the content found in tag attributes, such as <img alt='Little house'>). The problem is that I don't really know how to write such a regex. Any ideas?

Recommended answer

I would recommend using an HTML parser rather than relying on a regex. Parsing HTML with a regex is generally a no-no and is nearly impossible to get right for all cases, since HTML allows arbitrary nesting, comments, and attribute values that themselves contain characters like ">". There are many questions here on SO that arrive at the same conclusion.
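To make the parser suggestion concrete, here is a minimal sketch in Python using the standard library's html.parser module. The answer does not name a language or a parser, so this is only one possible illustration: the class name TextExtractor, the helper extract_text, and the choice to collect the alt and title attributes and to skip script and style content are all assumptions made for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes plus selected attribute values."""

    # Attributes whose values read as human text (an assumption for this sketch).
    TEXT_ATTRS = {"alt", "title"}
    # Elements whose content is code or styling rather than article text.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name in self.TEXT_ATTRS and value and value.strip():
                self.chunks.append(value.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text fragments found in an HTML string, in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    sample = "<p>Hello <b>world</b></p><img alt='Little house'>"
    for fragment in extract_text(sample):
        print(fragment)
```

Each fragment can then be passed to the correction or translation step separately. Writing the translated text back into the original markup would need more bookkeeping than this sketch does.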

EDIT: Looks like a couple of us had the same idea... Also, here is a question that discusses more parsers.

05-27 07:28