Problem description
I'm trying to pull all sentences from a text that consist of, say, at least 5 words in PHP. Assuming sentences end with full stop, question or exclamation mark, I came up with this:
/[\w]{5,*}[\.|\?|\!]/
Any ideas what's wrong with it?
Also, what needs to be done for this to work with UTF-8?
Recommended answer
\w only matches a single character. A single word would be \w+. If you need at least 5 words, you could do something like:
/(\w+\s){4,}\w+[.?!]/
i.e. at least 4 words, each followed by a single whitespace character, then another word followed by a sentence delimiter.
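As a rough illustration, the pattern above could be wrapped in a small PHP helper. This is only a sketch: the function name sentencesWithAtLeastWords and the $minWords parameter are made up for this example, and the u modifier is added on the assumption that the input is valid UTF-8. Whether \w then covers non-ASCII letters may depend on the PCRE build, so [\p{L}\p{N}_] is a more explicit alternative.

```php
<?php
// Hypothetical helper (not from the original answer): returns every
// substring that looks like a sentence with at least $minWords words.
function sentencesWithAtLeastWords(string $text, int $minWords = 5): array
{
    // ($minWords - 1) or more "word + one whitespace character" groups,
    // then a final word and a sentence delimiter.
    $repeat  = max(0, $minWords - 1);
    // The "u" modifier treats pattern and subject as UTF-8 (assumption:
    // the input is valid UTF-8; otherwise preg_match_all() returns false).
    $pattern = '/(?:\w+\s){' . $repeat . ',}\w+[.?!]/u';

    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }
    return $matches[0];
}

$text = 'Too short. This one has five words. Is it long enough now?';
print_r(sentencesWithAtLeastWords($text));
```

With this sample text, the two longer sentences are returned and "Too short." is skipped. Note that the pattern has no notion of where a sentence starts, and words containing punctuation (commas, apostrophes, hyphens) break the run, so a match can begin partway through a sentence.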