This article covers how to use a regular expression to match sentences that contain at least n words.

Problem description

In PHP, I'm trying to pull every sentence of, say, at least 5 words out of a text. Assuming sentences end with a full stop, question mark, or exclamation mark, I came up with this:

 /[\w]{5,*}[\.|\?|\!]/

Any ideas what's wrong with it?

Also, what needs to be done for this to work with UTF-8?

Recommended answer

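\w only matches a single character, so [\w]{5,} would ask for 5 characters, not 5 words. A single word would be \w+. Note also that {5,*} is not a valid quantifier (the open-ended form is {5,}), and that | is a literal character inside a character class, so [.?!] is all the delimiter class needs. If you need at least 5 words, you could do something like: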

/(\w+\s){4,}\w+[.?!]/

i.e. at least 4 words, each followed by a whitespace character, then another word followed by a sentence delimiter.
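
The pattern can be tried out with preg_match_all. The following is only a minimal sketch, not part of the original answer: the helper name extractLongSentences, its $minWords parameter, and the sample strings are made up for illustration. To address the UTF-8 part of the question, it assumes a "word" is a run of Unicode letters, digits, or underscores and uses PCRE's u modifier, so the source file and the input text are assumed to be UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the original answer): returns every match of
// "at least $minWords words followed by . ? or !" found in $text.
function extractLongSentences(string $text, int $minWords = 5): array
{
    if ($minWords < 1) {
        throw new InvalidArgumentException('minWords must be at least 1');
    }

    // Assumption: a "word" is a run of Unicode letters, digits or underscores.
    $word = '[\p{L}\p{N}_]+';

    // ($minWords - 1) or more "word + whitespace" groups, one final word,
    // then a sentence delimiter. The u modifier makes PCRE treat both the
    // pattern and the subject as UTF-8.
    $pattern = '/(?:' . $word . '\s+){' . ($minWords - 1) . ',}'
             . $word . '[.?!]/u';

    // preg_match_all returns false on error, e.g. for invalid UTF-8 input.
    if (preg_match_all($pattern, $text, $matches) === false) {
        throw new RuntimeException('preg_match_all failed, code ' . preg_last_error());
    }

    return $matches[0]; // the full matches
}

// Made-up sample text: only the second and third sentences have 5+ words.
$text = 'Too short. This one has five words. Is this one long enough to match? No!';
print_r(extractLongSentences($text));
// → "This one has five words." and "Is this one long enough to match?"

// Made-up non-ASCII sample: the first sentence has six words.
print_r(extractLongSentences('Ça coûte très cher à Zürich. Très bien!'));
// → "Ça coûte très cher à Zürich."
```

One limitation to keep in mind: the pattern is not anchored to the start of a sentence, and commas, apostrophes, or other punctuation between words interrupt the run of words. For a sentence such as "Well, this is a long sentence here." the match would therefore be only the tail "this is a long sentence here."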

05-24 02:03