Question
I'm trying to put together a regular expression for a JavaScript command that accurately counts the number of words in a textarea.
One solution I found is as follows:
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\b\w+\b/).length -1;
But this doesn't count any non-Latin characters (e.g. Cyrillic, Hangul); it skips over them completely.
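For illustration, here is a minimal sketch of why that happens. In JavaScript, `\w` matches only `[A-Za-z0-9_]`, so text made up entirely of Cyrillic or Hangul contains nothing for the pattern to split on. The helper name `countWithWordBoundary` is made up for this example; it just wraps the expression above so it can run without a DOM.

```javascript
// Hypothetical helper wrapping the first attempt from the question.
function countWithWordBoundary(text) {
  // \w is ASCII-only: [A-Za-z0-9_]
  return text.split(/\b\w+\b/).length - 1;
}

console.log(countWithWordBoundary("hello world"));     // 2
console.log(countWithWordBoundary("привет мир"));      // 0, no \w characters to split on
console.log(countWithWordBoundary("안녕하세요 세계"));    // 0
console.log(countWithWordBoundary("hello мир"));       // 1, only the Latin word is seen
```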
Another one I put together:
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\s+/g).length -1;
But this doesn't count accurately unless the document ends in a space character. If a space character is appended to the value being counted, it counts 1 word even for an empty document. Furthermore, if the document begins with a space character, an extraneous word is counted.
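The cases described above can be reproduced with a small sketch. The cause appears to be that `split` keeps the empty strings on either side of leading and trailing whitespace, so the `- 1` correction is only right when exactly one such empty string exists. `countWithSplit` is a hypothetical name used only for this example.

```javascript
// Hypothetical helper wrapping the second attempt from the question.
function countWithSplit(text) {
  return text.split(/\s+/g).length - 1;
}

console.log("one two ".split(/\s+/g));       // [ 'one', 'two', '' ]
console.log(countWithSplit("one two"));      // 1, undercounts without a trailing space
console.log(countWithSplit("one two "));     // 2, correct only because of the trailing space
console.log(countWithSplit(" one two "));    // 3, the leading space adds an extra entry
console.log(countWithSplit(" "));            // 1, an "empty" document with a space appended
console.log(countWithSplit(""));             // 0
```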
Is there a regular expression I can put into this command that counts the words accurately, regardless of input method?
Recommended answer
This should do what you're after:
(value.match(/\S+/g) || []).length; // match() returns null when nothing matches, e.g. for an empty textarea
Rather than splitting the string, you're matching on any sequence of non-whitespace characters.
There's the added bonus of being able to easily extract each word if needed ;)
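As a rough sketch of how the matching approach might be wrapped up, the function names `getWords` and `countWords` below are invented for this example. One thing to watch: `String.prototype.match` with the `g` flag returns `null` rather than an empty array when there are no matches, so the sketch falls back to `[]` to keep empty or whitespace-only input from throwing.

```javascript
// Hypothetical helpers; the names are not from the original answer.
function getWords(text) {
  // Every run of non-whitespace characters counts as one word.
  // match() returns null when nothing matches, so fall back to [].
  return text.match(/\S+/g) || [];
}

function countWords(text) {
  return getWords(text).length;
}

console.log(countWords(""));                        // 0
console.log(countWords("   "));                     // 0
console.log(countWords("  one two  "));             // 2
console.log(countWords("привет мир 안녕하세요"));      // 3
console.log(getWords(" one  two\nthree "));         // [ 'one', 'two', 'three' ]

// In the page from the question, this could be wired up roughly as:
// document.querySelector("#wordcount").textContent =
//   countWords(document.querySelector("#editor").value);
```

Because a "word" here is just any whitespace-delimited token, a lone punctuation mark such as the dash in "a - b" is also counted, which may or may not be what you want.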