Regular expression for an accurate word count using JavaScript

Question

I'm trying to put together a regular expression for a JavaScript command that accurately counts the number of words in a textarea.

One solution I found is as follows:

document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\b\w+\b/).length -1;

But this doesn't count words written in non-Latin scripts (e.g. Cyrillic, Hangul); it skips over them completely.

Another one I put together:

document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\s+/g).length -1;

But this doesn't count accurately unless the document ends in a space character. If a space character is appended to the value being counted, it counts 1 word even for an empty document. Furthermore, if the document begins with a space character, an extraneous word is counted.
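
The behaviour of both attempts can be reproduced outside the browser with a small sketch. The function names `countByWordChars` and `countBySplit` are hypothetical wrappers around the two expressions above, introduced only for illustration:

```javascript
// Hypothetical wrapper around the first attempt: split on \b\w+\b.
// \w only matches ASCII letters, digits and underscore unless extra flags are used.
function countByWordChars(text) {
  return text.split(/\b\w+\b/).length - 1;
}

// Hypothetical wrapper around the second attempt: split on whitespace runs.
function countBySplit(text) {
  return text.split(/\s+/g).length - 1;
}

console.log(countByWordChars("hello world"));  // 2
console.log(countByWordChars("привет мир"));   // 0 (Cyrillic words are skipped)

console.log(countBySplit("hello world"));      // 1 (undercounts without a trailing space)
console.log(countBySplit("hello world "));     // 2
console.log(countBySplit(" "));                // 1 (a lone space counts as a word)
console.log(countBySplit(" hello world "));    // 3 (leading space adds an extra word)
```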

Is there a regular expression I can put into this command that counts the words accurately, regardless of input method?

Recommended answer

This should do what you're after:

(value.match(/\S+/g) || []).length; // match() returns null when there are no words

Rather than splitting the string, you're matching on any sequence of non-whitespace characters.
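
As a minimal sketch of the matching approach, here is a standalone function that can be tried without a page. The name `countWords` is hypothetical. One thing to keep in mind is that `String.prototype.match` returns `null` when nothing matches, so the sketch guards against that for empty or whitespace-only input:

```javascript
// Hypothetical helper: counts runs of non-whitespace characters.
function countWords(text) {
  // match() with the g flag returns an array of all matches,
  // or null when there are none, so handle that case explicitly.
  const matches = text.match(/\S+/g);
  return matches ? matches.length : 0;
}

console.log(countWords(""));                    // 0
console.log(countWords("   "));                 // 0
console.log(countWords("  hello world "));      // 2
console.log(countWords("привет 안녕하세요 hi"));  // 3
```

In the page from the question this could be wired up as `document.querySelector("#wordcount").innerHTML = countWords(document.querySelector("#editor").value);`. Note that this counts whitespace-separated tokens, so text in scripts that are written without spaces between words would be counted as a single word per run.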

There's the added bonus of being easily able to extract each word if needed ;)
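
To illustrate that bonus, the same match result can be kept as an array of words rather than only reading its length. This is a sketch; `extractWords` is a hypothetical name:

```javascript
// Hypothetical helper: returns every whitespace-separated word as an array.
function extractWords(text) {
  // Fall back to an empty array because match() returns null on no match.
  return text.match(/\S+/g) || [];
}

const words = extractWords(" The quick  brown fox ");
console.log(words);        // [ 'The', 'quick', 'brown', 'fox' ]
console.log(words.length); // 4
```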
