Problem description
How can I get the number of characters of a string in Go?
For example, if I have a string "hello", the method should return 5. I saw that len(str) returns the number of bytes, not the number of characters, so len("£") returns 2 instead of 1, because £ is encoded with two bytes in UTF-8.
You can try RuneCountInString from the utf8 package. It returns the number of runes in the string, as illustrated in this script: the byte length of "World" might be 6 (when written in Chinese: "世界"), but its rune count is 2:
package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    // len counts bytes; utf8.RuneCountInString counts runes.
    fmt.Println("Hello, 世界", len("世界"), utf8.RuneCountInString("世界"))
}
Phrozen adds in the comments:
Actually, you can do len() over runes by just type casting. len([]rune("世界")) will print 2. At least in Go 1.3.
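To make the comparison concrete, here is a minimal sketch that puts the byte length, the []rune conversion, and utf8.RuneCountInString side by side. The helper names runeCountByConversion and runeCountByScan are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCountByConversion counts runes by converting the string to a []rune.
// Hypothetical helper name, used only for this illustration.
func runeCountByConversion(s string) int {
	return len([]rune(s))
}

// runeCountByScan counts runes with utf8.RuneCountInString.
// Hypothetical helper name, used only for this illustration.
func runeCountByScan(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "£", "世界"} {
		fmt.Printf("%q: bytes=%d, conversion=%d, scan=%d\n",
			s, len(s), runeCountByConversion(s), runeCountByScan(s))
	}
}
```

Both helpers should agree on every input; they differ only in how the count is obtained, since the conversion builds a []rune value while RuneCountInString only scans the bytes.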
Stefan Steiger points to the blog post "Text normalization in Go":
What is a character?
Using the norm package (golang.org/x/text/unicode/norm) and its Iter type, the actual number of "characters" would be:
package main

import (
    "fmt"

    "golang.org/x/text/unicode/norm"
)

func main() {
    // Iterate over the NFKD-normalized string, counting one step per "character".
    var ia norm.Iter
    ia.InitString(norm.NFKD, "école")
    nc := 0
    for !ia.Done() {
        nc++
        ia.Next()
    }
    fmt.Printf("Number of chars: %d\n", nc)
}
This uses the Unicode normalization form NFKD ("Compatibility Decomposition").