Do multibyte characters interfere with the end-of-line character in a regex?
Problem description
With this regex:
    regex1 = /\z/
the following strings match:
    "hello" =~ regex1 # => 5
    "こんにちは" =~ regex1 # => 5
but with these regexes:
    regex2 = /#$/?\z/
    regex3 = /\n?\z/
they behave differently:
    "hello" =~ regex2 # => 5
    "hello" =~ regex3 # => 5
    "こんにちは" =~ regex2 # => nil
    "こんにちは" =~ regex3 # => nil
What is interfering? The string encoding is UTF-8, and the OS is Linux (i.e., $/ is "\n"). Are the multibyte characters interfering with $/? If so, how?
Solution
In Ruby trunk, the issue has now been accepted as a bug. Hopefully, it will be fixed.
Update: Two patches have been posted in Ruby trunk.
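To check whether a particular Ruby build shows the behavior described in the question, a small script along the following lines may help. It is only a sketch: the helper name match_positions and the local variables patterns and samples are hypothetical, and regex2 is built with Regexp.new and Regexp.escape instead of the literal /#$/?\z/ form, on the assumption that $/ still holds its default value of "\n".

```ruby
# Hypothetical helper: returns [name, index] pairs, where index is the
# character position at which the pattern matches, or nil if it does not.
def match_positions(string, patterns)
  patterns.map { |name, pattern| [name, string =~ pattern] }
end

# The three patterns from the question. $/ is the input record separator
# (assumed here to be the default "\n"); it is escaped and interpolated
# explicitly rather than through the #$/ shorthand.
patterns = {
  "regex1" => /\z/,
  "regex2" => Regexp.new("#{Regexp.escape($/)}?\\z"),
  "regex3" => /\n?\z/
}

samples = ["hello", "こんにちは"]

puts "Ruby #{RUBY_VERSION}"
samples.each do |sample|
  # Character count and byte count differ for the multibyte string.
  puts "#{sample.inspect} (#{sample.encoding}, " \
       "#{sample.length} chars, #{sample.bytesize} bytes)"
  match_positions(sample, patterns).each do |name, position|
    puts "  #{name}: #{position.inspect}"
  end
end
```

Going by the question, an affected build would report nil for regex2 and regex3 on the Japanese string while still reporting 5 for regex1. On a build that includes a fix, all three patterns would be expected to report the same index for both strings.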