Problem description
I need client-side validation to treat the following kind of Chinese input as invalid:
Input made up of English letters, Chinese characters, and spaces is invalid when its total length is 10 or more.
For example, "你的a你的a你的a你" and "你的 你的 你的 你" (length 10) are invalid, but "你的a你的a你的a" (length 9) is OK.
I use JavaScript for client-side validation and Java for server-side validation, so a regular expression that works in both would be ideal.
Can anyone give some hints on how to write this rule as a regular expression?
Recommended answer
According to "What's the complete range for Chinese characters in Unicode?", the CJK Unicode ranges are:
Block                                    Range        Comment
---------------------------------------  -----------  ----------------------------------------------------
CJK Unified Ideographs                   4E00-9FFF    Common
CJK Unified Ideographs Extension A       3400-4DBF    Rare
CJK Unified Ideographs Extension B       20000-2A6DF  Rare, historic
CJK Unified Ideographs Extension C       2A700-2B73F  Rare, historic
CJK Unified Ideographs Extension D       2B740-2B81F  Uncommon, some in current use
CJK Unified Ideographs Extension E       2B820-2CEAF  Rare, historic
CJK Compatibility Ideographs             F900-FAFF    Duplicates, unifiable variants, corporate characters
CJK Compatibility Ideographs Supplement  2F800-2FA1F  Unifiable variants
CJK Symbols and Punctuation              3000-303F
You probably want to allow code points from the Unicode blocks CJK Unified Ideographs and CJK Unified Ideographs Extension A.
This regex matches a string of 0 to 9 characters, where each character is a space, an ideographic space (U+3000), an ASCII letter (A-Z or a-z), or a code point in those two CJK blocks. Any input of 10 or more characters therefore fails to match.
/^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/
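As a quick sanity check, the pattern can be run against the three sample strings from the question. This is only a minimal sketch; the names `pattern`, `samples`, and `results` are illustrative and not part of the original answer.

```javascript
// Pattern from the answer: 0 to 9 characters, each a space, an ideographic
// space, an ASCII letter, or a code point in the two CJK blocks.
const pattern = /^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/;

// Sample inputs taken from the question.
const samples = [
  "你的a你的a你的a",   // 9 characters
  "你的a你的a你的a你", // 10 characters
  "你的 你的 你的 你", // 10 characters, including spaces
];

// true means the input passes validation.
const results = samples.map((s) => pattern.test(s));
console.log(results); // [ true, false, false ]
```

Note that the pattern also rejects any input containing a character outside the listed class (digits or punctuation, for instance), whatever its length. Whether that is desirable depends on the actual requirement.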
However, you can add more blocks to the character class if you need to accept them.
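One possible way to add a block outside the Basic Multilingual Plane in JavaScript is sketched here, using CJK Unified Ideographs Extension B as the example. It assumes an ES2015+ engine, because the `u` flag is needed for `\u{...}` escapes and for counting each such character once. The names `extendedPattern` and `isShortCjkInput` are hypothetical and do not come from the original answer.

```javascript
// Hypothetical variant of the answer's pattern that also accepts
// CJK Unified Ideographs Extension B (U+20000 to U+2A6DF).
// The `u` flag makes the regex operate on code points, so each
// supplementary-plane character counts once toward the {0,9} limit.
const extendedPattern =
  /^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF\u{20000}-\u{2A6DF}]{0,9}$/u;

// Hypothetical helper: true when the text passes the extended rule.
function isShortCjkInput(text) {
  return extendedPattern.test(text);
}

const extB = "\u{20000}"; // first code point of Extension B

console.log(extB.length);                        // 2 (UTF-16 code units)
console.log(isShortCjkInput(extB.repeat(9)));    // true
console.log(isShortCjkInput(extB.repeat(10)));   // false
console.log(isShortCjkInput("你的a你的a你的a")); // true
```

Without the `u` flag, a character from Extension B is seen as two UTF-16 code units, so both the range and the length limit would behave differently.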
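// Returns true when the text is at most 9 characters long and contains only
// spaces, ASCII letters, and code points from the two CJK blocks.
// (Despite the name, a 10-character input is already rejected.)
function has10OrLessCJK(text) {
  return /^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/.test(text);
}

// Shows the validation result in the element with id "valid".
function checkValidation(value) {
  var valid = document.getElementById("valid");
  if (has10OrLessCJK(value)) {
    valid.innerText = "Valid";
  } else {
    valid.innerText = "Invalid";
  }
}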
<input type="text"
style="width:100%"
oninput="checkValidation(this.value)"
value="你的a你的a你的a">
<div id="valid">
Valid
</div>