Problem description
Recently I found a very helpful method in the StringUtils library:
StringUtils.stripAccents(String s)
I found it really helpful for removing special characters and converting them to an ASCII "equivalent", for instance ç becomes c.
Now I am working for a German customer who needs exactly this, but only for non-German characters. Any umlauts should stay untouched. I realised that stripAccents won't be useful in that case.
Does anyone have experience with this? Are there any useful tools, libraries, classes or maybe regular expressions? I tried to write a class that parses and replaces such characters, but building such a map for all languages can be very difficult...
Any suggestions appreciated...
Recommended answer
It is best to build a custom function, for example like the following. The two constant strings are matched by position: each character in UNICODE is replaced by the character at the same index in PLAIN_ASCII. If you want a character to stay unconverted, remove it and its replacement from the two strings.
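// Accented characters and their plain ASCII replacements, matched by position.
// To leave a character untouched (for example the German umlauts), remove it
// from UNICODE and remove the letter at the same position from PLAIN_ASCII.
private static final String UNICODE =
    "ÀàÈèÌìÒòÙùÁáÉéÍíÓóÚúÝýÂâÊêÎîÔôÛûŶŷÃãÕõÑñÄäËëÏïÖöÜüŸÿÅåÇçŐőŰű";
private static final String PLAIN_ASCII =
    "AaEeIiOoUuAaEeIiOoUuYyAaEeIiOoUuYyAaOoNnAaEeIiOoUuYyAaCcOoUu";

public static String toAsciiString(String str) {
    if (str == null) {
        return null;
    }
    StringBuilder sb = new StringBuilder(str.length());
    for (int index = 0; index < str.length(); index++) {
        char c = str.charAt(index);
        int pos = UNICODE.indexOf(c);
        // Replace mapped characters; copy everything else unchanged.
        if (pos > -1) {
            sb.append(PLAIN_ASCII.charAt(pos));
        } else {
            sb.append(c);
        }
    }
    return sb.toString();
}

public static void main(String[] args) {
    // Prints "Hochstalemannisch" with the tables above.
    System.out.println(toAsciiString("Höchstalemannisch"));
}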
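If maintaining the two lookup strings for every language becomes impractical, a possible alternative is to let the JDK decompose the characters and only keep an explicit list of German letters. The following is a minimal sketch of that idea, not part of the original answer: the class name GermanSafeAccentStripper, the method name stripAccentsExceptGerman and the KEEP constant are hypothetical, and it assumes that "German characters" means Ä, ä, Ö, ö, Ü, ü and ß. It uses java.text.Normalizer from the standard library.

```java
import java.text.Normalizer;

final class GermanSafeAccentStripper {

    // Assumption: the characters to protect are A/O/U with diaeresis
    // (upper and lower case) and sharp s, written as Unicode escapes.
    private static final String KEEP =
        "\u00C4\u00E4\u00D6\u00F6\u00DC\u00FC\u00DF";

    static String stripAccentsExceptGerman(String str) {
        if (str == null) {
            return null;
        }
        // Compose first so that "o" + combining diaeresis becomes one
        // precomposed code point and can be matched against KEEP.
        String composed = Normalizer.normalize(str, Normalizer.Form.NFC);
        StringBuilder sb = new StringBuilder(composed.length());
        for (int i = 0; i < composed.length(); ) {
            int cp = composed.codePointAt(i);
            i += Character.charCount(cp);
            if (KEEP.indexOf(cp) >= 0) {
                sb.appendCodePoint(cp); // protected German letter
                continue;
            }
            // Decompose into base letter plus combining marks,
            // then drop the combining marks.
            String decomposed = Normalizer.normalize(
                new String(Character.toChars(cp)), Normalizer.Form.NFD);
            decomposed.codePoints()
                .filter(c -> Character.getType(c) != Character.NON_SPACING_MARK)
                .forEach(sb::appendCodePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripAccentsExceptGerman("H\u00F6chstalemannisch fa\u00E7ade"));
    }
}
```

Note that this sketch only removes marks that Unicode treats as separable accents. Letters without a canonical decomposition, such as ø or ł, pass through unchanged, so a small lookup table like the one above may still be needed for them.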