Problem description
I have a list of Belgian city names containing diacritics (Liège, Quiévrain, Franière, etc.), and I would like to transform these special characters so that I can compare the names with a list containing the same names in upper case but without the diacritical marks (LIEGE, QUIEVRAIN, FRANIERE).
What I first tried was converting to upper case:
"LIEGE".contentEquals("Liège".toUpperCase())
but that doesn't work, because the upper case of Liège is LIÈGE and not LIEGE.
I have some complicated ideas, like replacing each character one by one, but that sounds stupid and like a long process.
Any ideas on how to do this in a smart way?
Recommended answer
Check out this method in Java:
// ASCII replacements; position i here corresponds to position i in UNICODE
private static final String PLAIN_ASCII = "AaEeIiOoUu" // grave
        + "AaEeIiOoUuYy" // acute
        + "AaEeIiOoUuYy" // circumflex
        + "AaOoNn" // tilde
        + "AaEeIiOoUuYy" // umlaut
        + "Aa" // ring
        + "Cc" // cedilla
        + "OoUu"; // double acute

// Accented characters, in the same order as PLAIN_ASCII
private static final String UNICODE = "\u00C0\u00E0\u00C8\u00E8\u00CC\u00EC\u00D2\u00F2\u00D9\u00F9"
        + "\u00C1\u00E1\u00C9\u00E9\u00CD\u00ED\u00D3\u00F3\u00DA\u00FA\u00DD\u00FD"
        + "\u00C2\u00E2\u00CA\u00EA\u00CE\u00EE\u00D4\u00F4\u00DB\u00FB\u0176\u0177"
        + "\u00C3\u00E3\u00D5\u00F5\u00D1\u00F1"
        + "\u00C4\u00E4\u00CB\u00EB\u00CF\u00EF\u00D6\u00F6\u00DC\u00FC\u0178\u00FF"
        + "\u00C5\u00E5" + "\u00C7\u00E7" + "\u0150\u0151\u0170\u0171";

/**
 * Replaces accented characters in a string with their plain ASCII equivalents.
 * Characters that are not listed in UNICODE are copied unchanged.
 */
public static String removeAccents(String s) {
    if (s == null) {
        return null;
    }
    int n = s.length();
    StringBuilder sb = new StringBuilder(n);
    boolean found = false;
    for (int i = 0; i < n; i++) {
        char c = s.charAt(i);
        // Characters up to '~' (code 126) are skipped without a table lookup
        int pos = (c <= 126) ? -1 : UNICODE.indexOf(c);
        if (pos > -1) {
            found = true;
            sb.append(PLAIN_ASCII.charAt(pos));
        } else {
            sb.append(c);
        }
    }
    // Return the original instance when nothing was replaced
    return found ? sb.toString() : s;
}
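As an alternative to a hand-written lookup table, the JDK's java.text.Normalizer can decompose each accented letter into a base letter plus combining marks, which can then be removed with a regular expression. This is a swapped-in technique and not part of the answer above; it is only a minimal sketch. The class name AccentFolding and the helpers stripAccents and sameName are hypothetical names chosen for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper class, not part of the original answer.
final class AccentFolding {

    // Decomposes accented letters (NFD) and drops the combining marks.
    static String stripAccents(String s) {
        if (s == null) {
            return null;
        }
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches any Unicode mark, including combining accents
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Compares an accented name with an unaccented upper-case name.
    static boolean sameName(String accented, String plainUpper) {
        return stripAccents(accented).toUpperCase(Locale.ROOT).equals(plainUpper);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of the file encoding
        System.out.println(sameName("Li\u00E8ge", "LIEGE"));         // Liège
        System.out.println(sameName("Qui\u00E9vrain", "QUIEVRAIN")); // Quiévrain
        System.out.println(sameName("Frani\u00E8re", "FRANIERE"));   // Franière
    }
}
```

Each call in main should print true for the three cities from the question. Note that letters without a canonical decomposition, such as ø, pass through unchanged with this approach, so a lookup table like the one above may still be needed for them.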