Is there a way to get rid of accents and convert a whole string to regular letters?

Question

Is there a better way to get rid of accents and turn those letters into regular ones, apart from using the String.replaceAll() method and replacing letters one by one? Example:

Input: orčpžsíáýd

Output: orcpzsiayd

It doesn't need to include all letters with accents, like the Russian alphabet or the Chinese one.

Recommended answer

Use java.text.Normalizer to handle this for you.

string = Normalizer.normalize(string, Normalizer.Form.NFD);
// or Normalizer.Form.NFKD for a more "compatible" decomposition

This separates all of the accent marks from their base characters. Then you just need to check each character and throw out the ones that are not plain ASCII:

string = string.replaceAll("[^\\p{ASCII}]", "");
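
The two steps above can be combined into one small helper. This is only a minimal sketch: the class name AccentStripper and the method name stripToAscii are made up for illustration and are not part of any library.

```java
import java.text.Normalizer;

class AccentStripper {
    // Hypothetical helper: decompose accented letters (NFD), then drop
    // every character that is not plain ASCII, including the accents.
    static String stripToAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripToAscii("orčpžsíáýd")); // orcpzsiayd
    }
}
```

Keep in mind that this removes every non-ASCII character, not just accents. Letters with no canonical decomposition, such as ł, are dropped entirely instead of being mapped to a base letter.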

If your text contains non-ASCII characters that you want to keep (for example, letters from other scripts), strip only the combining marks instead:

string = string.replaceAll("\\p{M}", "");

In a Unicode-aware regex, \\P{M} (uppercase P) matches anything that is not a combining mark, i.e. the base glyph, and \\p{M} (lowercase p) matches each accent.
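
As a rough illustration of the difference, the sketch below strips only combining marks, so letters from other scripts survive. The names MarkStripper and stripMarks are hypothetical and chosen just for this example.

```java
import java.text.Normalizer;

class MarkStripper {
    // Hypothetical helper: decompose (NFD), then remove only the
    // combining marks (\p{M}), keeping all base characters.
    static String stripMarks(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(stripMarks("Привет, café")); // Привет, cafe
    }
}
```

Note that NFD also decomposes some non-Latin letters (Cyrillic й becomes и plus a combining breve), so this approach changes those letters as well.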

Thanks to GarretWilson for the pointer and regular-expressions.info for the great Unicode guide.
