Remove accents from a string except for "ñ"

Problem description

I have the following sample code:

var inputString = "ñaáme";
inputString = inputString.Replace('ñ', '\u00F1');
var normalizedString = inputString.Normalize(NormalizationForm.FormD);
var result = Regex.Replace(normalizedString, @"[^ñÑa-zA-Z0-9\s]*", string.Empty);
return result.Replace('\u00F1', 'ñ'); // naame :(

I need to normalize the text without removing the "ñ" characters.

I followed this example, but it is written for Java and it has not worked for me.

The result I want is "ñaame".

Recommended answer

You can match any Unicode letter other than the specific letter ñ and the ASCII letters (which do not need normalization) with the regex (?i)[\p{L}-[ña-z]]+ and normalize only those matches. Then remove any combining marks from the string.

Use

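using System;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
var inputString = "ñaáme";
// Decompose only the letters that are neither ñ nor ASCII, then drop the combining marks.
var result = string.Concat(Regex.Replace(inputString, @"(?i)[\p{L}-[ña-z]]+", m =>
        m.Value.Normalize(NormalizationForm.FormD)
    )
    .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark));
Console.Write(result); // ñaame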

See the C# demo.
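For context on why the attempt in the question returns "naame": FormD splits ñ into a plain n followed by U+0303 (combining tilde), so the precomposed ñ listed in the question's character class no longer matches anything, and the tilde is stripped along with the other marks. A minimal sketch that prints the code units before and after decomposition; the class and method names here are illustrative and not part of the original answer:

```csharp
using System;
using System.Linq;
using System.Text;

static class DecompositionDemo
{
    // Formats each UTF-16 code unit of the string as U+XXXX.
    public static string CodePoints(string s) =>
        string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

    static void Main()
    {
        var composed = "\u00F1"; // ñ as a single precomposed character
        var decomposed = composed.Normalize(NormalizationForm.FormD);
        Console.WriteLine(CodePoints(composed));   // U+00F1
        Console.WriteLine(CodePoints(decomposed)); // U+006E U+0303
    }
}
```

This is also why the answer normalizes only the regex matches: ñ is never decomposed, so its tilde never becomes a separate combining mark that the filter could remove.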

Pattern details

(?i) - case-insensitive modifier
[ - start of a character class
- \p{L} - any Unicode letter
- -[ - except for (start of the subtracted class)
  - ña-z - ñ and the ASCII letters
- ] - end of the subtracted class
]+ - end of the character class, matched one or more times
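To make the pattern breakdown concrete, the following sketch lists the substrings the pattern matches in the question's input. The PatternDemo class and MatchedRuns helper are hypothetical names introduced only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class PatternDemo
{
    // Any Unicode letter, minus ñ and the ASCII letters (character class subtraction).
    const string Pattern = @"(?i)[\p{L}-[ña-z]]+";

    // Returns every substring the pattern matches, in order of appearance.
    public static List<string> MatchedRuns(string input) =>
        Regex.Matches(input, Pattern).Cast<Match>().Select(m => m.Value).ToList();

    static void Main()
    {
        // For the question's input, only the accented "á" is matched.
        Console.WriteLine(string.Join(", ", MatchedRuns("ñaáme"))); // á
    }
}
```

Only the matched runs are passed to Normalize in the answer's code, which is what leaves ñ and the plain ASCII letters untouched.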