This article covers how to remove non-ASCII characters from a string.

Problem description

Hi there,

I believe all strings in .NET are Unicode by default. I am looking for a
way to remove all non-ASCII characters from a string (or optionally
replace them).

There is an article on Code Project that looks like it does roughly
what I want, but I can't help thinking it makes things more complex than
they need to be.

I have looked at the MSDN pages on Encodings, but I am not very
familiar with this topic.

If I could get a list of ASCII characters, it should be easy to write
a method that checks each char against the list and performs the replace
or remove operation if necessary. Yet I can't find anything exactly
like this with trusty old Google. Is there something I am missing?

If it helps, the reason I need this is that I am writing a front end
for the LAME command-line MP3 encoder, which doesn't like being passed, or
asked to output to, file paths containing Unicode characters.

--
Eps

Recommended answer



Perhaps I'm missing something, but this code:

byte[] asciiChars = Encoding.ASCII.GetBytes("AB £ CD");
string result = Encoding.ASCII.GetString(asciiChars);
Console.WriteLine(result);

creates the string:

AB ? CD

--
Anthony Jones - MVP ASP/ASP.NET
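
If dropping the offending characters is preferable to getting a '?', one option is to request an ASCII encoding with a custom replacement fallback. This is a minimal sketch, assuming the standard System.Text fallback classes. The names AsciiFallbackDemo and ToAscii are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: round-trips the input through an ASCII encoding
    // whose encoder fallback substitutes `replacement` for every character
    // that cannot be represented in ASCII. Pass "" to drop such characters.
    public static string ToAscii(string input, string replacement)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback(replacement));

        byte[] bytes = ascii.GetBytes(input);
        return ascii.GetString(bytes);
    }

    public static void Main()
    {
        // "\u00A3" is the pound sign used in the post above.
        Console.WriteLine(ToAscii("AB \u00A3 CD", ""));  // pound sign removed
        Console.WriteLine(ToAscii("AB \u00A3 CD", "_")); // pound sign replaced by '_'
    }
}
```

The replacement string must itself be plain ASCII, otherwise the encoder has nothing valid to fall back to.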




I have seen this code before. Can anyone explain why the
Encoding.ASCII.GetString() method does not accept a string as a parameter?

--
Eps




Because Encoding classes encode and decode CLR strings (which are
_always_ Unicode) to and from byte arrays in a specified encoding, typically
for serialization or interop purposes. There's no such thing as a
non-Unicode System.String (well, you could treat a string as a plain array
of char, but any .NET function will still treat the string as UTF-16).
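
To make the string-versus-bytes distinction concrete, here is a small sketch that serializes the same two-character string with three of the built-in encodings. The class name EncodingBytesDemo and the helper Hex are illustrative only.

```csharp
using System;
using System.Text;

public static class EncodingBytesDemo
{
    // Hypothetical helper: encodes `s` with `enc` and formats the bytes as hex.
    public static string Hex(Encoding enc, string s)
    {
        return BitConverter.ToString(enc.GetBytes(s));
    }

    public static void Main()
    {
        string s = "A\u00A3"; // 'A' followed by the pound sign (U+00A3)

        // One string, three different byte sequences.
        Console.WriteLine(Hex(Encoding.Unicode, s)); // 41-00-A3-00 (UTF-16, little-endian)
        Console.WriteLine(Hex(Encoding.UTF8, s));    // 41-C2-A3
        Console.WriteLine(Hex(Encoding.ASCII, s));   // 41-3F (the pound sign became '?')
    }
}
```

GetBytes goes from string to bytes and GetString goes from bytes back to string, which is why GetString takes a byte array and not a string.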

What you ask is still possible, because ASCII is a pure subset of
Unicode. With LINQ (using System.Linq), you could use this one-liner:

string ascii = new string(s.Where(c => c <= 127).ToArray());

Note, however, that "ascii" would still be a Unicode string; it just
wouldn't contain any non-ASCII characters.
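
The same filter can be written without LINQ, with optional replacement as the original question asked. This is a sketch only; AsciiFilter and StripNonAscii are hypothetical names, not part of the framework.

```csharp
using System;
using System.Text;

public static class AsciiFilter
{
    // Hypothetical helper: keeps characters in the ASCII range (U+0000..U+007F).
    // If `replacement` is null, all other characters are dropped; otherwise
    // each non-ASCII UTF-16 code unit is replaced by it, so a character
    // stored as a surrogate pair yields two replacement characters.
    public static string StripNonAscii(string s, char? replacement = null)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (c <= 127)
                sb.Append(c);
            else if (replacement.HasValue)
                sb.Append(replacement.Value);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(StripNonAscii("caf\u00E9.mp3"));      // caf.mp3
        Console.WriteLine(StripNonAscii("caf\u00E9.mp3", '_')); // caf_.mp3
    }
}
```

For the file path use case, keep in mind that the filtered string names a different path than the original, so an existing input file would need to be copied or renamed to match before it is handed to the encoder.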

