How to replace accented Latin characters in Ruby?


Question

I have an ActiveRecord model, Foo, which has a name field. I'd like users to be able to search by name, but I'd like the search to ignore case and any accents. So I'm also storing a canonical_name field to search against:

class Foo < ActiveRecord::Base
  validates_presence_of :name

  before_validation :set_canonical_name

  private

  def set_canonical_name
    self.canonical_name ||= canonicalize(self.name) if self.name
  end

  def canonicalize(x)
    x.downcase  # ...plus something here to strip the accents
  end
end

I need to fill in the "something here" to replace the accented characters. Is there anything better than this?

x.downcase.gsub(/[àáâãäå]/,'a').gsub(/æ/,'ae').gsub(/ç/, 'c').gsub(/[èéêë]/,'e')....

And, for that matter, since I'm not on Ruby 1.9, I can't put those Unicode literals in my code. The actual regular expressions will look much uglier.

Recommended answer

Rails already has a built-in for normalization. Use it to normalize your string to form KD, which splits each accented letter into a base letter plus separate combining marks, and then remove every non-ASCII character (i.e. the accent marks), like this:

>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"
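
For readers on a newer Ruby than the asker's, here is a minimal sketch of the same idea using only the standard library. It assumes Ruby 2.2 or later, where String#unicode_normalize is available. As far as I know, mb_chars.normalize was deprecated in Rails 6.0 in favor of that method. The name canonicalize is borrowed from the question, and the sample strings are made up for illustration.

```ruby
# Sketch only: assumes Ruby 2.2+ (String#unicode_normalize) and UTF-8 input.
# `canonicalize` mirrors the hypothetical helper from the question.
def canonicalize(name)
  name.unicode_normalize(:nfkd)   # decompose "é" into "e" + combining accent
      .gsub(/[^\x00-\x7F]/, "")   # drop every non-ASCII character
      .downcase
end

puts canonicalize("àáâãäå")        # prints "aaaaaa"
puts canonicalize("Crème Brûlée")  # prints "creme brulee"
puts canonicalize("æther")         # prints "ther"
```

Note the last line: "æ" has no Unicode decomposition, so it is dropped rather than turned into "ae". If you need that mapping, handle such characters explicitly (for example with a gsub for "æ") before stripping the non-ASCII characters.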
