Best practices for a multi-language website

Question

I've been struggling with this question for quite a few months now, but until now I haven't been in a situation where I needed to explore all possible options. Right now, I feel it's time to get to know the possibilities and form my own preference to use in my upcoming projects.

Let me first sketch the situation I'm looking at

I'm about to upgrade/redevelop a content management system which I've been using for quite a while now. I feel that multi-language support would be a great improvement to this system. Previously I did not use any framework, but I'm going to use Laravel 4 for the upcoming project. Laravel seems the best choice for a cleaner way to code PHP. Sidenote: Laravel 4 should be no factor in your answer. I'm looking for general ways of translation that are platform/framework independent.

What should be translated

As the system I am looking for needs to be as user friendly as possible, the translations should be managed inside the CMS. There should be no need to start up an FTP connection to modify translation files or any HTML/PHP parsed templates.

Furthermore, I'm looking for the easiest way to translate multiple database tables, perhaps without the need to make additional tables.

What I came up with myself

I've been searching, reading and trying things myself already, and there are a couple of options I have. But I still don't feel like I've reached a best practice method for what I am really seeking. Right now, this is what I've come up with, but this method also has its side effects.

  1. PHP Parsed Templates: the template system should be parsed by PHP. This way I'm able to insert the translated parameters into the HTML without having to open the templates and modify them. Besides that, PHP parsed templates give me the ability to have 1 template for the complete website instead of having a subfolder for each language (which I've had before). The method to reach this target can be either Smarty, TemplatePower, Laravel's Blade or any other template parser. As I said, the solution should be independent of that choice.
  2. Database Driven: perhaps I don't need to mention this again, but the solution should be database driven. The CMS is aimed to be object oriented and MVC, so I would need to think of a logical data structure for the strings. As my templates would be structured as templates/Controller/View.php, perhaps this structure would make the most sense: Controller.View.parameter. The database table would have these fields along with a value field. Inside the templates we could use some sort of method like echo __('Controller.View.welcome', array('name' => 'Joshua')) where the stored string contains Welcome, :name. Thus the result is Welcome, Joshua. This seems a good way to do this, because parameters such as :name are easy for the editor to understand.
  3. Low Database Load: of course the above system would cause a lot of database load if these strings are loaded on the fly. Therefore I would need a caching system that re-renders the language files as soon as they are edited/saved in the administration environment. Because files are generated, a good file system layout is also needed. I guess we can go with languages/en_EN/Controller/View.php or .ini, whatever suits you best. Perhaps an .ini is even parsed quicker in the end. This file should contain the data in the format parameter=value;. I guess this is the best way of doing this, since each View that is rendered can include its own language file if it exists. Language parameters should then be loaded for a specific view and not in a global scope, to prevent parameters from overwriting each other.
  4. Database Table Translation: this in fact is the thing I'm most worried about. I'm looking for a way to create translations of News/Pages/etc. as quickly as possible. Having two tables for each module (for example News and News_translations) is an option, but it feels like too much work to get a good system. One of the things I came up with is based on a data versioning system I wrote: there is one database table named Translations, and this table has a unique combination of language, tablename and primarykey. For instance: en_EN / News / 1 (referring to the English version of the News item with ID=1). But there are 2 huge disadvantages to this method: first of all, this table tends to get pretty long with a lot of data in the database, and secondly it would be a hell of a job to use this setup to search the table. E.g. searching for the SEO slug of the item would be a full text search, which is pretty dumb. On the other hand, it's a very quick way to create translatable content in every table, but I don't believe this pro outweighs the cons.
  5. Front-end Work: the front-end would also need some thinking. Of course we would store the available languages in a database and (de)activate the ones we need. This way the script can generate a dropdown to select a language, and the back-end can decide automatically what translations can be made using the CMS. The chosen language (e.g. en_EN) would then be used when getting the language file for a view or to get the right translation for a content item on the website.
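The file cache from point 3 could be sketched roughly as follows. This is only an illustration under assumptions: the function names writeLanguageFile and loadLanguageFile are made up, the strings are written as a PHP file that returns an array (rather than the .ini variant), and the base directory is a temp folder so the snippet can run anywhere.

```php
<?php
// Hypothetical helper: regenerate the cached language file for one view.
// Would be called whenever an editor saves translations in the CMS.
function writeLanguageFile(string $baseDir, string $locale, string $controller, string $view, array $strings): string
{
    $dir = $baseDir . '/' . $locale . '/' . $controller;
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }

    // The generated file simply returns the array, so loading it is one include.
    $file = $dir . '/' . $view . '.php';
    $code = '<?php return ' . var_export($strings, true) . ';' . "\n";
    file_put_contents($file, $code, LOCK_EX);

    return $file;
}

// Hypothetical helper: load the strings for one view, scoped to that view only.
function loadLanguageFile(string $file): array
{
    if (!is_file($file)) {
        return [];
    }
    return include $file;
}

$file = writeLanguageFile(
    sys_get_temp_dir() . '/languages',   // assumption: any writable base directory
    'en_EN',
    'Controller',
    'View',
    ['welcome' => 'Welcome, :name']
);

$strings = loadLanguageFile($file);
echo $strings['welcome'], "\n";
```

Because each view gets its own returned array, nothing is placed in a global scope, which matches the concern about parameters overwriting each other.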

So, there they are: my ideas so far. They don't even include localization options for dates etc. yet, but as my server supports PHP 5.3.2+ the best option is to use the intl extension as explained here: http://devzone.zend.com/1500/internationalization-in-php-53/ - but this would be of use at a later stage of development. For now the main issue is finding the best practice for translating the content of a website.

Besides everything I explained here, there is still another thing which I haven't decided yet. It looks like a simple question, but in fact it's been giving me headaches:

URL translation? Should we do this or not? And in what way?

So, if I have this URL: http://www.domain.com/about-us and English is my default language, should this URL be translated into http://www.domain.com/over-ons when I choose Dutch as my language? Or should we go the easy road and simply change the content of the page visible at /about-us? The latter doesn't seem a valid option, because it would serve multiple versions of the content under the same URL, so search engines would fail to index the content properly.

Another option is using http://www.domain.com/nl/about-us instead. This generates at least a unique URL for each piece of content. It would also make it easier to switch to another language, for example http://www.domain.com/en/about-us, and the URL is easier to understand for both Google and human visitors. Using this option, what do we do with the default language? Should the default language drop the language segment? So redirecting http://www.domain.com/en/about-us to http://www.domain.com/about-us ... In my eyes this is the best solution, because when the CMS is set up for only one language there is no need to have this language identification in the URL.

And a third option is a combination of both: using the "language-identification-less" URL (http://www.domain.com/about-us) for the main language, and a URL with a translated SEO slug for the other languages: http://www.domain.com/nl/over-ons & http://www.domain.com/de/uber-uns

I hope my question gets your heads cracking, it cracked mine for sure! Writing it out as a question here already helped me work things out. It gave me a chance to review the methods I've used before and the ideas I have for my upcoming CMS.

Thank you in advance for taking the time to read this bunch of text!

// Edit #1:

I forgot to mention: the __() function is an alias to translate a given string. Within this method there obviously should be some sort of fallback where the default text is loaded when no translation is available yet. If the translation is missing, it should either be inserted or the translation file should be regenerated.
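To make the idea concrete, here is a minimal sketch of such a helper with :name placeholders and a fallback. It is named translate() here to avoid clashing with anything else; __() could simply wrap it. The $catalog array, the locale names and the choice to return the key itself as a last resort are assumptions for illustration, and inserting the missing string into the database is left out.

```php
<?php
// Hypothetical in-memory catalog; in the setup described above it would come
// from the generated per-view language files.
$catalog = [
    'en_EN' => ['Controller.View.welcome' => 'Welcome, :name'],
    'nl_NL' => [],   // no Dutch translation yet
];

/**
 * Translate $key for $locale. Falls back to the default locale's text,
 * and finally to the key itself, when no translation exists.
 * Placeholders are written as :name inside the stored string.
 */
function translate(array $catalog, string $key, array $params, string $locale, string $defaultLocale = 'en_EN'): string
{
    $text = $catalog[$locale][$key]
        ?? $catalog[$defaultLocale][$key]
        ?? $key;

    // Build [':name' => 'Joshua'] style pairs and substitute them in one pass.
    $replacements = [];
    foreach ($params as $name => $value) {
        $replacements[':' . $name] = (string) $value;
    }

    return strtr($text, $replacements);
}

// Dutch is requested, but only the English (default) text exists.
echo translate($catalog, 'Controller.View.welcome', ['name' => 'Joshua'], 'nl_NL'), "\n";
// prints "Welcome, Joshua"
```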

Recommended answer

Topic's premise

There are three distinct aspects in a multilingual site:

  • interface translation
  • content
  • URL routing

While they are all interconnected in different ways, from the CMS point of view they are managed using different UI elements and stored differently. You seem to be confident in your implementation and understanding of the first two. The question was about the last aspect - "URL translation? Should we do this or not? And in what way?"

A very important thing is: don't get fancy with IDN. Instead favor transliteration (also: transcription and romanization). While at first glance IDN seems a viable option for international URLs, it actually does not work as advertised, for two reasons:

  • some browsers will turn non-ASCII chars like 'ч' or 'ž' into '%D1%87' and '%C5%BE'
  • if the user has custom themes, the theme's font is very likely to not have symbols for those letters

I actually tried the IDN approach a few years ago in a Yii based project (horrible framework, IMHO). I encountered both of the above mentioned problems before scrapping that solution. Also, I suspect that it might be an attack vector.
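For illustration, transliteration into an ASCII slug can be as small as a lookup table. The sketch below is an assumption-laden example, not a complete romanization scheme: the function name slugify is made up, the table covers lowercase Russian letters only, and a real project might prefer the intl extension's transliterator instead of a hand-written map.

```php
<?php
/**
 * Turn a (lowercase Russian) string into an ASCII URL slug.
 * The table is deliberately partial and only serves as an example.
 */
function slugify(string $text): string
{
    static $map = [
        'а' => 'a', 'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e',
        'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'j', 'к' => 'k', 'л' => 'l',
        'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's',
        'т' => 't', 'у' => 'u', 'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch',
        'ш' => 'sh', 'щ' => 'shh', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e',
        'ю' => 'ju', 'я' => 'ja',
    ];

    // Replace known letters, lowercase the ASCII part, then collapse
    // everything that is not a-z or 0-9 into single hyphens.
    $latin = strtolower(strtr($text, $map));
    $slug  = preg_replace('/[^a-z0-9]+/', '-', $latin);

    return trim($slug, '-');
}

echo slugify('блог'), '/', slugify('новинка'), "\n";
// prints "blog/novinka"
```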

Basically you have two choices, which could be abstracted as:

  • http://site.tld/[:query]: where [:query] determines both language and content choice
  • http://site.tld/[:language]/[:query]: where the [:language] part of the URL defines the choice of language and [:query] is used only to identify the content

Let's say you pick http://site.tld/[:query].

In that case you have one primary source of language: the content of the [:query] segment; and two additional sources:

  • the value of $_COOKIE['lang'] for that particular browser
  • the list of languages in the HTTP Accept-Language header

First, you need to match the query to one of the defined routing patterns (if your pick is Laravel, see its routing documentation). On a successful match of the pattern you then need to find the language.

You would have to go through all the segments of the pattern, find the potential translations for all of those segments and determine which language was used. The two additional sources (cookie and header) would be used to resolve routing conflicts, when (not "if") they arise.

Take for example: http://site.tld/blog/novinka.

That's a transliteration of "блог, новинка", which in English means approximately "blog", "latest".

As you can already notice, in Russian "блог" will be transliterated as "blog". This means that for the first part of [:query] you will (in the best case scenario) end up with a ['en', 'ru'] list of possible languages. Then you take the next segment - "novinka". That might have only one language on its list of possibilities: ['ru'].

When the list has one item, you have successfully found the language.

But if you end up with 2 (example: Russian and Ukrainian) or more possibilities, or 0 possibilities, as the case might be, you will have to use the cookie and/or header to find the correct option.

And if all else fails, you pick the site's default language.
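The narrowing-down described above can be sketched as an intersection of candidate lists. Everything named here is hypothetical: the function detectLanguage, the $segmentLanguages lookup (which in practice would come from the database or a cache), and the assumption that the Accept-Language header has already been parsed into a list ordered by preference.

```php
<?php
/**
 * Guess the language of a URL from its transliterated segments.
 *
 * $segmentLanguages maps a segment to the languages in which it is a valid
 * route segment; $acceptedLangs is the already-parsed Accept-Language list.
 */
function detectLanguage(
    array $segments,
    array $segmentLanguages,
    array $supported,
    ?string $cookieLang,
    array $acceptedLangs,
    string $default
): string {
    // Intersect the candidate languages of every segment.
    $candidates = null;
    foreach ($segments as $segment) {
        $langs = $segmentLanguages[$segment] ?? [];
        $candidates = $candidates === null
            ? $langs
            : array_values(array_intersect($candidates, $langs));
    }

    // Exactly one language left: found it.
    if ($candidates !== null && count($candidates) === 1) {
        return $candidates[0];
    }

    // Two or more candidates: the hints must pick one of them.
    // Zero candidates: any supported language may be chosen by the hints.
    $pool  = $candidates ?: $supported;
    $hints = array_merge($cookieLang !== null ? [$cookieLang] : [], $acceptedLangs);
    foreach ($hints as $hint) {
        if (in_array($hint, $pool, true)) {
            return $hint;
        }
    }

    return $default;
}

$segmentLanguages = [
    'blog'    => ['en', 'ru'],
    'latest'  => ['en'],
    'novinka' => ['ru'],
];

echo detectLanguage(['blog', 'novinka'], $segmentLanguages, ['en', 'ru', 'uk'], null, [], 'en'), "\n";
// prints "ru"
```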

The alternative is to use a URL that can be defined as http://site.tld/[:language]/[:query]. In this case, when translating the query, you do not need to guess the language, because at that point you already know which one to use.

There is also a secondary source of language: the cookie value. But here there is no point in messing with the Accept-Language header, because you are not dealing with an unknown number of possible languages in the case of a "cold start" (when a user opens the site for the first time with a custom query).

Instead you have 3 simple, prioritized options:

  1. if the [:language] segment is set, use it
  2. if $_COOKIE['lang'] is set, use it
  3. use the default language

When you have the language, you simply attempt to translate the query, and if translation fails, use the "default value" for that particular segment (based on routing results).
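A rough sketch of those three options plus the segment translation, under assumptions: pickLanguage and translateSegments are invented names, the per-language map would normally live in a database or cache, and the route's default values are passed in as a plain positional array.

```php
<?php
/**
 * Prioritized language choice: URL segment, then cookie, then default.
 * Unsupported values are skipped rather than trusted.
 */
function pickLanguage(?string $urlLang, ?string $cookieLang, array $supported, string $default): string
{
    foreach ([$urlLang, $cookieLang] as $candidate) {
        if ($candidate !== null && in_array($candidate, $supported, true)) {
            return $candidate;
        }
    }
    return $default;
}

/**
 * Translate localized URL segments to canonical ones for a known language.
 * $map is "localized segment => canonical segment" for that language;
 * when a segment cannot be translated, the route's default value is used.
 */
function translateSegments(array $segments, array $routeDefaults, array $map): array
{
    $result = [];
    foreach ($segments as $i => $segment) {
        $result[] = $map[$segment] ?? $routeDefaults[$i] ?? $segment;
    }
    return $result;
}

// Hypothetical data for a request to /ru/blog/novinka
$supported = ['en', 'ru'];
$russian   = ['blog' => 'blog', 'novinka' => 'latest'];

$language = pickLanguage('ru', null, $supported, 'en');
$query    = translateSegments(['blog', 'novinka'], ['blog', 'latest'], $russian);

echo $language, ': ', implode('/', $query), "\n";
// prints "ru: blog/latest"
```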

Yes, technically you can combine both approaches, but that would complicate the process and only accommodate people who want to manually change the URL http://site.tld/en/news to http://site.tld/de/news and expect the news page to change to German.

But even this case could probably be mitigated using the cookie value (which would contain information about the previous choice of language), with less magic and hope.

As you might have already guessed, I would recommend http://site.tld/[:language]/[:query] as the more sensible option.

Also, in a real world situation you would have a 3rd major part in the URL: the "title". As in the name of a product in an online shop or the headline of an article on a news site.

Example: http://site.tld/en/news/article/121415/EU-as-global-reserve-currency

In this case '/news/article/121415' would be the query, and 'EU-as-global-reserve-currency' is the title. It is there purely for SEO purposes.
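As a sketch of how such a URL could be split into its three parts, here is one regular expression for this single hypothetical route (news/article/{id}). The function name parseArticleUrl and the two-letter language code are assumptions; a real router would build the pattern from its route definitions.

```php
<?php
/**
 * Split a path into language, query and title for the example route.
 * Returns null when the path does not match the route at all.
 */
function parseArticleUrl(string $path): ?array
{
    $pattern = '#^/(?:(?P<language>[a-z]{2})/)?'   // optional language segment
             . '(?P<query>news/article/\d+)'       // the part that identifies the content
             . '(?:/(?P<title>[^/]+))?$#';         // optional SEO title, ignored for lookup

    if (preg_match($pattern, $path, $m) !== 1) {
        return null;
    }

    return [
        'language' => ($m['language'] ?? '') !== '' ? $m['language'] : null,
        'query'    => '/' . $m['query'],
        'title'    => $m['title'] ?? null,
    ];
}

$parts = parseArticleUrl('/en/news/article/121415/EU-as-global-reserve-currency');
echo $parts['language'], ' | ', $parts['query'], ' | ', $parts['title'], "\n";
// prints "en | /news/article/121415 | EU-as-global-reserve-currency"
```

Since only the query identifies the article, a request with a wrong or outdated title can still be resolved (or redirected to the current title).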

As for Laravel: kind of, but not by default.

I am not too familiar with it, but from what I have seen, Laravel uses a simple pattern-based routing mechanism. To implement multilingual URLs you will probably have to extend core class(es), because multilingual routing needs access to different forms of storage (database, cache and/or configuration files).

As a result of all this you would end up with two valuable pieces of information: the current language and the translated segments of the query. These values can then be used to dispatch to the class(es) which will produce the result.

Basically, the following URL: http://site.tld/ru/blog/novinka (or the version without '/ru') gets turned into something like:

$parameters = [
   'language' => 'ru',
   'classname' => 'blog',
   'method' => 'latest',
];

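Which you then just use for dispatching:

// Resolve the class name first; "new {...}" with an inline expression is not valid PHP
$class = $parameters['classname'];
$instance = new $class();
$instance->{'get' . $parameters['method']}($parameters);

.. or some variation of it, depending on the particular implementation.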
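One possible variation, as a self-contained sketch: the Blog class, its getLatest method and the dispatch function are invented for the example, and the existence checks are an added precaution, since class and method names derived from a URL should not be trusted blindly (an explicit whitelist of controllers would be stricter still).

```php
<?php
// Hypothetical controller matching the example parameters.
class Blog
{
    public function getLatest(array $parameters): string
    {
        return 'Latest posts in ' . $parameters['language'];
    }
}

/**
 * Dispatch routing results to a class method named get<Method>.
 * Throws when the route does not map to an existing handler.
 */
function dispatch(array $parameters): string
{
    $class  = ucfirst($parameters['classname']);
    $method = 'get' . ucfirst($parameters['method']);

    if (!class_exists($class) || !method_exists($class, $method)) {
        throw new RuntimeException('No handler for this route');
    }

    $instance = new $class();
    return $instance->$method($parameters);
}

echo dispatch([
    'language'  => 'ru',
    'classname' => 'blog',
    'method'    => 'latest',
]), "\n";
// prints "Latest posts in ru"
```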
