Splitting a string into an array of letters when some letters are two characters long (PHP)

Problem description

I need to split a string into an array of letters. The problem is that my language (Croatian) also has two-character letters (e.g. lj, nj, and dž).

So a string such as ljubičicajecvijet should be split into an array that looks like this:

Array
(
    [0] => lj
    [1] => u
    [2] => b
    [3] => i
    [4] => č
    [5] => i
    [6] => c
    [7] => a
    [8] => j
    [9] => e
    [10] => c
    [11] => v
    [12] => i
    [13] => j
    [14] => e
    [15] => t
)

Here is the list of Croatian letters as an array (I included the English letters as well).

$alphabet= array(
            'a', 'b', 'c',
            'č', 'ć', 'd',
            'dž', 'đ', 'e',
            'f', 'g', 'h',
            'i', 'j', 'k',
            'l', 'lj', 'm',
            'n', 'nj', 'o',
            'p', 'q', 'r',
            's', 'š', 't',
            'u', 'v', 'w',
            'x', 'y', 'z', 'ž'
          );


Recommended answer

You can use a solution along these lines:

Data:

$text = 'ljubičicajecviježdžt';

$alphabet = [
            'a', 'b', 'c',
            'č', 'ć', 'd',
            'dž', 'đ', 'e',
            'f', 'g', 'h',
            'i', 'j', 'k',
            'l', 'lj', 'm',
            'n', 'nj', 'o',
            'p', 'q', 'r',
            's', 'š', 't',
            'u', 'v', 'w',
            'x', 'y', 'z', 'ž'
];

1. Sort the alphabet by length, longest first, so that the two-character letters come at the beginning.

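// 2 letters first
usort($alphabet, function ($a, $b) {
    // usort expects an integer (<0, 0, >0) from the callback, not a boolean
    if (mb_strlen($a) !== mb_strlen($b)) {
        return mb_strlen($b) <=> mb_strlen($a); // longer entries first
    }
    return $a <=> $b; // same length: plain string (byte) order
});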

var_dump($alphabet);
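
The reason the order matters is that a regex alternation is tried from left to right, and the first alternative that matches at a given position is taken. The sketch below illustrates this with a made-up word and a reduced set of letters (both are assumptions for the example, not part of the original data):

```php
<?php
// Alternation is tried left to right: the first alternative that matches wins.
$word = 'ljnj';

// Single letters listed first: the digraphs are never reached.
preg_match_all('/l|j|n|lj|nj/u', $word, $shortFirst);

// Digraphs listed first: each one is matched as a single letter.
preg_match_all('/lj|nj|l|j|n/u', $word, $longFirst);

print_r($shortFirst[0]); // l, j, n, j
print_r($longFirst[0]);  // lj, nj
```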

2. Finally, split. I used preg_split, with preg_quote applied to each letter so that any regex metacharacters are escaped.

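// split
$alphabet = array_map('preg_quote', $alphabet); // escape regex metacharacters
$pattern = implode('|', $alphabet); // 'dž|lj|nj|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|ć|č|đ|š|ž'

var_dump($pattern);

// "u" treats pattern and subject as UTF-8, so "i" also covers letters such as Č and Ž.
// The limit is -1 (no limit) instead of null, which PHP 8.1+ deprecates for this parameter.
var_dump( preg_split('`(' . $pattern . ')`iu', $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY) );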

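Result (the lengths reported by var_dump are byte counts, so č and ž show length 2 and dž shows length 3 in UTF-8):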

array (size=18)
  0 => string 'lj' (length=2)
  1 => string 'u' (length=1)
  2 => string 'b' (length=1)
  3 => string 'i' (length=1)
  4 => string 'č' (length=2)
  5 => string 'i' (length=1)
  6 => string 'c' (length=1)
  7 => string 'a' (length=1)
  8 => string 'j' (length=1)
  9 => string 'e' (length=1)
  10 => string 'c' (length=1)
  11 => string 'v' (length=1)
  12 => string 'i' (length=1)
  13 => string 'j' (length=1)
  14 => string 'e' (length=1)
  15 => string 'ž' (length=2)
  16 => string 'dž' (length=3)
  17 => string 't' (length=1)
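
The two steps can also be wrapped in one reusable function. The following is only a sketch under some assumptions: `splitIntoLetters` is a hypothetical name, the alphabet is assumed to be non-empty, and it uses preg_match_all instead of preg_split, with a trailing `.` alternative so that characters outside the alphabet are kept as single-character entries.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): splits $text into
 * letters, treating every entry of $alphabet as one letter.
 * Assumes $alphabet is non-empty and both arguments are valid UTF-8.
 */
function splitIntoLetters(string $text, array $alphabet): array
{
    // Longest entries first so digraphs win over their first character.
    usort($alphabet, function ($a, $b) {
        return mb_strlen($b, 'UTF-8') <=> mb_strlen($a, 'UTF-8');
    });

    // Escape each entry, including the "/" delimiter used below.
    $quoted = array_map(function ($letter) {
        return preg_quote($letter, '/');
    }, $alphabet);

    // Fall back to "any single character" for symbols outside the alphabet.
    $pattern = '/' . implode('|', $quoted) . '|./isu';

    preg_match_all($pattern, $text, $matches);

    return $matches[0];
}

$letters = splitIntoLetters('njuška', ['a', 'k', 'n', 'nj', 'j', 'u', 'š']);
print_r($letters); // nj, u, š, k, a
```

Unlike the var_dump output above, counting characters in each entry with mb_strlen would report 1 for č and 2 for dž, since it counts characters rather than bytes.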
