What is the best way to test whether a given character is uppercase or lowercase in PHP?

Question


What is an ideal way to detect whether a character is uppercase or lowercase, regardless of the current locale?

Is there a more direct function?

Assumptions: the internal character encoding is set to UTF-8, the browser's language preference is en-US,en;q=0.5, and the Multibyte String extension is installed. Do not use ctype_lower or ctype_upper.

See the test code below, which should be multibyte compatible.

$character = 'A'; // example input (assumption); replace with the character under test
$encodingtype = 'UTF-8';
$charactervalue = mb_ord($character, $encodingtype);

$characterlowercase = mb_strtolower($character, $encodingtype);
$characterlowercasevalue = mb_ord($characterlowercase, $encodingtype);

$characteruppercase = mb_strtoupper($character, $encodingtype);
$characteruppercasevalue = mb_ord($characteruppercase, $encodingtype);

// Diag Info: print the values computed above without reassigning them,
// so the code points stay integers for the comparisons below
echo 'Input: ' . $character . "<br />";
echo 'Input Value: ' . $charactervalue . "<br />" . "<br />";
echo 'Lowercase: ' . $characterlowercase . "<br />";
echo 'Lowercase Value: ' . $characterlowercasevalue . "<br />" . "<br />";
echo 'Uppercase: ' . $characteruppercase . "<br />";
echo 'Uppercase Value: ' . $characteruppercasevalue . "<br />" . "<br />";
// Diag Info

// Lowercase: unchanged by lowercasing, but changed by uppercasing
if ($charactervalue == $characterlowercasevalue && $charactervalue != $characteruppercasevalue) {
    $uppercase = 0;
    $lowercase = 1;
    echo 'Character is lowercase' . "<br />" . "<br />";
}
// Uppercase: unchanged by uppercasing, but changed by lowercasing
elseif ($charactervalue == $characteruppercasevalue && $charactervalue != $characterlowercasevalue) {
    $uppercase = 1;
    $lowercase = 0;
    echo 'Character is uppercase' . "<br />" . "<br />";
}
// Neither: both conversions leave the character unchanged
else {
    $uppercase = 0;
    $lowercase = 0;
    echo 'Character is neither lowercase or uppercase' . "<br />" . "<br />";
}

  • // Test 1 A // Output-> Character is uppercase
  • // Test 2 z // Output-> Character is lowercase
  • // Test 3 + // Output-> Character is lowercase
  • // Test 4 0 // Output-> Character is neither lowercase or uppercase
  • // Test 5 ǻ // LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE // Output-> Character is lowercase
  • // Test 6 Ͱ GREEK CAPITAL LETTER HETA // Output-> Character is uppercase
  • // Test 7 '' NULL // Output-> Character is neither lowercase or uppercase

Solution

I feel the most direct way would be to write a regex pattern to determine the character type.

In the following snippet, I'll search for uppercase letters (including unicode) in the first capture group, or lowercase letters in the second capture group. If the pattern makes no match, the character is not a letter.

A good reference for unicode letters in regex: https://regular-expressions.mobi/unicode.html

Writing two capture groups separated by a pipe means each type of letter will be slotted into a different indexed element in the output array. [0] is the full-string match (never used in this case, but its generation is unavoidable). [1] will hold the uppercase match (or be an empty string when there is a lowercase match, as a placeholder element). [2] will hold the lowercase match; it is only generated when there is a lowercase match.

For this reason, we can assume the highest key in the matches array will determine the casing of the letter.

If the input character is not a letter, preg_match() returns 0 (the number of matches), which is falsey. When this happens, 0 is used as the index into the lookup array to access 'neither'.
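To make the index logic concrete, here is a minimal sketch that prints the shape of the matches array for one character of each kind. The helper name matchShape() is hypothetical and exists only for this illustration; it is not part of the original answer.

```php
<?php
// Hypothetical helper (an assumption for this sketch): returns the matches
// array produced by the two-group pattern, or an empty array on no match.
function matchShape(string $char): array
{
    return preg_match('~(\p{Lu})|(\p{Ll})~u', $char, $out) ? $out : [];
}

foreach (['A', 'z', '+'] as $char) {
    // json_encode() is used only to print the array compactly
    echo $char, ' => ', json_encode(matchShape($char)), "\n";
}
```

This prints ["A","A"] for A (highest key 1), ["z","","z"] for z (highest key 2, with an empty placeholder at key 1), and [] for + (no match).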

Code:

$lookup = ['neither', 'upper', 'lower'];
$tests = ['A', 'z', '+', '0', 'ǻ', 'Ͱ', null];

foreach ($tests as $test) {
    $index = preg_match('~(\p{Lu})|(\p{Ll})~u', $test, $out) ? array_key_last($out) : 0;
    echo "{$test}: {$lookup[$index]}\n";
}

Output:

A: upper
z: lower
+: neither
0: neither
ǻ: lower
Ͱ: upper
: neither
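If the check is needed in more than one place, the same logic could be wrapped in a small function. The sketch below assumes PHP 7.3 or later (for array_key_last()); the name letterCase() and the (string) cast for null input are assumptions made for this example rather than anything from the original answer.

```php
<?php
// Hypothetical wrapper around the two-group pattern. The cast to string is
// an assumption so that null input is handled as an empty string.
function letterCase(?string $char): string
{
    $lookup = ['neither', 'upper', 'lower'];

    // Highest key in $out is 1 for an uppercase match, 2 for a lowercase match
    $index = preg_match('~(\p{Lu})|(\p{Ll})~u', (string) $char, $out)
        ? array_key_last($out)
        : 0;

    return $lookup[$index];
}

foreach (['A', 'z', '+', null] as $test) {
    echo "{$test}: ", letterCase($test), "\n";
}
```

Returning the label instead of echoing it keeps the function usable in conditions, for example `if (letterCase($char) === 'upper') { ... }`.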

For anyone who is not yet on PHP 7.3 (the version that introduced array_key_last()), you can call end() then key() like this:

Code:

foreach ($tests as $test) {
    if (preg_match('~(\p{Lu})|(\p{Ll})~u', $test, $out)) {
        end($out); // advance pointer to final element
        $index = key($out);
    } else {
        $index = 0;
    }
    echo "{$test}: {$lookup[$index]}\n";
}

My first approach makes a minimum of one function call per test and a maximum of two. My solution can be made into a one-liner by writing the preg_match() call inside the square brackets of $lookup[...], but I'm aiming for readability.
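For reference, a sketch of what that one-liner form might look like is shown here, assuming PHP 7.3 or later. The $results array is an addition for this example so the labels can be inspected afterwards.

```php
<?php
$lookup = ['neither', 'upper', 'lower'];
$results = []; // collected only for this illustration

foreach (['A', 'z', '+'] as $test) {
    // The whole preg_match()/array_key_last() expression sits inside $lookup[...]
    $results[] = $lookup[preg_match('~(\p{Lu})|(\p{Ll})~u', $test, $out) ? array_key_last($out) : 0];
}

echo implode(', ', $results), "\n";
```

The behaviour is the same as the longer version; the only trade-off is that the index calculation is harder to read at a glance.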


P.S. Here is another variation that I dreamed up. The difference is that preg_match() always makes a match because of the final empty "alternative" (empty branch).

foreach ($tests as $test) {
    preg_match('~(\p{Lu})|(\p{Ll})|~u', $test, $out);
    // count($out) is 1 for the empty match, 2 for uppercase, 3 for lowercase
    echo "\n{$test}: " , $lookup[count($out) - 1];
}
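The two patterns agree for single-character input, but they can differ once the input is longer than one character: the empty branch matches at offset 0 as soon as the first character is not a letter, whereas the two-group pattern keeps scanning for a letter. The sketch below illustrates this; the function names are hypothetical and the multi-character input "+A" is an assumption outside the original test set.

```php
<?php
$lookup = ['neither', 'upper', 'lower'];

// Hypothetical helper: two-group pattern, highest key decides (PHP 7.3+)
function caseByLastKey(string $s, array $lookup): string
{
    return $lookup[preg_match('~(\p{Lu})|(\p{Ll})~u', $s, $out) ? array_key_last($out) : 0];
}

// Hypothetical helper: pattern with the trailing empty branch, count decides
function caseByEmptyBranch(string $s, array $lookup): string
{
    preg_match('~(\p{Lu})|(\p{Ll})|~u', $s, $out);
    return $lookup[count($out) - 1];
}

foreach (['A', 'z', '+', '+A'] as $s) {
    echo "{$s}: ", caseByLastKey($s, $lookup), ' / ', caseByEmptyBranch($s, $lookup), "\n";
}
```

For "+A" the first helper reports upper (it finds the A), while the second reports neither (the empty branch already matched before the +). If only single characters are ever passed in, the distinction does not matter.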


08-15 01:08