I suspect there may be an efficiency difference between:

if (index($string, "abc") == -1) {}   # index returns -1 when "abc" is not found

and

if ($string !~ /abc/) {}


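Can anyone confirm whether this is the case, based on how Perl implements the two (as opposed to pure benchmarking)?

Obviously I can guess how both are implemented (based on how I would write them in C), but ideally I would like a better-informed answer based on the actual perl source code.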



Here is my own sample benchmark:

                          Rate regex.FIND_AT_END    index.FIND_AT_END
regex.FIND_AT_END     639345/s                   --                 -88%
index.FIND_AT_END    5291005/s                 728%                   --
                          Rate regex.NOFIND         index.NOFIND
regex.NOFIND          685260/s                   --                 -88%
index.NOFIND         5515720/s                 705%                   --
                          Rate regex.FIND_AT_START  index.FIND_AT_START
regex.FIND_AT_START   672269/s                   --                 -90%
index.FIND_AT_START  7032349/s                 946%                   --
##############################
use Benchmark qw(:all);

my $count = 10000000;
my $re = qr/abc/o;
my %tests = (
    "NOFIND        " => "cvxcvidgds.sdfpkisd[s"
   ,"FIND_AT_END   " => "cvxcvidgds.sdfpabcd[s"
   ,"FIND_AT_START " => "abccvidgds.sdfpkisd[s"
);

foreach my $type (keys %tests) {
    my $str = $tests{$type};
    cmpthese($count, {
        "index.$type" => sub { my $idx = index($str, "abc"); },
        "regex.$type" => sub { my $idx = ($str =~ $re); }
    });
}

Best answer

Take a look at the function Perl_instr:

char *
Perl_instr(register const char *big, register const char *little)
{
    register I32 first;

    PERL_ARGS_ASSERT_INSTR;

    if (!little)
        return (char*)big;
    first = *little++;
    if (!first)
        return (char*)big;
    while (*big) {
        register const char *s, *x;
        if (*big++ != first)
            continue;
        for (x=big,s=little; *s; /**/ ) {
            if (!*x)
                return NULL;
            if (*s != *x)
                break;
            else {
                s++;
                x++;
            }
        }
        if (!*s)
            return (char*)(big-1);
    }
    return NULL;
}
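
To experiment with this scan outside the perl source tree, here is a minimal standalone sketch of the same naive algorithm. The name `naive_instr` is my own, not perl's, and the plain `assert` stands in for what I assume `PERL_ARGS_ASSERT_INSTR` checks. It is an illustration of the technique, not the code perl actually runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone restatement of the scan in Perl_instr.
 * Returns a pointer to the first occurrence of `little` in `big`,
 * or NULL if there is none. An empty or NULL needle matches at the start. */
const char *naive_instr(const char *big, const char *little)
{
    assert(big != NULL);   /* assumed equivalent of PERL_ARGS_ASSERT_INSTR */

    if (!little || !*little)
        return big;

    for (; *big; big++) {
        const char *s = little;   /* position in the needle */
        const char *x = big;      /* position in the haystack */

        if (*big != *little)
            continue;             /* first byte differs: try next position */

        while (*s && *x && *s == *x) {
            s++;
            x++;
        }
        if (!*s)
            return big;           /* whole needle consumed: match */
        if (!*x)
            return NULL;          /* haystack ended first: no later match is possible */
    }
    return NULL;
}
```

Called with the FIND_AT_END string from the benchmark, `naive_instr("cvxcvidgds.sdfpabcd[s", "abc")` returns a pointer 15 bytes into the haystack. The point is that the whole search is one tight loop over bytes, with no pattern compilation and no match-state bookkeeping.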


Compare it with S_regmatch. It seems to me that regmatch carries some overhead compared to index ;-)
