Why is sub eins with else slower than sub zwei with elsif?

#!/usr/bin/env perl
use warnings;
use 5.012;
use Benchmark qw(:all);

my $d = 0;
my $c = 2;

sub eins {
    if ( $c == 1) {
        $d = 1;
    }
    else {
        $d = 2;
    }
}

sub zwei {
    if ( $c == 1) {
        $d = 1;
    }
    elsif ( $c == 2 ) {
        $d = 2;
    }
}

sub drei {
    $d = 1;
    $d = 2 if $c == 2;
}

cmpthese( -5, {
    eins => sub{ eins() },
    zwei => sub{ zwei() },
    drei => sub{ drei() },
} );




        Rate eins drei zwei
eins 4167007/s   --  -1% -16%
drei 4207631/s   1%   -- -15%
zwei 4972740/s  19%  18%   --

        Rate eins drei zwei
eins 4074356/s   --  -8% -16%
drei 4428649/s   9%   --  -9%
zwei 4854964/s  19%  10%   --

        Rate eins drei zwei
eins 3455697/s   --  -6% -19%
drei 3672628/s   6%   -- -14%
zwei 4250826/s  23%  16%   --

        Rate eins drei zwei
eins 2832634/s   --  -8% -19%
drei 3088931/s   9%   -- -12%
zwei 3503197/s  24%  13%   --

        Rate eins zwei drei
eins 3053821/s   -- -17% -26%
zwei 3701601/s  21%   -- -10%
drei 4131128/s  35%  12%   --

        Rate eins drei zwei
eins 3033041/s   --  -2% -12%
drei 3092511/s   2%   -- -10%
zwei 3430837/s  13%  11%   --




Summary of my perl5 (revision 5 version 16 subversion 0) configuration:

Platform:
    osname=linux, osvers=3.1.10-1.9-desktop, archname=x86_64-linux
    uname='linux linux1 3.1.10-1.9-desktop #1 smp preempt thu apr 5 18:48:38 utc 2012 (4a97ec8) x86_64 x86_64 x86_64 gnulinux '
    config_args='-de'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.6.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.14.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.14.1'
Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'


Characteristics of this binary (from libperl):
Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
                        PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL
                        USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE
                        USE_LOCALE_COLLATE USE_LOCALE_CTYPE
                        USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
Built under linux
Compiled at May 24 2012 20:53:15
%ENV:
    PERL_HTML_DISPLAY_COMMAND="/usr/bin/firefox -new-window %s"
@INC:
    /usr/local/lib/perl5/site_perl/5.16.0/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.16.0
    /usr/local/lib/perl5/5.16.0/x86_64-linux
    /usr/local/lib/perl5/5.16.0
    .

Best answer

[This is not an answer per se, but it is useful information that does not fit in a comment.]

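First, let's look at the compiled forms side by side ("eins" on the left, "zwei" on the right). When $c == 2, the execution path of "zwei" is a pure superset of the execution path of "eins". (The ops that are executed are marked with "*".)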

*1  <0> enter                            *1  <0> enter
*2  <;> nextstate(main 4 -e:2) v:{       *2  <;> nextstate(main 4 -e:2) v:{
*3  <#> gvsv[*c] s                       *3  <#> gvsv[*c] s
*4  <$> const[IV 1] s                    *4  <$> const[IV 1] s
*5  <2> eq sK/2                          *5  <2> eq sK/2
*6  <|> cond_expr(other->7) vK/1         *6  <|> cond_expr(other->7) vK/1
 7      <0> enter v                       7      <0> enter v
 8      <;> nextstate(main 1 -e:3) v:{    8      <;> nextstate(main 1 -e:3) v:{
 9      <$> const[IV 1] s                 9      <$> const[IV 1] s
 a      <#> gvsv[*d] s                    a      <#> gvsv[*d] s
 b      <2> sassign vKS/2                 b      <2> sassign vKS/2
 c      <@> leave vKP                     c      <@> leave vKP
            goto d                                   goto d
                                         *e  <#> gvsv[*c] s
                                         *f  <$> const[IV 2] s
                                         *g  <2> eq sK/2
                                         *h  <|> and(other->i) vK/1
*e  <0> enter v                          *i      <0> enter v
*f  <;> nextstate(main 2 -e:6) v:{       *j      <;> nextstate(main 2 -e:6) v:{
*g  <$> const[IV 2] s                    *k      <$> const[IV 2] s
*h  <#> gvsv[*d] s                       *l      <#> gvsv[*d] s
*i  <2> sassign vKS/2                    *m      <2> sassign vKS/2
*j  <@> leave vKP                        *n      <@> leave vKP
*d  <@> leave[1 ref] vKP/REFC            *d  <@> leave[1 ref] vKP/REFC
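
A listing like the one above can be regenerated with the core B::Concise module. The following is only a sketch, not necessarily how the listing above was produced: it assumes $c and $d are package variables (the listing shows gvsv ops, whereas the `my` variables in the question would compile to pad ops), and the helper name concise_exec is made up for illustration. The exact op flags, sequence labels and block ops printed will vary between perl versions and builds.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use B::Concise ();

# Assumption: package variables, so the ops show up as gvsv as in the listing.
our $c = 2;
our $d = 0;

sub eins {
    if ( $c == 1 ) { $d = 1 }
    else           { $d = 2 }
}

sub zwei {
    if    ( $c == 1 ) { $d = 1 }
    elsif ( $c == 2 ) { $d = 2 }
}

# Hypothetical helper: return the exec-order op listing of a sub as a string.
sub concise_exec {
    my ($code) = @_;
    my $out = '';
    B::Concise::walk_output( \$out );             # capture output in $out
    B::Concise::compile( '-exec', $code )->();    # render ops in execution order
    return $out;
}

my %listing = (
    eins => concise_exec( \&eins ),
    zwei => concise_exec( \&zwei ),
);

print "== $_ ==\n$listing{$_}\n" for sort keys %listing;
```

Comparing the two dumps should show the extra gvsv / const / eq / and sequence that only "zwei" executes when $c == 2.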


The thing is, I can reproduce your results! (v5.16.0 built for x86_64-linux-thread-multi)

           Rate drei eins zwei
drei  8974033/s   --  -3% -19%
eins  9263260/s   3%   -- -16%
zwei 11034175/s  23%  19%   --

           Rate drei eins zwei
drei  8971868/s   --  -1% -21%
eins  9031677/s   1%   -- -20%
zwei 11333871/s  26%  25%   --


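This is not a small difference (a small one could be the result of CPU caching), and it is reproducible from run to run (so it is not another application affecting the benchmark). I'm stumped.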

"eins" takes 22 ns longer per iteration (1/9031677 s - 1/11333871 s) even though it executes 4 fewer ops. I would have expected it to take about 100 ns less.
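
The per-iteration figure can be checked by converting the rates from the second run above into time per call. This is just the arithmetic spelled out, a minimal sketch; the hash names %rate and %ns are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Calls per second, copied from the second benchmark run in this answer.
my %rate = (
    eins => 9_031_677,
    zwei => 11_333_871,
);

# Time per call in nanoseconds: 1e9 ns per second divided by calls per second.
my %ns = map { $_ => 1e9 / $rate{$_} } keys %rate;

printf "%s: %.1f ns per call\n", $_, $ns{$_} for sort keys %ns;
printf "eins - zwei: %.1f ns per call\n", $ns{eins} - $ns{zwei};
```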

08-26 17:37