Do I have a rounding error? (Perl)

Problem description

My script is supposed to do the following. It takes an old list of scalars and makes a new, corresponding list of numbers. The old list is referred to as @oldMarkers and the new list as @newMarkers.

Sample input is like: chr1, chr2, IMP, chr3, IMP, IMP, IMP, chr4

Sample output is like: 1, 2, 2.1, 3, 3.1, 3.2, 3.3, 4

The point of the script is to read the list @oldMarkers and output a list in which, for each element containing the letters "chr", an integer is pushed onto the array @newMarkers. For each instance of IMP in @oldMarkers, a decimal number is added to @newMarkers. The new decimal number has the same "base integer" as the preceding number but has .1 added to it. In other words, multiple succeeding instances of "IMP" are supposed to have the same whole number as the most recently read "chr" entry, with a decimal value tacked on that counts the number of IMPs that correspond to that most recent "chr" entry.

The script below works almost 100%. It usually even works in the following case. In some places in @oldMarkers, there are numerous entries for IMP. When there are more than 10 IMPs in a row, the code is supposed to push values into @newMarkers so that all the "IMP"s of that block of entries have the same whole number, which also matches the number corresponding to the most recently read instance of "chr" in @oldMarkers. To that whole number, 0.1 is added. And when the value of the decimal gets to .9, the decimals "start over" back at .1 and go up from there, until the end of the stretch of IMP entries.

For example, if @oldMarkers has a block of 13 "IMP"s and is:
chr1, chr2, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, IMP, chr2

then @newMarkers should be:
1, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.1, 2.2, 2.3, 2.4, 3

Summary of the script:

The original file contains multiple lines of two elements. The first element is not important, and so is skipped over in the code. The second element of each line is an ID, either something like "chr4" or "IMP". A while loop reads each line, adds the second element to the array @oldMarkers.

Then, this array is read entry by entry. The script first asks whether the current entry in the original @oldMarkers list is a "chr" or an "IMP". This is done with the first if and else set.

Next, for both conditions, the script further asks whether the entry follows a number that itself corresponds to a "chr" or an "IMP" entry. This is done with the embedded if and else sets within the first such set.

Then new elements are defined and pushed into @newMarkers, depending on the conditions.

Like I said, this mostly works. Sometimes, however, when IMPs stretch for more than 10, the script does not "recycle" the decimals. Rather, it adds .1 to the preceding value and enters a new whole number. But for other stretches that exceed 10, it works fine. It is inconsistent with this "error."

Can you spot the problem?

my @oldMarkers = ();
my @newMarkers = ();

while ( my $line = <$FILE> )
    {
    chomp $line;
    my @entries = split( '\t', $line );
    push( @oldMarkers, $entries[ 1 ] );
    } ### end of while


for ( my $i = 0 ; $i < scalar @oldMarkers   ; $i++ )
    {
     if ( $oldMarkers[ $i ] =~ m/chr/ ) ### is a marker
        {
         if ( $oldMarkers[ $i - 1 ] =~ m/IMP/ ) ### new marker comes after imputed site
            {
             push( @newMarkers, int( $newMarkers[ $i - 1 ] ) + 1 );
            }

       else  ### is coming after a marker
           {
            push( @newMarkers, $newMarkers[ $i - 1 ] + 1 );
           }

      } ### if

   else    ### is an imputed site
      {
       if ( $oldMarkers[ $i - 1 ] =~ m/IMP/ ) ### imputed site is after another imputed site
          {
           my $value = $newMarkers[ $i - 1 ] - int( $newMarkers[ $i - 1 ] );

           if ( $value < .9 )
                {
                 push( @newMarkers, $newMarkers[ $i - 1 ] + .1 );
                }

          elsif ( $value > .9 )
                {
                 push( @newMarkers, int( $newMarkers[ $i - 1 ] ) + .1  );
                }


        } ### if

   else ### imputed site is after a marker
        {
         push( @newMarkers, int( $newMarkers[ $i - 1 ] ) + .1 );
        }

    } ### else

} ### for


print $newMarkerfile join( "\t", @newMarkers);
Solution

It would be easier and more reliable to do this using only integer arithmetic. Basically, keep track of two integer values: one for the number before the . and one for the digit after it. If the digit after the . reaches 10, reset it to 1:

my @newMarkers;
my $chrCount = 0;
my $impCount = 0;

foreach my $marker (@oldMarkers) {
    if ( $marker =~ /^chr\d+$/ ) {
        $chrCount++;
        $impCount = 0;
        push @newMarkers, $chrCount;
    } elsif ( $marker eq "IMP" ) {
        $impCount++;
        $impCount = 1 if $impCount == 10;
        push @newMarkers, "$chrCount.$impCount";
    } else {
        die "Unrecognized marker $marker";
    }
}

(demo on codepad.org)
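As a quick check, the loop from the answer can be wrapped in a sub and run against the two sample inputs from the question. This is only a sketch: the sub name number_markers is made up for illustration, and the final marker of the long example is written as chr3 here (any chr-plus-digits name gives the same result, because the code counts markers rather than parsing their numbers).

```perl
use strict;
use warnings;

# number_markers is a hypothetical wrapper; its body follows the answer's loop.
sub number_markers {
    my @oldMarkers = @_;
    my @newMarkers;
    my ( $chrCount, $impCount ) = ( 0, 0 );

    foreach my $marker (@oldMarkers) {
        if ( $marker =~ /^chr\d+$/ ) {
            $chrCount++;          # next whole number
            $impCount = 0;        # restart the digit after the "."
            push @newMarkers, $chrCount;
        }
        elsif ( $marker eq 'IMP' ) {
            $impCount++;
            $impCount = 1 if $impCount == 10;    # wrap .9 back to .1
            push @newMarkers, "$chrCount.$impCount";
        }
        else {
            die "Unrecognized marker $marker";
        }
    }
    return @newMarkers;
}

# First sample from the question.
my @short = number_markers(qw(chr1 chr2 IMP chr3 IMP IMP IMP chr4));
print join( ', ', @short ), "\n";
# prints: 1, 2, 2.1, 3, 3.1, 3.2, 3.3, 4

# Second sample: a run of 13 IMPs after chr2.
my @long = number_markers( qw(chr1 chr2), ('IMP') x 13, 'chr3' );
print join( ', ', @long ), "\n";
# prints: 1, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.1, 2.2, 2.3, 2.4, 3
```

Because the IMP values are built as strings from two integers, no floating-point addition or comparison is involved, so the wrap from .9 to .1 happens at the same point for every whole number.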

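To see why the question's version behaves inconsistently, it may help to look at what its comparison against .9 actually receives. The sketch below rebuilds the ninth IMP value the way the original loop does (whole number plus 0.1, then 0.1 added eight more times) and reports whether the fractional part ends up below, above, or equal to 0.9. The sub names are made up for illustration, and the exact results depend on how the local perl stores floating-point numbers, so no particular output is claimed here.

```perl
use strict;
use warnings;

# Rebuild "base.9" the way the question's loop does: base + 0.1,
# followed by eight further additions of 0.1.
sub ninth_imp_value {
    my ($base) = @_;
    my $value = $base + 0.1;
    $value += 0.1 for 1 .. 8;
    return $value;
}

# Classify the fractional part relative to 0.9, mirroring the
# "$value < .9" / "$value > .9" tests in the question's code.
sub compare_with_point_nine {
    my ($base) = @_;
    my $value    = ninth_imp_value($base);
    my $fraction = $value - int($value);
    return $fraction < 0.9 ? 'below'
         : $fraction > 0.9 ? 'above'
         :                   'equal';
}

for my $base ( 1 .. 20 ) {
    printf "%2d: stored as %.17g, fraction is %s 0.9\n",
        $base, ninth_imp_value($base), compare_with_point_nine($base);
}
```

Any whole number reported as "below" takes the first branch of the original code, which adds another .1 and produces the next whole number instead of wrapping. "above" wraps as intended. "equal" matches neither the if nor the elsif, so nothing is pushed for that entry and every later index in @newMarkers is off by one.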

08-19 12:42