使用XOR加速Python2嵌套循环

本文介绍了使用XOR加速Python2嵌套循环的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我的代码旨在根据一系列数字计算哈希值.

My code aims to calculate a hash from a series of numbers.

更容易理解矩阵形式的结构.如果我有从29开始的16个数字，则结构将为:(start = 29，length = 4)

It is easier to understand the structure in the form of a matrix. If I have 16 numbers starting from 29 the structure will be: (start=29, length=4)

29、30、31、32，
33、34、35、36，
37，38，39，40，
41、42、43、44

29, 30, 31, 32,
33, 34, 35, 36,
37, 38, 39, 40,
41, 42, 43, 44

给定的算法指定哈希将是粗体给出的数字的异或:

The given algorithm specifies the the hash will be the XOR of the numbers given in bold:

29，30，31，32，//，
33，34，35，//，36，
37，38，//，39，40，
41，//，42，43，44

29, 30, 31, 32, //,
33, 34, 35, //, 36,
37, 38, //, 39, 40,
41, //, 42, 43, 44

哈希= 29^30^31^32^33^34^35^37^38^39 = 54

我的代码是:

def answer(start, length):
    val=0
    c=0
    for i in range(length):
        for j in range(length):
            if j < length-i:
                val^=start+c
            c+=1
    return val

计算像answer(2000000000,10**4)这样的大值所需的时间太多了.

The time required to compute for large values like answer(2000000000,10**4) is way too much.

约束:

Py2.7.6
仅标准库，除了bz2，crypt，fcntl，mmap，pwd，pyexpat，select，signal，termios，线程，时间，unicodedata，zipimport，zlib之外.
有限的计算时间.

当前计算测试参数(我不知道)给我一个超时错误.

Currently computing the test parameters (unknown to me) give me a timeout error.

如何为更大的值提高代码速度?

How can the speed of my code be improved for bigger values?

推荐答案

在:l递减需要在计算之前完成.这是经过修复的版本，并带有assert测试，以验证其结果是否与朴素算法相同.

There is a bug in the accepted answer to Python fast XOR over range algorithm: decrementing l needs to be done before the XOR calculation. Here's a repaired version, along with an assert test to verify that it gives the same result as the naive algorithm.

def f(a):
    return (a, 1, a + 1, 0)[a % 4]

def getXor(a, b):
    return f(b) ^ f(a-1)

def gen_nums(start, length):
    l = length
    ans = 0
    while l > 0:
        l = l - 1
        ans ^= getXor(start, start + l)
        start += length
    return ans

def answer(start, length):
    c = val = 0
    for i in xrange(length):
        for j in xrange(length - i):
            n = start + c + j
            #print '%d,' % n,
            val ^= n
        #print
        c += length
    return val

for start in xrange(50):
    for length in xrange(100):
        a = answer(start, length)
        b = gen_nums(start, length)
        assert a == b, (start, length, a, b)

在start和length的范围内，gen_nums大约是answer的5倍，但是我们可以使其再次大约快两倍(即，大约是answer的十倍) )，消除了这些函数调用:

Over those ranges of start and length, gen_nums is about 5 times faster than answer, but we can make it roughly twice as fast again (i.e., roughly 10 times as fast as answer) by eliminating those function calls:

def gen_nums(start, length):
    ans = 0
    for l in xrange(length - 1, -1, -1):
        b = start + l
        ans ^= (b, 1, b + 1, 0)[b % 4] ^ (0, start - 1, 1, start, 0)[start % 4]
        start += length
    return ans

正如Mirek Opoka在评论中提到的那样，% 4等同于& 3，并且它更快，因为按位算术比执行整数除法和舍弃商快.因此我们可以将核心步骤替换为

As Mirek Opoka mentions in the comments, % 4 is equivalent to & 3, and it's faster because bitwise arithmetic is faster than performing integer division and throwing away the quotient. So we can replace the core step with

ans ^= (b, 1, b + 1, 0)[b & 3] ^ (0, start - 1, 1, start, 0)[start & 3]

这篇关于使用XOR加速Python2嵌套循环的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！