Multiplying two positive numbers gives a negative result in Python 3

Question

I have a DataFrame df1:

df1.head() = 

             wght          num_links 
id_y  id_x                      
 3     133   0.000203          2      
       186   0.000203          2 
 5     6     0.000203          2      
       98    0.000203          2      
       184   0.000203          2

I need to calculate a variable called thr:

thr = N*(N-1)*2,

where N is the number of rows of df1.

The problem is that when I calculate thr, Python returns a negative value (although all of the inputs are positive):

ipdb> df1['wght'].count()*(df1['wght'].count()-1)*2
-712569744 

Possible hint

The number of rows N is

ipdb> df1['wght'].count() 
137736 

Therefore

ipdb> 137736*137735*2
37942135920

Given that the maximum value an int32 can hold is 2147483647, I suspect that NumPy treats thr as an int32 when it should be an int64. Does this make sense?
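This suspicion can be checked with plain Python integers, which never overflow. The following is a minimal sketch: wrap_int32 is a hypothetical helper (not part of NumPy or pandas) that mimics how a signed 32-bit integer wraps around in two's complement.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest value a signed 32-bit integer can hold


def wrap_int32(value: int) -> int:
    """Reduce an arbitrary Python int to the value a signed 32-bit integer would hold."""
    return (value + 2**31) % 2**32 - 2**31


n = 137736
exact = n * (n - 1) * 2    # Python ints have arbitrary precision

print(exact)               # 37942135920
print(exact > INT32_MAX)   # True: the exact result does not fit in an int32
print(wrap_int32(exact))   # -712569744, the value reported by ipdb
```

The wrapped value matches the negative number above, which is consistent with the product being computed in 32-bit arithmetic.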

Please note that I have not included the code that generates df1 because

ipdb> df1['wght'].count() 
137736

However, if it is needed to reproduce the error, let me know.

Thanks.

Recommended answer

You are running into np.int32 overflow, so just use len(df) instead of df.column.count().

Here is a small demo:

In [149]: x = pd.DataFrame(np.random.randint(0,100,size=(137736, 3)), columns=list('ABC'))

In [150]: x.A.count() * (x.A.count() - 1) * 2
Out[150]: -712569744

In [151]: len(x) * (len(x) - 1) * 2
Out[151]: 37942135920

In [153]: type(x.A.count())
Out[153]: numpy.int32

In [154]: type(len(x))
Out[154]: int
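If the count has to come from a NumPy scalar, converting it to a built-in int (or to np.int64) before multiplying should avoid the overflow as well. The sketch below builds the np.int32 scalar explicitly, because count() returns a 32-bit integer only on platforms where NumPy's default integer is 32-bit. That is an assumption about the asker's setup, and on other platforms the demo above may not reproduce. np.errstate is used here only to silence the overflow warning that newer NumPy versions emit.

```python
import numpy as np

# Stand-in for df1['wght'].count() on a platform where it returns np.int32.
n32 = np.int32(137736)

# Every operand is int32, so the product wraps around silently.
with np.errstate(over="ignore"):
    wrapped = n32 * (n32 - np.int32(1)) * np.int32(2)

# Fix 1: convert to a built-in int, which has arbitrary precision.
n = int(n32)
exact = n * (n - 1) * 2

# Fix 2: widen to int64, which is large enough for this product.
n64 = np.int64(n32)
exact64 = n64 * (n64 - np.int64(1)) * np.int64(2)

print(wrapped)   # -712569744
print(exact)     # 37942135920
print(exact64)   # 37942135920
```

len(df) stays the simplest option because it already returns a built-in int. The explicit conversion is mainly useful when the count is taken per column and may differ from the row count, since count() skips missing values.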
