pd.qcut - ValueError: Bin edges must be unique
Question
My data is here.
q = pd.qcut(df['loss_percent'], 10)
ValueError: Bin edges must be unique: array([ 0.38461538, 0.38461538, 0.46153846, 0.46153846, 0.53846154,
0.53846154, 0.53846154, 0.61538462, 0.69230769, 0.76923077, 1. ])
I have read through why-use-pandas-qcut-return-valueerror; however, I am still confused.
I imagine that one of my values occurs very frequently and that this is what breaks qcut.
First, how do I determine whether that is indeed the case, and which value is the problem? Lastly, what kind of solution is appropriate given my data?
Recommended answer
Use the solution from the post https://stackoverflow.com/a/36883735/2336654
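import pandas as pd

def pct_rank_qcut(series, n):
    # Bin edges at 0, 1/n, 2/n, ..., 1 in percentile-rank space.
    edges = pd.Series([float(i) / n for i in range(n + 1)])
    # Position of the first edge that is >= the percentile rank x.
    f = lambda x: (edges >= x).argmax()
    return series.rank(pct=True).apply(f)

q = pct_rank_qcut(df.loss_percent, 10)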
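The question also asks how to confirm that a frequent value is the cause and which value it is. The sketch below is one possible way to check, not part of the linked answer. The `values` series is hypothetical stand-in data (the original `df['loss_percent']` is not available here), chosen only so that a few values dominate. It also shows two alternatives that pandas itself provides: the `duplicates="drop"` option of `pd.qcut`, and binning tie-broken ranks with `rank(method="first")`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['loss_percent']: few distinct values, many ties.
values = pd.Series(
    [5 / 13] * 30 + [6 / 13] * 25 + [7 / 13] * 30
    + [8 / 13] * 5 + [9 / 13] * 4 + [10 / 13] * 3 + [1.0] * 3
)

n = 10

# The bin edges qcut needs are the 0, 1/n, ..., 1 quantiles of the data.
edges = values.quantile(np.linspace(0, 1, n + 1))

# Edges that appear more than once are the values causing the error.
repeated_edges = edges[edges.duplicated(keep=False)].unique()

# Relative frequency of each value; a value covering more than 1/n of the
# rows can end up sitting on more than one quantile boundary.
freq = values.value_counts(normalize=True)
heavy = freq[freq > 1 / n]

# Alternative 1: let qcut drop repeated edges (this yields fewer than n bins).
q_drop = pd.qcut(values, n, duplicates="drop")

# Alternative 2: break ties by order of appearance so every rank is unique,
# then bin the ranks; equal values may land in different bins.
q_rank = pd.qcut(values.rank(method="first"), n, labels=False)
```

Which option fits depends on the goal: `duplicates="drop"` keeps equal values together but gives uneven, fewer bins, while ranking with `method="first"` gives equal-sized bins at the cost of splitting tied values arbitrarily.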