Problem description
I have DataFrames between 100k and 2m rows in size. The one I am dealing with for this question is this large, but note that I will have to do the same for the other frames:
>>> len(data)
357451
Now, this file was created by compiling many files, so its index is really odd. All I wanted to do was reindex it with range(len(data)), but I get this error:
>>> data.reindex(index=range(len(data)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2542, in reindex
fill_value, limit)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2618, in _reindex_index
limit=limit)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 893, in reindex
limit=limit)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 812, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
This actually makes no sense. Since I am reindexing with an array containing the numbers 0 through 357450, all Index objects are unique! Why is it returning this error?
Extra info: I am using Python 2.7 and pandas 0.11.0
Recommended answer
When it complains that "Reindexing only valid with uniquely valued Index objects", it's not objecting that your new index isn't unique; it's objecting that your old one isn't.
For example:
>>> df = pd.DataFrame(range(5), index = [1,2,3,1,2])
>>> df
0
1 0
2 1
3 2
1 3
2 4
>>> df.reindex(index=range(len(df)))
Traceback (most recent call last):
[...]
File "/usr/local/lib/python2.7/dist-packages/pandas-0.12.0.dev_0bd5e77-py2.7-linux-i686.egg/pandas/core/index.py", line 849, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
But assigning the new index directly works:
>>> df.index = range(len(df))
>>> df
0
0 0
1 1
2 2
3 3
4 4
Though I think I'd write df.reset_index(drop=True) instead.
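To confirm that the old index is the culprit before renumbering, something like the following sketch may help. It assumes a reasonably recent pandas in which Index.is_unique and Index.duplicated() are available (they may not exist in the 0.11/0.12 releases shown in the tracebacks above), and the column name "value" is only illustrative.

```python
import pandas as pd

# Same toy frame as in the answer: the labels 1 and 2 each appear twice.
df = pd.DataFrame({"value": range(5)}, index=[1, 2, 3, 1, 2])

# False whenever any label repeats; this is the condition reindex rejects.
old_index_is_unique = df.index.is_unique

# duplicated() flags the second and later occurrences of each label.
repeated_labels = df.index[df.index.duplicated()].tolist()

# reset_index(drop=True) discards the old labels and numbers the rows 0..n-1.
# It returns a new frame and leaves df itself unchanged.
renumbered = df.reset_index(drop=True)
```

Unlike assigning to df.index, reset_index(drop=True) does not modify the frame in place, so the result has to be kept (for example, df = df.reset_index(drop=True)).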