I run into a problem when executing this snippet (Python 3.6.5):
dataset = pd.read_csv('C:/dataset/2014_california_eq_metadata.csv', header=0)
dataset = dataset.set_index("TweetID")
print(dataset["TweetID"])
I get the error below. It is caused by the second line of the snippet, because everything works if I remove that line.
Traceback (most recent call last):
File "feature_extraction.py", line 14, in <module>
print(dataset["TweetID"])
File "C:\Python36\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "C:\Python36\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "C:\Python36\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "C:\Python36\lib\site-packages\pandas\core\internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "C:\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'TweetID'
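So my question is: why can't a DataFrame column be accessed with the following syntax:
dataframe[col_name]
when the specified column name is the DataFrame's index?
Is there another way to get the pandas Series that corresponds to the index column?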
Best answer
Yes, use the .index attribute:
dataset = pd.DataFrame({'TweetID':list('abcdef'),
'B':[4,5,4,5,5,4],
'C':[7,8,9,4,2,3]})
print (dataset)
B C TweetID
0 4 7 a
1 5 8 b
2 4 9 c
3 5 4 d
4 5 2 e
5 4 3 f
dataset = dataset.set_index("TweetID")
print(dataset.index)
Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object', name='TweetID')
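As for why the KeyError appears: set_index() defaults to drop=True, so the column is moved out of the columns and into the index, and dataset["TweetID"] no longer finds it. The following is a minimal sketch of two ways around that, using a toy frame in place of the CSV from the question; the variable names indexed, kept and restored are only illustrative.

```python
import pandas as pd

# Hypothetical toy frame standing in for the CSV from the question.
dataset = pd.DataFrame({"TweetID": list("abcdef"),
                        "B": [4, 5, 4, 5, 5, 4],
                        "C": [7, 8, 9, 4, 2, 3]})

# Default drop=True: TweetID leaves the columns and becomes the index.
indexed = dataset.set_index("TweetID")
print("TweetID" in indexed.columns)  # False, which is why [] raises KeyError
print(indexed.index.name)            # TweetID

# Option 1: keep TweetID as a column in addition to using it as the index.
kept = dataset.set_index("TweetID", drop=False)
print(kept["TweetID"].tolist())

# Option 2: turn the index back into an ordinary column.
restored = indexed.reset_index()
print(restored["TweetID"].tolist())
```

Which option fits depends on whether the rest of the code needs TweetID as a regular column or only as row labels.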
There are two ways to get a Series: Index.to_series, or the Series constructor (which gets a default RangeIndex if no index is specified):
print(dataset.index.to_series())
TweetID
a a
b b
c c
d d
e e
f f
Name: TweetID, dtype: object
print(pd.Series(dataset.index))
0 a
1 b
2 c
3 d
4 e
5 f
Name: TweetID, dtype: object
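The two results hold the same values but carry different index labels, which matters as soon as pandas aligns on labels. A small sketch, assuming a three-row toy frame and illustrative column names (from_to_series, from_constructor) that are not part of the original answer:

```python
import pandas as pd

# Hypothetical toy frame, already indexed by TweetID.
dataset = pd.DataFrame({"TweetID": list("abc"),
                        "C": [7, 8, 9]}).set_index("TweetID")

aligned = dataset.index.to_series()    # index labels: a, b, c
renumbered = pd.Series(dataset.index)  # index labels: 0, 1, 2

out = dataset.copy()
out["from_to_series"] = aligned        # labels match, values are copied
out["from_constructor"] = renumbered   # no label matches, column is all NaN
print(out)
```

Column assignment aligns the Series to the frame's index, so the to_series() result is usually the safer choice when the Series goes back into the same DataFrame.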
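If the index is a MultiIndex, you can select a level by name:
# reset_index() first, because TweetID is already the index at this point
dataset = dataset.reset_index().set_index(["TweetID", 'B'])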
print(dataset)
C
TweetID B
a 4 7
b 5 8
c 4 9
d 5 4
e 5 2
f 4 3
print(dataset.index.get_level_values('TweetID'))
Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object', name='TweetID')
Or by position:
print(dataset.index.get_level_values(0))
Index(['a', 'b', 'c', 'd', 'e', 'f'], dtype='object', name='TweetID')
(This also works with a single-level index, but there .index is enough.)
Regarding "python - Pandas KeyError after set_index()", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50332279/