Create an object with pandas
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame(np.random.randn(6,4),index=pd.date_range('2018-01-01',periods=6),columns=list('ABCD'))
In [4]: df
Out[4]:
                   A         B         C         D
2018-01-01 -0.603510  0.269480  0.197354 -0.433003
2018-01-02  1.230502  0.474616  1.473517 -0.627363
2018-01-03 -0.402034  0.569097  0.675872 -0.317995
2018-01-04  0.220638  0.527543 -1.140620 -0.348089
2018-01-05 -2.494331  0.593269  0.596578  1.653347
2018-01-06 -2.766239 -0.919777  0.462890  0.156048
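Suppose you want the third row of data.
If you carry over the habit of Python list indexing and try to take the row directly with an integer, you will need a different approach:
In [5]: df[2]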
KeyError Traceback (most recent call last)
D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3062 try:
-> 3063 return self._engine.get_loc(key)
   3064 except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 2

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
<ipython-input-5-b5f2749c85df> in <module>()
----> 1 df[2]

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2683 return self._getitem_multilevel(key)
   2684 else:
-> 2685 return self._getitem_column(key)
   2686
   2687 def _getitem_column(self, key):

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2690 # get column
   2691 if self.columns.is_unique:
-> 2692 return self._get_item_cache(key)
   2693
   2694 # duplicate columns & possible reduce dimensionality

D:\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   2484 res = cache.get(item)
   2485 if res is None:
-> 2486 values = self._data.get(item)
   2487 res = self._box_item_values(item, values)
   2488 cache[item] = res

D:\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   4113
   4114 if not isna(item):
-> 4115 loc = self.items.get_loc(item)
   4116 else:
   4117 indexer = np.arange(len(self.items))[isna(self.items)]

D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3063 return self._engine.get_loc(key)
   3064 except KeyError:
-> 3065 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3066
   3067 indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 2
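df[2] fails with a KeyError (not a syntax error): a single value inside plain square brackets is looked up as a column label, and this DataFrame has no column named 2.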
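To make the failure concrete, here is a minimal sketch. It assumes a small frame filled with fixed values from np.arange (illustrative data, not the random numbers above) so the result is reproducible. It shows that a scalar in plain brackets selects a column, which is why the integer 2 is rejected.

```python
import numpy as np
import pandas as pd

# Illustrative frame with fixed values (an assumption for reproducibility).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("2018-01-01", periods=6),
    columns=list("ABCD"),
)

# A scalar inside [] is treated as a column label, so "A" works.
col_a = df["A"]

# The integer 2 is not a column label here, so the lookup raises KeyError.
try:
    df[2]
    failed = False
except KeyError:
    failed = True

print(failed)
```

The same rule explains why the slice forms still work: a slice inside the brackets is applied to rows, not columns.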
There are actually several correct ways to do it:
Method 1:
In [6]: df[2:3]
Out[6]:
                   A         B         C         D
2018-01-03 -0.402034  0.569097  0.675872 -0.317995
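Method 2:
In [7]: df['2018-01-03':'2018-01-03'] # must be written as a slice; a single date string is looked up as a column label and raises KeyError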
Out[7]:
                   A         B         C         D
2018-01-03 -0.402034  0.569097  0.675872 -0.317995
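The two slicing methods return the same row here, but they treat their endpoints differently. The sketch below uses a frame with fixed values (an assumption, to keep the result reproducible) to show that an integer slice excludes its stop position while a label slice includes both ends.

```python
import numpy as np
import pandas as pd

# Illustrative frame with fixed values (an assumption for reproducibility).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("2018-01-01", periods=6),
    columns=list("ABCD"),
)

# Integer slice: position 2 up to, but not including, position 3.
by_position = df[2:3]

# Label slice: both endpoints are included.
by_label = df["2018-01-03":"2018-01-03"]

# Both select the single row for 2018-01-03.
same = by_position.equals(by_label)

# A label slice across two dates returns two rows, because the end is inclusive.
two_rows = df["2018-01-03":"2018-01-04"]

print(same, len(two_rows))
```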
The Python-style single-index lookup above did not work, which brings us to today's focus: loc, iloc, and ix.
(1) .loc: select by label
In [8]: df.loc['2018/01/03']
Out[8]:
A -0.402034
B 0.569097
C 0.675872
D -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
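.loc also accepts a column label and label slices. The following is a small sketch on a frame with fixed values (an assumption, so the numbers are predictable), not an exhaustive tour of what .loc supports.

```python
import numpy as np
import pandas as pd

# Illustrative frame with fixed values (an assumption for reproducibility).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("2018-01-01", periods=6),
    columns=list("ABCD"),
)

# One row by its index label: returns a Series indexed by column name.
row = df.loc["2018-01-03"]

# One cell by row label and column label.
cell = df.loc["2018-01-03", "B"]

# A block: label slice of rows (both ends included) and a list of column labels.
block = df.loc["2018-01-02":"2018-01-04", ["A", "C"]]

print(list(row), cell, block.shape)
```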
(2) .iloc: select by integer position
In [9]: df.iloc[2]
Out[9]:
A -0.402034
B 0.569097
C 0.675872
D -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
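.iloc follows the usual Python positional rules: zero-based, stop-exclusive slices, and negative positions counted from the end. A short sketch on a frame with fixed values (an assumption, for reproducibility):

```python
import numpy as np
import pandas as pd

# Illustrative frame with fixed values (an assumption for reproducibility).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("2018-01-01", periods=6),
    columns=list("ABCD"),
)

# Third row (position 2).
row = df.iloc[2]

# Cell at row position 2, column position 1 (column "B").
cell = df.iloc[2, 1]

# Rows at positions 1..3 (stop excluded) and columns at positions 0 and 2.
block = df.iloc[1:4, [0, 2]]

# Negative positions count from the end, as with Python lists.
last = df.iloc[-1]

print(list(row), cell, block.shape, list(last))
```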
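(3) .ix: mixed indexing, accepting either labels or integer positions (deprecated in pandas 0.20.0 and removed in pandas 1.0.0)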
In [10]: df.ix['2018/01/03']
Out[10]:
A -0.402034
B 0.569097
C 0.675872
D -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
In [11]: df.ix[2]
Out[11]:
A -0.402034
B 0.569097
C 0.675872
D -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
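Because .ix was removed in pandas 1.0.0, the two calls above no longer run on current versions. One possible way to get the same mixed behaviour is to translate between labels and positions explicitly and then use .loc or .iloc. This is a sketch on a frame with fixed values (an assumption), not the only way to migrate.

```python
import numpy as np
import pandas as pd

# Illustrative frame with fixed values (an assumption for reproducibility).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("2018-01-01", periods=6),
    columns=list("ABCD"),
)

# Equivalent of df.ix['2018/01/03']: a label lookup, so use .loc.
row_by_label = df.loc["2018/01/03"]

# Equivalent of df.ix[2]: a positional lookup, so use .iloc.
row_by_position = df.iloc[2]

# Mixed case (row by position, column by label), written two ways:
# convert the row position to its label, then use .loc ...
cell_via_loc = df.loc[df.index[2], "B"]
# ... or convert the column label to its position, then use .iloc.
cell_via_iloc = df.iloc[2, df.columns.get_loc("B")]

print(row_by_label.equals(row_by_position), cell_via_loc, cell_via_iloc)
```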