Question
I'm taking an online class to learn Python, and the instructor told us that chained indexing is not a good idea. However, he did not tell us what the appropriate alternative is.
Suppose I have a Pandas data frame with rows indexed as ['1', '2', '3'] and columns with names ['a', 'b', 'c'].
What's the appropriate alternative to using the command df['1']['a'] to extract the value found in the first row and first column?
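Recommended answer
Use multi-axis indexing, for example df.loc['a', '1']. Note that df.loc takes the row label first and the column label second, while df['1'] selects the column labeled '1'. So df.loc['a', '1'] is the direct equivalent of df['1']['a'] when '1' is a column label and 'a' is a row label. For a frame laid out exactly as described in the question (rows '1', '2', '3' and columns 'a', 'b', 'c'), the lookup would be df.loc['1', 'a'].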
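As a minimal sketch of the equivalence, the example below builds a small frame and reads the same cell both ways. The frame contents and the variable names are made up for illustration; the layout (columns '1', '2', '3' and rows 'a', 'b', 'c') is an assumption chosen so that df['1']['a'] is a valid lookup.

```python
import pandas as pd

# Hypothetical data; columns are '1', '2', '3' and rows are 'a', 'b', 'c',
# which is the layout under which df['1']['a'] works.
df = pd.DataFrame(
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
    index=["a", "b", "c"],
    columns=["1", "2", "3"],
)

chained = df["1"]["a"]     # chained indexing: column '1', then row 'a'
direct = df.loc["a", "1"]  # multi-axis indexing: row 'a', column '1'

print(chained, direct)
```

Both expressions read the same cell; only the df.loc form does it with a single indexing operation.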
When you use df['1']['a'], you are first accessing the Series object s = df['1'] and then accessing the Series element s['a'], resulting in two __getitem__ calls, both of which are heavily overloaded (they handle many scenarios, such as slicing, boolean mask indexing, and so on).
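The two steps can be made explicit. The sketch below uses a hypothetical frame (values and names are illustrative only) and spells out the intermediate Series as well as the two __getitem__ calls that the chained form performs.

```python
import pandas as pd

# Hypothetical frame with columns '1', '2' and rows 'a', 'b', 'c'.
df = pd.DataFrame(
    {"1": [10, 40, 70], "2": [20, 50, 80]},
    index=["a", "b", "c"],
)

s = df["1"]     # first __getitem__: DataFrame -> Series (column '1')
value = s["a"]  # second __getitem__: Series -> scalar (row 'a')

# The chained form written out as the two method calls it performs.
explicit = df.__getitem__("1").__getitem__("a")

# One indexing call on the .loc indexer, with row and column labels together.
single = df.loc["a", "1"]

print(type(s).__name__, value, explicit, single)
```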
It's much more efficient to use the df.loc indexer, which looks up the row and column labels in a single indexing call.