


Can anybody explain why loc is used in Python pandas, with an example like the one shown below?

for i in range(0, 2):
  for j in range(0, 3):
    df.loc[(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1),
            'AgeFill'] = median_ages[i,j]



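The use of .loc is recommended here because indexing a DataFrame in two steps may operate on a view of the data or on a copy, and pandas cannot always tell which. The conditions df.Age.isnull(), df.Gender == i and df.Pclass == j+1 each return a boolean Series; combined with &, they form a single row mask that .loc applies together with the column label 'AgeFill'.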


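If you don't use .loc, you end up selecting the column and the rows in two separate indexing operations, one after the other. This is called chained indexing, and the assignment may land on a temporary copy instead of on df. With .loc, the row conditions and the column are passed in a single call, so the assignment is applied to df itself and pandas is no longer confused.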
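
As a minimal sketch of the single-step assignment, the loop from the question can be run end to end on a small made-up frame. The sample data and the way median_ages is built here are assumptions for illustration; the question shows neither.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original question does not show df.
df = pd.DataFrame({
    "Age":    [22.0, np.nan, 30.0, np.nan, 40.0, 50.0, np.nan],
    "Gender": [0, 0, 0, 1, 1, 1, 1],
    "Pclass": [1, 1, 1, 3, 3, 3, 3],
})

# Assumed construction of median_ages: one median per (Gender, Pclass) pair.
# Pairs with no rows simply end up as NaN.
median_ages = np.zeros((2, 3))
for i in range(0, 2):
    for j in range(0, 3):
        group = (df.Gender == i) & (df.Pclass == j + 1)
        median_ages[i, j] = df.loc[group, "Age"].median()

# Start AgeFill as a copy of Age, then fill only the missing entries.
df["AgeFill"] = df["Age"].copy()
for i in range(0, 2):
    for j in range(0, 3):
        # One boolean row mask plus one column label, passed in a single .loc call.
        mask = df.Age.isnull() & (df.Gender == i) & (df.Pclass == j + 1)
        df.loc[mask, "AgeFill"] = median_ages[i, j]

print(df)
```

The original Age column is left untouched; only AgeFill receives the group medians.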



The simple answer is that while you can often get away with not using .loc and simply typing (for example)

df['AgeFill'][(df.Age.isnull()) & (df.Gender == i) & (df.Pclass == j+1)] \
    = median_ages[i,j]


you'll usually get the SettingWithCopyWarning, the assignment is not guaranteed to reach df, and your code will be a little messier for it.
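
To see why chained assignment is fragile, here is a small sketch that uses the rows-first variant of chaining (df[mask]['AgeFill'] = ...), not the exact expression above. Selecting rows with a boolean mask produces a new object, so the write is lost. The tiny frame is hypothetical, and warnings are silenced only to keep the output clean.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical three-row frame with one missing age.
df = pd.DataFrame({"Age": [22.0, np.nan, 30.0],
                   "AgeFill": [22.0, np.nan, 30.0]})
mask = df.Age.isnull()

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Chained: df[mask] builds a new object, so the write never reaches df.
    df[mask]["AgeFill"] = 26.0
chained_still_missing = bool(df["AgeFill"].isnull().any())

# Single step: rows and column are resolved together on df itself.
df.loc[mask, "AgeFill"] = 26.0
loc_filled = df["AgeFill"].tolist()

print(chained_still_missing, loc_filled)
```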


In my experience, .loc took me a while to get my head around, and it's been a bit annoying updating my code. But it's really simple and intuitive: df.loc[row_indexer, col_indexer].
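
A short sketch of a few forms the two indexers can take. The frame and its index labels 'a', 'b', 'c' are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame(
    {"Age": [22, 38, 26], "Pclass": [3, 1, 3]},
    index=["a", "b", "c"],
)

one_cell = df.loc["b", "Age"]               # single row label, single column label
label_slice = df.loc["a":"b", "Age"]        # label slices include both endpoints
masked = df.loc[df.Pclass == 3, ["Age"]]    # boolean mask for rows, list of columns

print(one_cell)
print(label_slice)
print(masked)
```

The same row and column indexers work on the left-hand side of an assignment, which is what the loop in the question relies on.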


For more information see the pandas documentation on Indexing and Selecting Data.

