I have a DataFrame:
Energy Supply Energy Supply per Capita % Renewable
Country
Afghanistan 3.210000e+08 10 78.669280
Albania 1.020000e+08 35 100.000000
British Virgin Islands 2.000000e+06 85 0.000000
...
Aruba 1.200000e+07 120 14.870690 ...
Excludes the overseas territories. NaN NaN NaN
Data exclude Hong Kong and Macao Special Admini... NaN NaN NaN
Data on kerosene-type jet fuel include aviation... NaN NaN NaN
For confidentiality reasons, data on coal and c... NaN NaN NaN
Data exclude Greenland and the Danish Faroes. NaN NaN NaN
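I used
df = pd.read_excel(filelink, skiprows=16)
to cut the unneeded information at the beginning of the file, but how do I get rid of the "noise" rows at the end of df? I tried passing a list to skiprows, but it messed up the results.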
Best answer
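It seems you need the skipfooter parameter of read_excel (spelled skip_footer in older pandas releases; that spelling was deprecated in pandas 0.23 and later removed):
skipfooter : int, default 0
Rows at the end to skip (0-indexed)
Sample:
df = pd.read_excel('myfile.xlsx', skipfooter=5)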
print (df)
Country Energy Supply Energy Supply per Capita \
0 Afghanistan 321000000.0 10
1 Albania 102000000.0 35
2 British Virgin Islands 2000000.0 85
3 Aruba 12000000.0 120
% Renewable
0 78.66928
1 100.00000
2 0.00000
3 14.87069
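To see the effect of dropping footer rows without having the Excel file at hand, here is a minimal sketch on an in-memory frame. It trims the trailing rows with iloc after loading, which is an alternative to skipfooter rather than the answer's method. The frame is a stand-in built from the rows shown in the question, and n_footer is an assumed count of footnote rows.

```python
import numpy as np
import pandas as pd

# Stand-in for what read_excel returns: four data rows, then five footnote rows
# (footnote text copied as truncated in the question).
df = pd.DataFrame(
    {
        "Country": [
            "Afghanistan",
            "Albania",
            "British Virgin Islands",
            "Aruba",
            "Excludes the overseas territories.",
            "Data exclude Hong Kong and Macao Special Admini...",
            "Data on kerosene-type jet fuel include aviation...",
            "For confidentiality reasons, data on coal and c...",
            "Data exclude Greenland and the Danish Faroes.",
        ],
        "Energy Supply": [3.21e8, 1.02e8, 2.0e6, 1.2e7] + [np.nan] * 5,
        "Energy Supply per Capita": [10, 35, 85, 120] + [np.nan] * 5,
        "% Renewable": [78.66928, 100.0, 0.0, 14.87069] + [np.nan] * 5,
    }
)

# Assumed number of footnote rows at the end of the sheet.
# Must be greater than 0: df.iloc[:-0] would return an empty frame.
n_footer = 5

# Keep everything except the last n_footer rows.
trimmed = df.iloc[:-n_footer]
print(trimmed)
```

This only works when the number of footer rows is known in advance, the same requirement skipfooter has.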
Another solution is to use dropna to remove every row in which all of the selected columns are NaN:
df = pd.read_excel('myfile.xlsx')
cols = ['Energy Supply','Energy Supply per Capita','% Renewable']
df = df.dropna(subset=cols, how='all')
print (df)
Country Energy Supply Energy Supply per Capita \
0 Afghanistan 321000000.0 10.0
1 Albania 102000000.0 35.0
2 British Virgin Islands 2000000.0 85.0
3 Aruba 12000000.0 120.0
% Renewable
0 78.66928
1 100.00000
2 0.00000
3 14.87069
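The how='all' argument matters here: a row is dropped only when every column in subset is NaN, whereas how='any' (the default) would also drop data rows that are merely incomplete. The sketch below illustrates the difference on a small in-memory frame; the "Hypothetical Land" row is invented for illustration and does not come from the question's data.

```python
import numpy as np
import pandas as pd

cols = ["Energy Supply", "Energy Supply per Capita", "% Renewable"]

df = pd.DataFrame(
    {
        "Country": [
            "Afghanistan",
            "Albania",
            "Hypothetical Land",  # invented row with partially missing data
            "Excludes the overseas territories.",  # footnote row, all NaN
        ],
        "Energy Supply": [3.21e8, 1.02e8, 5.0e6, np.nan],
        "Energy Supply per Capita": [10, 35, np.nan, np.nan],
        "% Renewable": [78.66928, 100.0, np.nan, np.nan],
    }
)

# how="all": drop a row only if every column in cols is NaN (removes the footnote).
kept_all = df.dropna(subset=cols, how="all")

# how="any": drop a row if at least one column in cols is NaN
# (also removes the partially filled data row).
kept_any = df.dropna(subset=cols, how="any")

print(kept_all["Country"].tolist())
print(kept_any["Country"].tolist())
```

Unlike skipping a fixed number of footer rows, this approach does not require knowing how many footnote lines the sheet contains.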
Regarding "python - how to skip rows at the end of an xls file in a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43908462/