I'm trying to scrape a table from the ESPN website, but I can't find the right tag or class name to select it.


import requests
from bs4 import BeautifulSoup

url = "https://www.espn.com/nba/stats/player/_/table/offensive/sort/avgAssists/dir/desc"
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
soup.find_all('table', class_="ResponsiveTable ResponsiveTable--fixed-left mt4 Table2__title--remove-capitalization")


This code just gives me an empty list :(

Best answer

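If the page contains real table tags, let pandas do the work for you. pandas.read_html finds every table on the page and parses it for you, using lxml by default and falling back to BeautifulSoup (with html5lib) if lxml fails.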

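import pandas as pd

url = "https://www.espn.com/nba/stats/player/_/table/offensive/sort/avgAssists/dir/desc"

# read_html returns a list with one DataFrame per table found on the page
dfs = pd.read_html(url)

# Here the first table holds rank and name, the second the stats; join them on the row index
df = dfs[0].join(dfs[1])
# The Name cell ends with the team abbreviation; move the trailing capitals into a Team column
df[['Name','Team']] = df['Name'].str.extract('^(.*?)([A-Z]+)$', expand=True)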
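
To see what the regular expression in the last line does, here is a small standalone sketch using only the standard library. It assumes, as the answer's pattern implies, that the raw Name cell is the player name immediately followed by the team abbreviation (for example "LeBron JamesLAL"). The helper name split_name_team and the second sample string are made up for illustration.

```python
import re

# Same pattern as in the answer: a lazy prefix, then a trailing run of capitals
NAME_TEAM = re.compile(r"^(.*?)([A-Z]+)$")

def split_name_team(cell):
    """Split 'LeBron JamesLAL' into ('LeBron James', 'LAL')."""
    match = NAME_TEAM.match(cell)
    # Fall back to the untouched cell if there is no trailing run of capitals
    return match.groups() if match else (cell, None)

print(split_name_team("LeBron JamesLAL"))       # ('LeBron James', 'LAL')

# Caveat: a name that itself ends in capitals loses them to the team group
print(split_name_team("Glenn Robinson IIIGS"))  # ('Glenn Robinson ', 'IIIGS')
```

The second call shows the pattern's limit: it cannot tell a suffix such as "III" from the team code, so rows like that would need separate handling.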


Output:

print(df.head(5).to_string())
   RK          Name POS  GP   MIN   PTS  FGM   FGA   FG%  3PM  3PA   3P%  FTM  FTA   FT%  REB   AST  STL  BLK   TO  DD2  TD3    PER Team
0   1  LeBron James  SF  35  35.1  24.9  9.6  19.7  48.6  2.0  6.0  33.8  3.7  5.5  67.7  7.9  11.0  1.3  0.5  3.7   28    9  26.10  LAL
1   2   Ricky Rubio  PG  30  32.0  13.6  4.9  11.9  41.3  1.2  3.7  31.8  2.6  3.1  83.7  4.6   9.3  1.3  0.2  2.5   12    1  16.40  PHX
2   3   Luka Doncic  SF  32  32.8  29.7  9.6  20.2  47.5  3.1  9.4  33.1  7.3  9.1  80.5  9.7   8.9  1.2  0.2  4.2   22   11  31.74  DAL
3   4   Ben Simmons  PG  36  35.4  14.9  6.1  10.8  56.3  0.1  0.1  40.0  2.7  4.6  59.0  7.5   8.6  2.2  0.7  3.6   19    3  19.49  PHI
4   5    Trae Young  PG  34  35.1  28.9  9.3  20.8  44.8  3.5  9.4  37.5  6.7  7.9  85.0  4.3   8.4  1.2  0.1  4.8   11    1  23.47  ATL
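
As a side note on the original BeautifulSoup attempt: one plausible reason for the empty list is that the long class string belongs to a wrapper element around the tables rather than to a table tag itself. The markup below is invented to illustrate that situation and is not copied from ESPN, so treat it as a sketch of how class_ matching behaves, not as a description of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the "ResponsiveTable ..." classes sit on a wrapper div,
# while the <table> elements inside carry different classes.
html = """
<div class="ResponsiveTable ResponsiveTable--fixed-left">
  <table class="Table Table--fixed-left"><tr><td>LeBron James</td></tr></table>
  <table class="Table Table--align-right"><tr><td>11.0</td></tr></table>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Asking for <table> tags with the wrapper's classes matches nothing
wrapper_as_table = soup.find_all("table", class_="ResponsiveTable ResponsiveTable--fixed-left")

# A single class name matches every tag that has it among its classes
by_single_class = soup.find_all("table", class_="Table")

# A CSS selector can express "tables inside the wrapper" directly
by_selector = soup.select("div.ResponsiveTable table")

print(len(wrapper_as_table), len(by_single_class), len(by_selector))  # 0 2 2
```

Checking in the browser's inspector which element actually carries the class is usually the quickest way to confirm whether this is what happened.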
