我正在尝试从包含多个表的this URL刮取内容。所需的输出将是:

NAME        FG% FT% 3PM REB AST STL BLK TO  PTS     SCORE
Team Jackson (0-8)      .4313   .7500   21  71  34  11  12  15  189     1-8-0
Team Keyrouze (4-4)     .4441   .8090   31  130 71  18  13  45  373     8-1-0
Nutz Vs. Draymond Green (4-4)       .4292   .8769   30  86  66  15  9   28  269     3-6-0
Team Pauls 2 da Wall (3-5)      .4784   .8438   40  123 64  18  20  30  316     6-3-0
Team Noey (2-6)     .4350   .7679   21  125 62  20  9   33  278     7-2-0
YOU REACH, I TEACH (2-5-1)      .4810   .7432   20  114 56  30  7   50  277     2-7-0
Kris Kaman His Pants (5-3)      .4328   .8000   20  74  59  20  5   27  238     3-6-0
Duke's Balls In Daniels Face (3-4-1)        .5000   .7045   42  139 38  27  22  30  303     6-3-0
Knicks Tape (5-3)       .5000   .8152   34  143 92  12  9   47  397     4-5-0
Suck MyDirk (5-3)       .4734   .8814   29  106 86  22  17  40  435     5-4-0
In Porzingod We Trust (4-4)     .4928   .7222   27  180 95  16  16  46  423     7-2-0
Team Aguilar (6-1-1)        .4718   .7053   28  177 65  12  35  48  413     2-7-0
Team Li (7-0-1)     .4714   .8118   35  134 74  17  17  47  368     6-3-0
Team Iannetta (4-4)     .4527   .7302   22  125 90  20  13  44  288     3-6-0


如果用这种方式格式化表格太困难了,我想知道如何抓取所有表格?我刮所有行的代码是这样的:

tableStats = soup.find('table', {'class': 'tableBody'})
rows = tableStats.findAll('tr')

for row in rows:
    print(row.string)


但是它只显示值“ TEAM”,什么也没有显示...为什么它不包含表中的所有行?

谢谢。

最佳答案

代替寻找table标记,您应该直接寻找具有更高可靠性的class(例如linescoreTeamRow)的行。该代码段可以解决问题,

from bs4 import BeautifulSoup
import requests
a = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
soup = BeautifulSoup(a.text, 'lxml')
# searching for the rows directly
rows = soup.findAll('tr', {'class': 'linescoreTeamRow'})
# you will need to isolate elements in the row for the table
for row in rows:
    print row.text

07-28 09:31