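I'm currently getting output in the order A, A, B, B instead of A, B, A, B.
What I really want is to associate each table header value with its corresponding table data element (like a dictionary).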
import requests
from bs4 import BeautifulSoup
courseCode = "IFB104"
page = requests.get("https://www.qut.edu.au/study/unit?unitCode=" + courseCode)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find_all(class_='table assessment-item')
numOfTables = 0
tableDataArray = []
for tbl in table:
    numOfTables = numOfTables + 1
    tableDataArray += [tbl.find_all('th'), tbl.find_all('td')]
Best answer
If I understand correctly, you need to use a dict instead of a list:
import requests
from bs4 import BeautifulSoup
courseCode = "IFB104"
page = requests.get("https://www.qut.edu.au/study/unit?unitCode=" + courseCode)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find_all(class_='table assessment-item')
numOfTables = 0
tableFormatted1 = []
tableFormatted2 = {}
for tbl in table:
    numOfTables = numOfTables + 1
    keys = tbl.find_all('th')
    values = tbl.find_all('td')
    new_data = dict(zip(keys, values))
    # Method 1
    tableFormatted1.append(new_data)
    # Method 2
    for k, v in new_data.items():
        if k in tableFormatted2:
            tableFormatted2[k].append(v)
        else:
            tableFormatted2[k] = [v]
print('List of dictionaries')
print(tableFormatted1)
print('')
print('Dictionary with list')
print(tableFormatted2)
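To make the difference between the structures easier to see without fetching the page, here is a minimal sketch that uses plain strings in place of the `th` and `td` elements. The `tables` data is made up for illustration and is not taken from the QUT page. With real BeautifulSoup tags you would probably call `get_text(strip=True)` on each element first so that the keys and values are strings rather than tags. `defaultdict` is used here only as a shorter way to write the if/else from Method 2.

```python
from collections import defaultdict

# Hypothetical stand-ins for the text of each table's <th> and <td> cells.
tables = [
    (["Name", "Weight"], ["Quiz", "20%"]),
    (["Name", "Weight"], ["Exam", "80%"]),
]

# What the question's code does: for every table, the headers and the cells
# are added as two separate groups, so they are never paired up.
flat = []
for headers, cells in tables:
    flat += [headers, cells]

# Method 1: one dict per table, pairing each header with its cell.
rows = [dict(zip(headers, cells)) for headers, cells in tables]

# Method 2: a single dict, each header mapping to a list of values.
columns = defaultdict(list)
for headers, cells in tables:
    for header, cell in zip(headers, cells):
        columns[header].append(cell)

print(flat)
print(rows)
print(dict(columns))
```

Method 1 keeps the tables separate, which is convenient if you want to process one assessment item at a time. Method 2 is more convenient if you want all values for a given header in one place.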
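Edit:
Each iteration over tbl overwrites the result of the iteration before it, so the structure has to change. I have just provided two approaches.
Regarding "python - Merging Python arrays in a loop?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44483977/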