Question
I am trying to scrape a table and write it into a DataFrame, but the code below raises a TypeError. How can I resolve this error?
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.select import Select
from selenium import webdriver
import pandas as pd
temp=[]
driver= webdriver.Chrome(r'C:\Program Files (x86)\chromedriver.exe')
driver.get("https://www.fami-qs.org/certified-companies-6-0.html")
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[title='Inline Frame Example']")))
headers=WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@id='sites']//thead"))).text
rows=WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@id='sites']//tbody"))).text
temp.append(rows)
df = pd.DataFrame(temp,columns=headers)
print(df)
In headers I pass the data FAMI-QS Number ... Expiry date, while in rows I pass FAM-0694 ... 2022-09-04.
Recommended answer
To scrape all the data from all the columns, you need to induce WebDriverWait for the visibility_of_element_located() of the <table> element, extract its outerHTML, and read the outerHTML using read_html(). You can use the following locator strategy:
Code Block:
driver.get("https://www.fami-qs.org/certified-companies-6-0.html")
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[title='Inline Frame Example']")))
data = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#sites"))).get_attribute("outerHTML")
df = pd.read_html(data)
print(df)
driver.quit()
Console Output:
[  FAMI-QS Number                             Site Name              City  ...  Status  Certified from  Expiry date
0        FAM-1293                    AmTech Ingredients        albert lea  ...   Valid      2020-10-08   2023-10-07
1        FAM-0841                    3F FEED & FOOD S L       vizcolozano  ...   Valid      2020-04-17   2023-04-16
2        FAM-1361                5N Plus Additives GmbH  eisenhüttenstadt  ...   Valid      2020-10-01   2023-09-30
3     FAM-1301-01                   A & V Corp. Limited            xiamen  ...   Valid      2020-09-09   2023-09-08
4        FAM-1146  A. + E. Fischer-Chemie GmbH & Co. KG         wiesbaden  ...   Valid      2020-06-05   2023-06-04
5        FAM-1589          A.M FOOD CHEMICAL CO LIMITED             jinan  ...   Valid      2020-01-07   2023-01-06
6     FAM-0613-01                          A.W.P. S.r.l        crevalcore  ...   Valid      2020-02-27   2023-02-07
7        FAM-0867             AB AGRI POLSKA Sp. z o.o.           smigiel  ...   Valid      2020-08-03   2023-03-19
8     FAM-1510-02                              AB Vista       marlborough  ...   Valid      2020-04-16   2023-04-15
9     FAM-1510-01                            AB Vista *         rotterdam  ...   Valid      2020-04-16   2023-04-15
[10 rows x 7 columns]]
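Two details may be worth spelling out. In the question's code, .text returns a single string, while the columns argument of pd.DataFrame expects a list-like of column names, which is the likely source of the TypeError. Also, read_html() returns a list of DataFrames, one per table it finds, which is why the console output is wrapped in [ ]. The following is a minimal sketch of picking the first table out of that list. The HTML snippet is a hand-written stand-in for the outerHTML of table#sites (an assumption, so the example runs without a browser), and read_html() needs an HTML parser such as lxml installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the outerHTML returned by Selenium;
# the real table has more columns and rows.
html = """
<table id="sites">
  <thead>
    <tr><th>FAMI-QS Number</th><th>Expiry date</th></tr>
  </thead>
  <tbody>
    <tr><td>FAM-0694</td><td>2022-09-04</td></tr>
  </tbody>
</table>
"""

# read_html() returns a list with one DataFrame per <table> found.
tables = pd.read_html(StringIO(html))

# Take the first (here: only) table to get a plain DataFrame.
df = tables[0]
print(df)
```

In the answer's code this would correspond to df = pd.read_html(data)[0], assuming the page contains the single table selected by table#sites.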