For loop keeps iterating over the first item in the list

Problem description

This code scrapes the HTML table from https://www.asx.com.au/asx/statistics/prevBusDayAnns.do and downloads PDF files for specific ASX codes and headlines. When the for loop iterates over the ASX codes found in 'data', it processes the first ASX code five times, which creates five duplicates of the same PDF. For example, with the code below there would be five copies of the TWD PDF. The number of times the loop repeats the first ASX code equals the number of ASX codes in 'data': if there were ten codes, I would end up with ten copies of the TWD PDF. This happens only for the first ASX code; everything else is fine. Is there any reason why this is happening?

Relevant code:

driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1', 'TKF', 'GDF']
asxcodes = []
for d in data:
    try:
        # Find the cell holding this ASX code, then the headline link three cells to its right.
        asxcode = driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d, "Becoming a substantial holder")).get_attribute("href")
        asxcodes.append(asxcode)
    except Exception:
        # No matching announcement for this code; skip it.
        pass
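
The locator in the snippet above packs three steps into one string: find the table cell whose text equals the ASX code, move to the third following sibling cell in the same row, and pick the link whose text contains the headline. Below is a minimal sketch of that string construction, pulled out into a helper so it can be inspected without a browser. The function name `build_headline_xpath` is hypothetical; only the XPath template comes from the original code.

```python
def build_headline_xpath(asx_code, headline):
    """Return the XPath from the question for one ASX code and headline.

    Hypothetical helper: it is not part of the original code or of Selenium.
    """
    return (
        "//table//tr//td[text()='{}']"   # cell whose text is the ASX code
        "/following-sibling::td[3]"      # third cell after it in the same row
        "/a[contains(.,'{}')]"           # link whose text contains the headline
    ).format(asx_code, headline)


if __name__ == "__main__":
    print(build_headline_xpath("TWD", "Becoming a substantial holder"))
```

The returned string can be passed unchanged to `driver.find_element_by_xpath(...)`, as the original loop does.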

Full code:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
# Save PDFs to the download folder instead of opening them in Chrome's viewer.
chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.always_open_pdf_externally": True, "download.default_directory": r"C:\Users\Harrison Pollock\Desktop\The Smarts\Becoming a Substantial Holder"}
chromeOptions.add_experimental_option("prefs", prefs)
chromedriver = r"C:\Users\Harrison Pollock\Downloads\Python\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chromedriver, chrome_options=chromeOptions)
driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1', 'TKF', 'GDF']
asxcodes = []
# Collect the announcement link for each ASX code.
for d in data:
    try:
        asxcode = driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d, "Becoming a substantial holder")).get_attribute("href")
        asxcodes.append(asxcode)
    except Exception:
        # No matching announcement for this code; skip it.
        pass
# Visit each collected link and click the terms button to start the download.
for asxcode in asxcodes:
    driver.get(asxcode)
    WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='Agree and proceed']"))).click()
    time.sleep(10)  # fixed pause to let the download finish

Recommended answer

Instead of collecting all the href values and then iterating over them, could you try clicking each link and then clicking to download?

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
# Save PDFs to the download folder instead of opening them in Chrome's viewer.
chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.always_open_pdf_externally": True, "download.default_directory": r"C:\Users\Harrison Pollock\Desktop\The Smarts\Becoming a Substantial Holder"}
chromeOptions.add_experimental_option("prefs", prefs)
chromedriver = r"C:\Users\Harrison Pollock\Downloads\Python\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chromedriver, chrome_options=chromeOptions)
driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1', 'TKF', 'GDF']
for d in data:
    try:
        # Click the headline link for this ASX code and wait for the second window it opens.
        driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d, "Becoming a substantial holder")).click()
        WebDriverWait(driver, 5).until(EC.number_of_windows_to_be(2))
        driver.switch_to.window(driver.window_handles[-1])
        # Click the terms button in the new window to start the download.
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='Agree and proceed']"))).click()
        time.sleep(10)  # fixed pause to let the download finish
        # Close the new window and switch to the last remaining one.
        driver.close()
        driver.switch_to.window(driver.window_handles[-1])
    except Exception:
        # Link not found or a wait timed out: switch to the last window and move on.
        driver.switch_to.window(driver.window_handles[-1])
        continue
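
The fixed `time.sleep(10)` above waits the same ten seconds whether the PDF arrives in one second or has not finished yet. As one possible alternative, the sketch below polls the download folder until no partial files remain. It assumes Chrome's convention of giving in-progress downloads a `.crdownload` suffix; the function name `wait_for_downloads` and its parameters are hypothetical and not taken from the original answer.

```python
import os
import time


def wait_for_downloads(directory, timeout=30.0, poll_interval=0.5):
    """Return True once `directory` holds no '.crdownload' files, False on timeout.

    Assumption: the browser marks unfinished downloads with a '.crdownload'
    suffix, as Chrome does. Hypothetical helper, not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        partial = [name for name in os.listdir(directory)
                   if name.endswith(".crdownload")]
        if not partial:
            return True   # nothing is still being written
        if time.monotonic() >= deadline:
            return False  # gave up while a partial file was still present
        time.sleep(poll_interval)
```

It could be called with the same folder that is set as `download.default_directory`. Because a download that has not started yet leaves no partial file, a short pause before the call may still be needed.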

Hope this logic helps.

08-19 11:28