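I have code that returns all search results from a particular website for a given keyword.
With the search term "HL4RPV-50", all values are returned as expected.
With the search term "FSJ4-50B", I get a NoSuchElementException on this line: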

    ---> 53     price = product.find_element_by_xpath(".//div[@class='price']").text.split('\n')[1]

The direct XPath is:
    //*[@id="search"]/div[3]/div[2]/div[2]/div[2]/div[6]/div[2]/div[1]/div[1]/div/div[4]/div/add-product-to-cart/div[1]

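The direct XPaths for the two part IDs differ. In addition, each part ID has a slightly different XPath depending on the part's position in the results.
My impression was that I could use a relative XPath to get around this.
The site I am trying to scrape is Tessco.com, and a generic username/password is specified in the code below.
Identifying the XPath:
To build a generic XPath, my understanding was that . selects the current node and // selects nodes that match the selection anywhere below the current node, no matter where they are.
I then specified the element type, here div, and then @class='price'.
For "HL4RPV-50" this gave me what I wanted; for "FSJ4-50B" it did not.
I believe my XPath is wrong, but I am not sure how to generalize it.
Any suggestions?
Code: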
    import time
    #Need Selenium for interacting with web elements
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    #Need numpy/pandas to interact with large datasets
    import numpy as np
    import pandas as pd

    chrome_path = r"C:\Users\James\Documents\Python Scripts\jupyterNoteBooks\ScrapingData\chromedriver_win32\chromedriver.exe"
    driver = webdriver.Chrome(chrome_path)
    driver.get("https://www.tessco.com/login")

    userName = "FirstName.SurName321123@gmail.com"
    password = "PasswordForThis123"

    #Set a wait, for elements to load into the DOM
    wait10 = WebDriverWait(driver, 10)
    wait20 = WebDriverWait(driver, 20)
    wait30 = WebDriverWait(driver, 30)

    elem = wait10.until(EC.element_to_be_clickable((By.ID, "userID")))
    elem.send_keys(userName)

    elem = wait10.until(EC.element_to_be_clickable((By.ID, "password")))
    elem.send_keys(password)

    #Press the login button
    driver.find_element_by_xpath("/html/body/account-login/div/div[1]/form/div[6]/div/button").click()

    #Expand the search bar
    searchIcon = wait10.until(EC.element_to_be_clickable((By.XPATH, "/html/body/header/div[2]/div/div/ul/li[2]/i")))
    searchIcon.click()

    searchBar = wait10.until(EC.element_to_be_clickable((By.XPATH, '/html/body/header/div[3]/input')))
    searchBar.click()

    #load in manufacture part number from a collection of components, via an Excel file

    #Enter information into the search bar
    searchBar.send_keys("FSJ4-50B" + '\n')

    # wait for the products information to be loaded
    products = wait30.until(EC.presence_of_all_elements_located((By.XPATH,"//div[@class='CoveoResult']")))
    # create a dictionary to store product and price
    productInfo = {}
    # iterate through all products in the search result and add details to dictionary
    for product in products:
        # get product name
        productName = product.find_element_by_xpath(".//a[@class='productName CoveoResultLink hidden-xs']").text
        # get price
        price = product.find_element_by_xpath(".//div[@class='price']").text.split('\n')[1]
        # add details to dictionary
        productInfo[productName] = price
    # print products information
    print(productInfo)

    #time.sleep(5)
    driver.close()

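Best answer

Here is the working code.
I disabled images because my Internet connection is slow and the site takes time to load pages.
I used a CSS selector instead of XPath for the price, and it works completely: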

import time
#Need Selenium for interacting with web elements
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
#Need numpy/pandas to interact with large datasets
import numpy as np
import pandas as pd

chrome_path = r".\web_driver\chromedriver.exe"
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chrome_path, chrome_options=chrome_options)
driver.maximize_window()
driver.get("https://www.tessco.com/login")

userName = "FirstName.SurName321123@gmail.com"
password = "PasswordForThis123"

#Set a wait, for elements to load into the DOM
wait10 = WebDriverWait(driver, 10)
wait20 = WebDriverWait(driver, 20)
wait30 = WebDriverWait(driver, 30)

elem = wait10.until(EC.element_to_be_clickable((By.ID, "userID")))
elem.send_keys(userName)

elem = wait10.until(EC.element_to_be_clickable((By.ID, "password")))
elem.send_keys(password)

#Press the login button
driver.find_element_by_xpath("/html/body/account-login/div/div[1]/form/div[6]/div/button").click()

#Expand the search bar
# searchIcon = wait10.until(EC.element_to_be_clickable((By.XPATH, "")))
# searchIcon.click()

searchBar = wait10.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#searchBar input")))

#Enter information into the search bar
searchBar.send_keys("FSJ4-50B")
driver.find_element_by_css_selector('a.inputButton').click()
time.sleep(5)

# wait for the products information to be loaded
products = driver.find_elements_by_xpath( "//div[@class='CoveoResult']")
# create a dictionary to store product and price
productInfo = {}
# iterate through all products in the search result and add details to dictionary
for product in products:
    # get product name
    # the leading "." keeps the search inside this product instead of the whole document
    productName = product.find_element_by_xpath(".//a[@class='productName CoveoResultLink hidden-xs']").text
    # get price
    price = product.find_element_by_css_selector("div.price").text.split('\n')[1]
    # add details to dictionary
    productInfo[productName] = price
# print products information
print(productInfo)
#time.sleep(5)
driver.close()

Output:
{"8' Jumper-FSJ4-50B NM/NM": '$147.55'}
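
One possible reason the CSS selector succeeds where the XPath from the question fails: `@class='price'` compares the whole class attribute as a string, while `div.price` matches any div that has `price` among its space-separated classes. Whether Tessco's markup actually differs this way between results is an assumption; the snippet below uses made-up markup and the standard library's ElementTree rather than Selenium, just to show the difference. `has_class` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up markup (an assumption): the second result has an extra class on its price div.
SNIPPET = """
<results>
  <div class="CoveoResult"><div class="price">$10.00</div></div>
  <div class="CoveoResult"><div class="price sale">$20.00</div></div>
</results>
"""


def has_class(element, name):
    """True if `name` is one of the space-separated tokens in the class attribute."""
    return name in element.get("class", "").split()


root = ET.fromstring(SNIPPET)

# Exact string comparison on the attribute, like the XPath in the question
exact = [d.text for d in root.findall(".//div[@class='price']")]

# Token comparison, which is how the CSS selector div.price matches
token = [d.text for d in root.iter("div") if has_class(d, "price")]

print(exact)  # ['$10.00']
print(token)  # ['$10.00', '$20.00']
```

If you would rather stay with XPath in Selenium, the usual XPath 1.0 idiom for a class-token match is `.//div[contains(concat(' ', normalize-space(@class), ' '), ' price ')]`.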

Edit:
How to choose a selector
(screenshot omitted)
As you can see in the screenshot above, I hovered over the search bar and found that it has an ID. An ID is meant to be unique on a web page, so we could also use:
driver.find_element_by_id("searchBar")

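But to reach the input field I prefer a CSS selector, and then send the keys.
Finding the CSS selector:
For the a.inputButton CSS selector, inspect the search button and you will see the following HTML in the DOM: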
<a class="CoveoSearchButton inputButton button"><span class="coveo-icon">Search</span><i class="fa fa-search" aria-hidden="true"></i></a>

We know that <a> is the anchor tag, and from the HTML above we can infer that one possible CSS selector is:
a.inputButton

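Note
This selector happens to be unique here, but sometimes the same class name is used several times by different elements on the same page. In that case you have to start from a higher-level node to reach the child element. For example, the <a> can also be reached as:
Another CSS selector for the search button: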
div.divCoveoSearchbox > a.inputButton

This works because div.divCoveoSearchbox is the parent element of the a.inputButton anchor tag.
I hope that makes things clear.
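
Going back to the NoSuchElementException in the question: if some search results simply have no price block (an assumption, since the page markup is not shown here), one way to keep the loop from failing is to call `find_elements`, which returns an empty list when nothing matches instead of raising. A minimal sketch follows; `safe_price` and `FakeElement` are hypothetical names, and `FakeElement` with its sample text exists only so the snippet runs without a browser.

```python
# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR
CSS = "css selector"


def safe_price(product, selector="div.price"):
    """Return the price text for one result, or None if it has no price block."""
    # find_elements gives [] when nothing matches, so no exception is raised
    matches = product.find_elements(CSS, selector)
    if not matches:
        return None
    lines = matches[0].text.split("\n")
    # The original code reads the second line; use the first if there is only one
    return lines[1] if len(lines) > 1 else lines[0]


class FakeElement:
    """Stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text="", children=None):
        self.text = text
        self._children = children or {}

    def find_elements(self, by, value):
        return self._children.get(value, [])


# Made-up sample data
with_price = FakeElement(children={"div.price": [FakeElement("Your Price\n$147.55")]})
without_price = FakeElement()

print(safe_price(with_price))     # $147.55
print(safe_price(without_price))  # None
```

Inside the loop you would then write `price = safe_price(product)` and skip or record the products for which it returns None.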
