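My problem is that I need all the data in the grid on https://applipedia.paloaltonetworks.com for the applications that have subdomains (the NAME, CATEGORY, SUBCATEGORY, RISK and TECHNOLOGY columns). For example, in row 5, 2ch has two subdomains: 2ch-base and 2ch-posting. I only want the list of applications that have subdomains like this.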

Right now, whenever I try to add anything to:

table = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'tbody#bodyScrollingTable tr')))

I get a timeout error.

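Below is the script I have so far. It fetches all the data from the grid, but I only need the applications that have subdomains (for example 2ch, with 2ch-base and 2ch-posting). By inspecting the elements I found a pattern: every application without subdomains has (), or we could go by the () field, which is common to all applications that have subdomains. Any help with this would be greatly appreciated.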

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path=r'/Users/am/Downloads/chromedriver')
driver.maximize_window()

driver.get("https://applipedia.paloaltonetworks.com/")

wait = WebDriverWait(driver, 30)

# Wait for every row of the grid body to be present in the DOM.
table = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'tbody#bodyScrollingTable tr')))

# Print the text of each row (all applications, not only those with subdomains).
for tab in table:
    print(tab.text)

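Best answer

To get the list of all applications that have subdomains from https://applipedia.paloaltonetworks.com/, you need to induce WebDriverWait for the desired elements to become visible. You can use the following solution:


Code block: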

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
options.add_argument("--disable-gpu")
# Selenium 3 style call; Selenium 4.10+ no longer accepts executable_path and expects a Service object instead.
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\ChromeDriver\chromedriver_win32\chromedriver.exe')
driver.get('https://applipedia.paloaltonetworks.com/')
# Wait until the name links are visible, skipping rows whose ottawagroup attribute is '0' or '2'.
elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='btmTable' and @id='dataTable']//tbody[@id='bodyScrollingTable']//tr[not(@ottawagroup='0') and not(@ottawagroup='2')]/td/a")))
for element in elements:
    # innerHTML keeps the whitespace around each name, which is why the output is indented.
    print(element.get_attribute("innerHTML"))

Console output:

DevTools listening on ws://127.0.0.1:12927/devtools/browser/d4a5d576-a4b0-4a3d-959b-9d37aff36fc6

                                2ch


                                51.com


                                adobe-connect


                                adobe-connectnow


                                adobe-creative-cloud


                                aim


                                aim-express


                                ali-wangwang


                                amazon-cloud-drive


                                amazon-music


                                ameba-now


                                assembla


                                autodesk360


                                avaya-webalive


                                bacnet


                                baidu-hi


                                bebo


                                bitbucket


                                boxnet


                                buddybuddy


                                chinaren


                                cisco-spark


                                cloudapp


                                cloudforge


                                cloudinary


                                concur


                                confluence


                                convo


                                cyph


                                daum


                                dcinside


                                diameter


                                dnp3


                                dochub


                                docstoc


                                docusign


                                draw.io


                                dropbox


                                egnyte


                                evernote


                                facebook


                                fetion


                                filestack


                                flickr


                                flixwagon


                                fuze-meeting


                                gatherplace


                                genesys


                                git


                                github


                                gitlab


                                glassdoor


                                globalmeet


                                gmail


                                google-calendar


                                google-cloud-storage


                                google-docs


                                google-hangouts


                                google-plus


                                google-spaces


                                google-talk


                                google-translate


                                google-video


                                gotomypc


                                gotowebinar


                                gtp


                                hadoop


                                hightail


                                hipchat


                                hootsuite


                                huddle


                                hulu


                                hyves


                                iccp


                                icloud


                                iec-60870-5-104


                                imeet


                                imgur


                                instagram


                                instan-t


                                ip-messenger


                                ipsec


                                irc


                                issuu


                                itunes


                                jira


                                join-me


                                jumpshare


                                kaixin


                                kaixin001


                                kakaotalk


                                laiwang


                                landesk


                                linkedin


                                live-mesh


                                lotus-notes


                                lotuslive


                                lucidpress


                                mail.ru


                                mail.ru-agent


                                maytech


                                meebo


                                meetup


                                mega


                                mendeley


                                mercurial


                                mixi


                                modbus


                                ms-ds-smb


                                ms-lync


                                ms-office365


                                ms-onedrive


                                msn


                                myspace


                                nateon-im


                                netease-webdisk


                                netflix


                                ning


                                noteworthy


                                now-tv


                                odnoklassniki


                                onehub


                                owncloud


                                paltalk


                                pastebin


                                pcanywhere


                                pinterest


                                pivotaltracker


                                powow


                                prezi


                                proofhub


                                qik


                                qliksense-cloud


                                qq


                                quip


                                quora


                                rally-software


                                readytalk


                                reddit


                                rediffbol


                                renren


                                rtp


                                salesforce


                                sap-jam


                                screencast


                                scribd


                                second-life


                                secure-data-space


                                sendthisfile


                                service-now


                                sharefile


                                sharepoint


                                sharevault


                                showmax


                                siemens-s7


                                signiant


                                sina-uc


                                sina-weibo


                                skydrive


                                slack


                                slideshare


                                smartsheet


                                snmp


                                softros-messenger


                                solarwinds


                                soundcloud


                                sourceforge


                                spark-im


                                ss7-map


                                stocktwits


                                storify


                                subversion


                                surveymonkey


                                syncplicity


                                tableau


                                teamdrive


                                teamup-calendar


                                teamviewer


                                thwapr


                                torch-browser


                                trello


                                tumblr


                                twitter


                                uc-yun


                                viber


                                vimeo


                                vine


                                virustotal


                                vkontakte


                                vnc


                                watchdox


                                webex


                                wechat


                                weiyun


                                whatsapp


                                windows-azure


                                windows-defender-atp


                                workday


                                yahoo-im


                                yammer


                                youku


                                yousendit


                                youtube


                                yunpan360


                                yy-voice


                                zalo


                                zendesk


                                zenefits


                                zettahost
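
The question also asked for the CATEGORY, SUBCATEGORY, RISK and TECHNOLOGY columns, while the code above prints only the names. The following is a minimal offline sketch of the same row filter (drop rows whose ottawagroup is '0' or '2') applied to whole rows. It rests on several assumptions: the sample markup and its category/risk/technology values are invented placeholders, the value '1' for a parent row is a guess (the XPath above only tells us which values are excluded), and the five cells are assumed to appear in the column order listed in the question. The names GridParser and parent_apps are hypothetical helpers, not part of Selenium.

```python
from html.parser import HTMLParser

# Column order is an assumption taken from the question text.
COLUMNS = ("NAME", "CATEGORY", "SUBCATEGORY", "RISK", "TECHNOLOGY")
# Same exclusion as the XPath: rows with ottawagroup '0' or '2' are skipped.
EXCLUDED_GROUPS = {"0", "2"}

# Invented sample markup; the real page may be structured differently.
SAMPLE_HTML = """
<table id="dataTable" class="btmTable">
<tbody id="bodyScrollingTable">
<tr ottawagroup="0"><td><a>
    example-app</a></td><td>category-a</td><td>subcategory-a</td><td>1</td><td>technology-a</td></tr>
<tr ottawagroup="1"><td><a>
    2ch</a></td><td>category-b</td><td>subcategory-b</td><td>2</td><td>technology-b</td></tr>
<tr ottawagroup="2"><td><a>
    2ch-base</a></td><td>category-b</td><td>subcategory-b</td><td>2</td><td>technology-b</td></tr>
<tr ottawagroup="2"><td><a>
    2ch-posting</a></td><td>category-b</td><td>subcategory-b</td><td>2</td><td>technology-b</td></tr>
</tbody>
</table>
"""


class GridParser(HTMLParser):
    """Collects one dict per table row that passes the ottawagroup filter."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._keep_row = False
        self._cells = None      # cell texts of the row currently being read
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            group = dict(attrs).get("ottawagroup")
            # Rows without the attribute are kept, as with the XPath not() tests.
            self._keep_row = group not in EXCLUDED_GROUPS
            self._cells = []
        elif tag == "td" and self._cells is not None:
            self._in_cell = True
            self._cells.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if self._keep_row:
                # strip() removes the padding seen in the console output above.
                cleaned = [cell.strip() for cell in self._cells]
                self.records.append(dict(zip(COLUMNS, cleaned)))
            self._cells = None


def parent_apps(html):
    """Return the filtered rows of an HTML string as a list of dicts."""
    parser = GridParser()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    for record in parent_apps(SAMPLE_HTML):
        print(record)
```

With a live session, the same helper could in principle be fed driver.page_source once the wait has succeeded, provided the real rows match the assumed layout.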
