Problem description
I am working on a project that needs a bit of automation and web scraping, for which I am using Selenium and BeautifulSoup (Python 2.7).
I want to open only one instance of a web browser, log in to a website, and keep that session. I am then trying to open new tabs that are controlled independently by threads, with each thread controlling one tab and performing its own task. How should I do it? Example code would be nice. Here's my code:
import threading

from selenium import webdriver


def threadFunc(driver, tabId):
    if tabId == 1:
        # open a new tab and do something in it
        pass
    elif tabId == 2:
        # open another new tab with some different link and perform some task
        pass
    # .... other cases


class tabThreads(threading.Thread):
    def __init__(self, driver, tabId):
        threading.Thread.__init__(self)
        self.tabID = tabId
        self.driver = driver

    def run(self):
        print("Executing tab %s" % self.tabID)
        threadFunc(self.driver, self.tabID)


def func():
    # Created a main window
    driver = webdriver.Firefox()
    driver.get("...someLink...")

    # This is the part where I am stuck: whether to create threads and send
    # them the same web-driver to stick with the current session by using the
    # javascript call "window.open('')", or to use a separate driver for each
    # tab to operate on individual pages, but that will open a new browser
    # instance every time a driver is created
    thread1 = tabThreads(driver, 1)
    thread2 = tabThreads(driver, 2)
    # ...... other threads
I am open to suggestions about using other modules if needed.
Recommended answer
My understanding is that Selenium drivers are not thread-safe. In the WebDriver spec, the Thread Safety section is empty, which I take to mean the topic has not been addressed at all. https://www.w3.org/TR/2012/WD-webdriver-20120710/#thread-safety
So while you could share the driver reference with multiple threads and call the driver from all of them, there is no guarantee that the driver will handle overlapping calls correctly.
Instead, you must either synchronize the calls from multiple threads so that one completes before the next starts, or have just one thread make all the Selenium API calls, for example by processing commands from a queue that is filled by the other threads.
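The single-thread option can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested Selenium setup: DriverWorker, FakeDriver, visit and tab_task are hypothetical names, and FakeDriver stands in for a real webdriver so the sketch runs without a browser. It is written for Python 3; on Python 2.7 the queue module is named Queue.

```python
import queue  # named `Queue` on Python 2.7
import threading


class DriverWorker(threading.Thread):
    """Hypothetical helper: the only thread that ever touches the driver."""

    def __init__(self, driver):
        threading.Thread.__init__(self)
        self.daemon = True
        self.driver = driver
        self.tasks = queue.Queue()

    def submit(self, func, *args):
        """Queue func(driver, *args) from any thread; returns a one-slot result queue."""
        result = queue.Queue(maxsize=1)
        self.tasks.put((func, args, result))
        return result

    def stop(self):
        self.tasks.put(None)  # sentinel: tells run() to exit

    def run(self):
        while True:
            item = self.tasks.get()
            if item is None:
                break
            func, args, result = item
            try:
                result.put((True, func(self.driver, *args)))
            except Exception as exc:  # hand the error back to the caller
                result.put((False, exc))


class FakeDriver(object):
    """Stand-in for a real WebDriver (assumption, for illustration only)."""

    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(url)


def visit(driver, url):
    # Runs on the worker thread, so driver calls never overlap.
    driver.get(url)
    return url


def tab_task(worker, tab_id, out):
    # Runs on its own thread; blocks until the worker has executed the command.
    ok, value = worker.submit(visit, "https://example.com/page-%d" % tab_id).get()
    out[tab_id] = value if ok else None


worker = DriverWorker(FakeDriver())
worker.start()
results = {}
threads = [threading.Thread(target=tab_task, args=(worker, i, results))
           for i in (1, 2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
worker.stop()
worker.join()
```

With a real driver, each queued function would presumably call driver.switch_to.window(handle) before acting on its tab, since a driver session has one active window at a time. The tabs are therefore driven in turn rather than truly in parallel; the threads only gain concurrency for work done outside the driver, such as parsing with BeautifulSoup.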
Also, see "Can Selenium use multi threading in one browser?"