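I have just started using Spynner to scrape web pages, but I can't find any good tutorials. Here is a simple example: I type a word into Google, and then I want to see the results page.

But how do I get from clicking the button to actually obtaining the new page?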

import spynner

def content_ready(browser):
    if 'gbqfba' in browser.html:
        return True #id of search button

b = spynner.Browser()
b.show()
b.load("http://www.google.com", wait_callback=content_ready)
b.wk_fill('input[name=q]', 'soup')
# b.browse() # Shows the word soup in the input box
with open("test.html", "w") as hf: # writes the initial page to a file
    hf.write(b.html.encode("utf-8"))
b.wk_click("#gbqfba") # Clicks the google search button (or so I think)


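But what now? I'm not even sure I actually clicked the Google search button, although it does have id=gbqfba. I also tried b.click("#gbqfba"). How do I get the search results?

I tried simply doing: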

with open("test.html", "w") as hf: # writes the initial page to a file
    hf.write(b.html.encode("utf-8"))


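But this still writes out the initial page.

Best answer

I solved this by sending Enter to the input field and waiting two seconds. Not ideal, but it works.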



import spynner
import codecs
from PyQt4.QtCore import Qt

b = spynner.Browser()
b.show()
b.load("http://www.google.com")
b.wk_fill('input[name=q]', 'soup')
# b.browse() # Shows the word soup in the input box

b.sendKeys("input[name=q]",[Qt.Key_Enter])
b.wait(2)
# Write the results page as UTF-8; the with block closes the file afterwards
with codecs.open("out.html", "w", "utf-8") as hf:
    hf.write(b.html)
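
If the fixed two-second wait feels fragile, one option is to poll until the results page shows up, with a timeout as a safety net. The following is only a sketch under assumptions: `wait_until` is a hypothetical helper, not part of Spynner, and the marker string in the usage comment is a guess at something that appears only on the results page. Passing `b.wait` as `pause` relies on it behaving as in the snippet above, where it waits for the given number of seconds.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25, pause=time.sleep):
    """Poll predicate() until it is truthy or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    `pause` is called with `interval` between checks.
    """
    deadline = time.time() + timeout
    while True:
        if predicate():
            return True
        if time.time() >= deadline:
            return False
        pause(interval)


# Hypothetical use with the browser object `b` from the answer (not run here).
# The marker is an assumption; pick a string that only the results page has.
#
#   b.sendKeys("input[name=q]", [Qt.Key_Enter])
#   if wait_until(lambda: 'id="search"' in b.html, timeout=10, pause=b.wait):
#       ...write b.html to a file as above...
```

Using `b.wait` rather than `time.sleep` for the pause is intended to keep the browser processing events while polling, assuming it works the way the answer uses it.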

Regarding "python - Spynner: getting the HTML of the second page after submitting a form", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15476395/
