This article shows how to write a Twisted client that sends looping GET requests to multiple API endpoints and logs the responses.

Problem Description

I haven't done Twisted programming in a while, so I'm trying to get back into it for a new project. I'm attempting to set up a Twisted client that can take a list of servers as an argument; for each server, it sends an API GET call and writes the return message to a file. This API GET call should be repeated every 60 seconds.

I've done it successfully with a single server using Twisted's Agent class:

from StringIO import StringIO

from twisted.internet import reactor
from twisted.internet.protocol import Protocol
from twisted.web.client import Agent
from twisted.web.http_headers import Headers
from twisted.internet.defer import Deferred

import datetime
from datetime import timedelta
import time
import optparse  # needed by parse_args() below

count = 1
filename = "test.csv"

class server_response(Protocol):
    def __init__(self, finished):
        print "init server response"
        self.finished = finished
        self.remaining = 1024 * 10

    def dataReceived(self, bytes):
        if self.remaining:
            display = bytes[:self.remaining]
            print 'Some data received:'
            print display
            with open(filename, "a") as myfile:
                myfile.write(display)

            self.remaining -= len(display)


    def connectionLost(self, reason):
        print 'Finished receiving body:', reason.getErrorMessage()

        self.finished.callback(None)

def capture_response(response):
    print "Capturing response"
    finished = Deferred()
    response.deliverBody(server_response(finished))
    print "Done capturing:", finished

    return finished

def responseFail(err):
    print "error" + err
    reactor.stop()


def cl(ignored):
    print "sending req"
    agent = Agent(reactor)
    headers = {
    'authorization': [<snipped>],
    'cache-control': [<snipped>],
    'postman-token': [<snipped>]
    }

    URL = <snipped>
    print URL

    a = agent.request(
        'GET',
        URL,
        Headers(headers),
        None)

    a.addCallback(capture_response)
    reactor.callLater(60, cl, None)  # schedule the next request in 60 seconds
    #a.addBoth(cbShutdown, count)


def cbShutdown(ignored, count):
    print "reactor stop"
    reactor.stop()

def parse_args():
    usage = """usage: %prog [options] [hostname]:port ...
    Run it like this:
      python test.py hostname1:instanceName1 hostname2:instancename2 ...
    """

    parser = optparse.OptionParser(usage)

    _, addresses = parser.parse_args()

    if not addresses:
        print parser.format_help()
        parser.exit()

    def parse_address(addr):
        if ':' not in addr:
            hostName = '127.0.0.1'
            instanceName = addr
        else:
            hostName, instanceName = addr.split(':', 1)

        return hostName, instanceName

    return map(parse_address, addresses)

if __name__ == '__main__':
    d = Deferred()
    d.addCallbacks(cl, responseFail)
    reactor.callWhenRunning(d.callback, None)

    reactor.run()

However, I'm having a tough time figuring out how to have multiple agents sending calls. As written, I'm relying on the reactor.callLater(60, cl, None) at the end of cl() to create the call loop. So how do I create multiple call agent protocols (server_response(Protocol)) and continue to loop through the GET for each of them once my reactor is started?

Answer

Look what the cat dragged in!

So how do I create multiple call agents

Use treq. You rarely want to get tangled up with the Agent class.

This API GET call should be repeated every 60 seconds

Use LoopingCall instead of callLater; in this case it's easier and you'll run into fewer problems later.

import treq
from twisted.internet import task, reactor

filename = 'test.csv'

def writeToFile(content):
    # Append the response body to the output file.
    with open(filename, 'ab') as f:
        f.write(content)

def everyMinute(*urls):
    # Send a GET to every URL; treq.content extracts each
    # response body, which is then appended to the file.
    for url in urls:
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)

#----- Main -----#
sites = [
    'https://www.google.com',
    'https://www.amazon.com',
    'https://www.facebook.com']

# Run everyMinute() immediately, then again every 60 seconds.
repeating = task.LoopingCall(everyMinute, *sites)
repeating.start(60)

reactor.run()

It starts in the everyMinute() function, which runs every 60 seconds. Within that function, each endpoint is queried, and once the contents of the response become available, the treq.content function takes the response and returns its body. Finally, the contents are written to a file.
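
If one endpoint is slow or down, an unhandled failure in its Deferred chain would otherwise go unnoticed. The sketch below is my own extension, not part of the original answer: it adds an errback per request and writes each server's responses to its own file. The logFailure helper and the URL-based filename scheme are assumptions for illustration.

import sys

import treq
from twisted.internet import task, reactor
from twisted.python import log

def writeToFile(content, filename):
    # Append this response body to the per-server file.
    with open(filename, 'ab') as f:
        f.write(content)

def logFailure(failure, url):
    # Record the failure so one bad endpoint doesn't stop the loop.
    log.msg('GET %s failed: %s' % (url, failure.getErrorMessage()))

def everyMinute(*urls):
    for url in urls:
        # Derive an output filename from the URL (assumed scheme).
        filename = url.split('//', 1)[-1].replace('/', '_') + '.csv'
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile, filename)
        d.addErrback(logFailure, url)

sites = ['https://www.google.com', 'https://www.amazon.com']

log.startLogging(sys.stdout)  # route log.msg output to stdout
task.LoopingCall(everyMinute, *sites).start(60)
reactor.run()

Because the errback sits at the end of each per-URL chain, a connection error or bad response from one site is logged without disturbing the requests to the others.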

P.S.

Are you scraping, or trying to extract something from those sites? If you are, scrapy might be a good option for you.
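
For a sense of what that looks like, here is a minimal scrapy spider, purely as an illustration of the suggestion above; the spider name, start URL, and extracted fields are my own assumptions:

import scrapy

class SiteSpider(scrapy.Spider):
    # Hypothetical spider: fetch each start URL and yield its <title>.
    name = 'sites'
    start_urls = ['https://www.google.com']

    def parse(self, response):
        yield {'url': response.url,
               'title': response.css('title::text').get()}

Run it with scrapy runspider spider.py -o output.csv and scrapy takes care of scheduling, retries, and writing the output for you.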
