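I want the Tweepy Streaming API to stop pulling in tweets once I have stored X tweets in MongoDB.
I have tried IF and WHILE statements with counters inside the class definition, but I cannot get it to stop at a given count X. This one has really stumped me. I found this thread: https://groups.google.com/forum/#!topic/tweepy/5IGlu2Qiug4, but my attempts to replicate it failed. It always tells me that __init__ needs an additional argument. I believe our Tweepy authentication is set up differently, so it is not an apples-to-apples comparison.
Any thoughts?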
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import json, time, sys
import tweepy

# The four credentials are defined elsewhere; `collection` is assumed to be
# the MongoDB collection the records are stored in.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

class StdOutListener(StreamListener):
    def on_status(self, status):
        text = status.text
        created = status.created_at
        record = {'Text': text, 'Created At': created}
        print(record)  # See Tweepy documentation to learn how to access other fields
        collection.insert(record)

    def on_error(self, status):
        print('Error on status: %s' % status)

    def on_limit(self, status):
        print('Limit threshold exceeded: %s' % status)

    def on_timeout(self):
        # Tweepy calls on_timeout() without arguments
        print('Stream disconnected; continuing...')

stream = Stream(auth, StdOutListener())
stream.filter(track=['tv'])
Best answer
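You need to add a counter in the class's __init__ and then increment it in on_status. While the counter is below 20, on_status inserts a record into the collection and returns True; otherwise it returns False, which tells Tweepy to disconnect the stream. This can be done as follows: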
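class StdOutListener(StreamListener):
    def __init__(self, api=None):
        super(StdOutListener, self).__init__(api)
        self.num_tweets = 0

    def on_status(self, status):
        record = {'Text': status.text, 'Created At': status.created_at}
        print(record)  # See Tweepy documentation to learn how to access other fields
        self.num_tweets += 1
        # The counter is incremented before the check, so 19 records are
        # inserted; use <= 20 to store exactly 20.
        if self.num_tweets < 20:
            collection.insert(record)
            return True
        else:
            # Returning False tells Tweepy to disconnect the stream
            return False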
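To see the counter pattern in isolation, here is a minimal sketch that runs without Tweepy, credentials, or MongoDB. Everything in it is a hypothetical stand-in introduced for illustration: FakeStatus imitates a status object, ListCollection imitates a collection with an insert() method, and run_fake_stream imitates a stream loop that stops as soon as on_status returns False (the behavior the answer relies on). The limit is passed to __init__ with a default value, so the listener can still be constructed without extra arguments.

```python
class FakeStatus(object):
    """Hypothetical stand-in for a Tweepy status object."""
    def __init__(self, text, created_at):
        self.text = text
        self.created_at = created_at


class ListCollection(object):
    """Hypothetical stand-in for a MongoDB collection; keeps records in a list."""
    def __init__(self):
        self.records = []

    def insert(self, record):
        self.records.append(record)


class CountingListener(object):
    """Stores at most `limit` records, then signals the stream to stop."""
    def __init__(self, collection, limit=20):
        self.collection = collection
        self.limit = limit
        self.num_tweets = 0

    def on_status(self, status):
        record = {'Text': status.text, 'Created At': status.created_at}
        # Insert first, then count, so exactly `limit` records are stored.
        self.collection.insert(record)
        self.num_tweets += 1
        # False means "disconnect"; True means "keep streaming".
        return self.num_tweets < self.limit


def run_fake_stream(listener, statuses):
    """Assumed stream loop: deliver statuses until on_status returns False."""
    delivered = 0
    for status in statuses:
        delivered += 1
        if listener.on_status(status) is False:
            break
    return delivered


collection = ListCollection()
listener = CountingListener(collection, limit=20)
statuses = (FakeStatus('tweet %d' % i, i) for i in range(100))
delivered = run_fake_stream(listener, statuses)
print(len(collection.records), delivered)
```

With a real stream, the same on_status body would go into the StreamListener subclass, using the actual MongoDB collection. Newer Tweepy releases restructured the streaming classes, so check the documentation for your version before relying on the return value to stop the stream.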
Regarding "python - Tweepy Streaming - Stop collecting tweets at X amount", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/20863486/