Problem description
At the moment I have a script that queries some subreddits every 30 seconds and returns the newest submission:
import time

previous = None  # nothing seen yet
while True:
    for post in reddit.subreddit(query_list).new(limit=1):
        if previous != post:
            # Do something
            previous = post
    time.sleep(30)
The problem with this is that if two or more posts arrive in that time frame, it will skip all but the newest. I know I can set a smaller wait time, or I can get more than one post at a time and sort through the results, but that doesn't really fix the problem; it just makes it less likely.
What I would much rather do is 'subscribe' to a feed by having a continuously open connection that receives posts as they are posted. Does this exist? And if not, is there another solution I haven't thought of?
(I realise what I'm talking about would put a large strain on the Reddit API servers, so it probably doesn't exist, but I thought it was worth asking just in case.)
Recommended answer
Yes, this exists in PRAW, and it's called a stream. Your entire code block can be replaced with the following:
for post in reddit.subreddit(query_list).stream.submissions():
    # Do something
You can stream subreddit comments by replacing submissions with comments.
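As a rough sketch of how this might be wired into a script: the helper name `watch` and its `handle` callback are hypothetical, and `reddit` is assumed to be an already authenticated `praw.Reddit` instance. `skip_existing=True` is a PRAW stream option that drops the items returned by the first request, so only things posted after the stream starts are handled.

```python
def watch(reddit, query_list, handle, kind="submissions"):
    """Call handle(item) for each new submission or comment in query_list.

    Assumptions: reddit is an authenticated praw.Reddit instance and
    query_list is a subreddit name or a "+"-joined list of names.
    """
    if kind not in ("submissions", "comments"):
        raise ValueError(f"unsupported stream kind: {kind!r}")
    # Pick .stream.submissions or .stream.comments by name.
    stream = getattr(reddit.subreddit(query_list).stream, kind)
    # skip_existing=True skips the backlog from the first request,
    # so the handler only sees items posted after the stream starts.
    for item in stream(skip_existing=True):
        handle(item)
```

Calling `watch(reddit, query_list, print)` would then print each new submission as it arrives, and `kind="comments"` switches to comments. With a real PRAW stream the loop does not end on its own, so it would normally run as the main loop of the script.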
Other models can be streamed as well, such as Multireddit and Redditor.
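Note that a stream is not a push connection. As far as I know, PRAW implements it by polling the listing repeatedly and yielding only the items it has not seen before. The following is a simplified sketch of that idea, not PRAW's actual code: `poll_new`, `fetch_newest`, `max_seen`, and `max_polls` are all hypothetical names, and `fetch_newest` is assumed to return the newest items (newest first), each with an `id` attribute.

```python
import time
from collections import OrderedDict


def poll_new(fetch_newest, interval=30, max_seen=300, max_polls=None,
             sleep=time.sleep):
    """Yield each item from fetch_newest() once, oldest first.

    fetch_newest: hypothetical callable returning the newest items,
                  newest first, each exposing an .id attribute.
    max_seen:     how many ids to remember; should exceed the batch size.
    max_polls:    stop after this many polls (None means run forever).
    """
    seen = OrderedDict()  # remembers ids in insertion order
    polls = 0
    while max_polls is None or polls < max_polls:
        # Reverse the batch so items are yielded in posting order.
        for item in reversed(list(fetch_newest())):
            if item.id in seen:
                continue  # already handled in an earlier poll
            seen[item.id] = None
            if len(seen) > max_seen:
                seen.popitem(last=False)  # forget the oldest id
            yield item
        polls += 1
        sleep(interval)
```

With PRAW, `fetch_newest` could be something like `lambda: reddit.subreddit(query_list).new(limit=100)`. This still misses posts if more than one batch's worth arrive between polls, which is why a shorter interval or the built-in stream is usually preferable.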