Problem description
I have created an Azure Function app with an Azure Storage Queue trigger that processes a queue in which each queue item is a URL. The Function just downloads the content of the URL. I have another function that loads and parses a site's XML Sitemap and adds all the page URLs to the queue. The problem I have is that the Functions app runs too quickly and it hammers the website so it starts returning Server Errors. Is there a way to limit/throttle the speed at which the Functions app runs?
I could, of course, write a simple web job that processes them serially (or asynchronously, with a cap on the number of concurrent requests), but I really like the simplicity of Azure Functions and wanted to try out "serverless" computing.
Recommended answer
There are a few options you can consider.
First, there are some knobs that you can configure in host.json that control queue processing (documented in the host.json reference). The queues.batchSize knob is how many queue messages are fetched at a time. If set to 1, the runtime fetches 1 message at a time, and only fetches the next one when processing of that message is complete. This could give you some level of serialization on a single instance.
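As a rough illustration of this first option, a host.json along these lines would set the batch size to 1. The layout assumes the version 1.x schema, where the queues section sits at the top level as the queues.batchSize name suggests; in the 2.x and later schema the same section is nested under extensions. The newBatchThreshold entry is an addition not mentioned in the answer: it is the number of in-flight messages at which the runtime fetches another batch, and setting it to 0 here is meant to keep one instance at a single message at a time.

```json
{
  "queues": {
    "batchSize": 1,
    "newBatchThreshold": 0
  }
}
```

This only constrains each instance; if the app scales out to several instances, messages may still be processed in parallel across them.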
Another option might be to set the NextVisibleTime on the messages you enqueue so that they are spaced out. By default, messages that are enqueued become visible and ready for processing immediately.
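The spacing idea can be sketched as follows. This is a minimal illustration, not the answer's own code: spaced_delays, enqueue_spaced, and the send callback are hypothetical names, and the 2-second default interval is an arbitrary assumption. With the azure-storage-queue Python SDK, the callback would presumably wrap QueueClient.send_message and pass the delay as its visibility_timeout argument.

```python
from typing import Callable, Iterable, List, Tuple


def spaced_delays(count: int, interval_seconds: int) -> List[int]:
    """Return the initial visibility delay, in seconds, for each of `count` messages."""
    return [i * interval_seconds for i in range(count)]


def enqueue_spaced(
    urls: Iterable[str],
    send: Callable[[str, int], None],
    interval_seconds: int = 2,
) -> List[Tuple[str, int]]:
    """Enqueue each URL with a growing delay so the messages become visible one by one.

    `send(url, delay_seconds)` is a hypothetical callback that performs the real
    enqueue; it is injected so this sketch does not depend on any SDK.
    """
    url_list = list(urls)
    schedule = list(zip(url_list, spaced_delays(len(url_list), interval_seconds)))
    for url, delay in schedule:
        send(url, delay)
    return schedule
```

The first message is visible immediately and each later one becomes visible interval_seconds after the previous, so the trigger sees a steady trickle rather than the whole sitemap at once.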
A final option might be to enqueue a single message containing the collection of all URLs for a site, rather than one message per URL. When that message is processed, you can work through the URLs serially in your function and limit the parallelism that way.
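A sketch of this last option, under stated assumptions: the message body is taken to be a JSON array of URL strings, and process_batch, fetch, and the 1-second pause are hypothetical choices rather than anything from the answer. The pause between downloads is an extra throttle on top of the serial loop. Storage queue messages are limited to 64 KB, so a very large sitemap would likely need to be split across several messages.

```python
import json
import time
import urllib.request
from typing import Callable, List


def fetch(url: str) -> bytes:
    """Download one URL and return its body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def process_batch(
    message_body: str,
    download: Callable[[str], object] = fetch,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Download every URL in the message one after another; return the URLs that failed."""
    urls = json.loads(message_body)  # assumed format: a JSON array of URL strings
    failed: List[str] = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(pause_seconds)  # pause between requests to go easy on the site
        try:
            download(url)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append(url)
    return failed
```

In a queue-triggered function, the handler would pass the message text to process_batch and could re-enqueue or log whatever comes back in the failed list.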