Avro messages for Google Cloud Pub-Sub?

Question

What is the best data format for publishing to and consuming from Pub/Sub? I am looking at Avro because it is a binary format. The use case is real-time microservice applications publishing Avro messages to Pub/Sub. Given that Avro is best suited to batching up messages (with the schema attached to the binary data) and then publishing them, would it still be a suitable format for this microservice use case?

Recommended answer

There isn't one correct answer for the best message format across all use cases. Avro is certainly a popular choice. Protocol Buffers would be another possibility, as would Thrift. To Pub/Sub, the data is all just bytes, and it is up to the publisher and the subscriber to agree on how to interpret it. People have run comparisons of the different data formats, so you may want to make the decision based on your needs in terms of performance and message size.
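To make the "just bytes" point concrete, here is a minimal sketch using only the Java standard library. The `OrderEvent` type and its byte layout are hypothetical and exist only for illustration; in practice an Avro, Protocol Buffers, or Thrift serializer would produce the byte array, and that array would become the message data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical event type; not part of any Pub/Sub API.
final class OrderEvent {
    final String orderId;
    final int quantity;

    OrderEvent(String orderId, int quantity) {
        this.orderId = orderId;
        this.quantity = quantity;
    }

    // Publisher side. Assumed layout (an illustration, not a standard):
    // [4-byte id length][UTF-8 id bytes][4-byte quantity]
    byte[] toBytes() {
        byte[] id = orderId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + id.length + 4)
                .putInt(id.length)
                .put(id)
                .putInt(quantity)
                .array();
    }

    // Subscriber side. It must know the same layout, because the bytes
    // carry no description of their own structure.
    static OrderEvent fromBytes(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte[] id = new byte[buf.getInt()];
        buf.get(id);
        int quantity = buf.getInt();
        return new OrderEvent(new String(id, StandardCharsets.UTF_8), quantity);
    }

    public static void main(String[] args) {
        byte[] payload = new OrderEvent("order-17", 3).toBytes();
        OrderEvent decoded = OrderEvent.fromBytes(payload);
        System.out.println(decoded.orderId + " x" + decoded.quantity
                + " (" + payload.length + " bytes)");
    }
}
```

Whichever format is chosen, the agreement between publisher and subscriber on the layout (or schema) is what gives the bytes their meaning.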

Pub/Sub itself uses Protocol Buffers for defining its data types. With regard to batching, the Cloud Pub/Sub client libraries batch publishes themselves, so you don't necessarily have to worry about that on your own. You can tune the batch settings to trade throughput against latency for your use case by calling, for example, `setBatchingSettings` on the `Publisher.Builder` for Java (other languages have an equivalent as well). You may decide to do your own batching if you want to associate some metadata with a set of messages instead of with each individual message, or if you have very specific needs in terms of how messages are batched together. Otherwise, relying on the client library to do the batching is probably the correct decision.
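As a rough sketch of the "do your own batching" case, the class below packs several serialized records plus one shared piece of metadata into a single byte array, which could then be published as the data of one message. The `BatchEnvelope` name, the `schemaId` field, and the length-prefixed layout are all assumptions made for illustration; the code uses only the Java standard library and does not call the Pub/Sub client.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical envelope: metadata stored once, followed by many records.
final class BatchEnvelope {
    final String schemaId;      // shared metadata, written once per batch
    final List<byte[]> records; // already-serialized records (e.g. Avro-encoded)

    BatchEnvelope(String schemaId, List<byte[]> records) {
        this.schemaId = schemaId;
        this.records = records;
    }

    // Assumed layout: [schemaId][record count] then [length][bytes] per record.
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(schemaId);
            out.writeInt(records.size());
            for (byte[] record : records) {
                out.writeInt(record.length);
                out.write(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return bytes.toByteArray();
    }

    static BatchEnvelope fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String schemaId = in.readUTF();
            int count = in.readInt();
            List<byte[]> records = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                byte[] record = new byte[in.readInt()];
                in.readFully(record);
                records.add(record);
            }
            return new BatchEnvelope(schemaId, records);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> records = new ArrayList<>();
        records.add("first".getBytes(StandardCharsets.UTF_8));
        records.add("second".getBytes(StandardCharsets.UTF_8));
        byte[] payload = new BatchEnvelope("orders-v1", records).toBytes();
        BatchEnvelope decoded = BatchEnvelope.fromBytes(payload);
        System.out.println(decoded.schemaId + ": " + decoded.records.size() + " records");
    }
}
```

The trade-off of this approach is that the subscriber receives and acknowledges the whole envelope as one message, so it only makes sense when the records genuinely belong together.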

08-23 19:53