Data compression can reduce network I/O traffic and disk usage. This document describes the message formats that support data compression and how to configure compression based on your needs.
Currently, Kafka supports two message formats, v1 and v2 (introduced in v0.11.0.0). CKafka supports the open-source versions 0.9 and 0.10.2, and exclusive CKafka clusters support v1.1.1.
Different configurations apply to different versions, as described below:
Performance of a compression algorithm is evaluated mainly based on two metrics: compression ratio and compression/decompression throughput.
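Both metrics can be measured directly. The following is a minimal sketch that uses the JDK's built-in `java.util.zip.GZIPOutputStream` only (Snappy and LZ4 require third-party libraries and are not shown). The class name `CompressionMetrics`, the sample payload, and the definition of compression ratio as original size divided by compressed size are assumptions made for illustration; the figures it prints depend on the data and the hardware.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressionMetrics {

    /** Compresses the input with Gzip and returns the compressed bytes. */
    public static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Assumed definition: original size / compressed size (higher is better). */
    public static double compressionRatio(byte[] original, byte[] compressed) {
        return (double) original.length / compressed.length;
    }

    /** Throughput in MB/s for processing `bytes` bytes in `elapsedNanos` nanoseconds. */
    public static double throughputMBps(long bytes, long elapsedNanos) {
        double megabytes = bytes / (1024.0 * 1024.0);
        double seconds = elapsedNanos / 1_000_000_000.0;
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        // Hypothetical, repetitive JSON-like payload standing in for a message batch.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("{\"id\":").append(i).append(",\"status\":\"ok\"}\n");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] compressed = gzip(original);
        long elapsed = System.nanoTime() - start;

        System.out.printf("compression ratio: %.2f%n",
                compressionRatio(original, compressed));
        System.out.printf("compression throughput: %.2f MB/s%n",
                throughputMBps(original.length, elapsed));
    }
}
```

A single timed run like this is only indicative; a meaningful comparison would repeat the measurement after JVM warm-up and use payloads that resemble real messages.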
Kafka versions earlier than 2.1.0 support three compression algorithms: Gzip, Snappy, and LZ4.
In actual use of Kafka, the performance metrics of the three algorithms compare as follows:
Their physical resource usage compares as follows:
Since bandwidth resources are more valuable than CPU and disk resources (a 1-Gigabit network is the standard configuration), the three compression algorithms rank as LZ4 > Gzip > Snappy.
A producer can configure data compression as follows:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Once the producer is started, every message set it produces is compressed, which can greatly reduce network bandwidth usage and disk usage on the Kafka broker.
// Note that the configuration differs by version. Currently, versions 0.9 and below do not support compression, and versions 0.10 and above do not support Gzip compression.
props.put("compression.type", "lz4");
Producer<String, String> producer = new KafkaProducer<>(props);
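Because `compression.type` is passed as a plain string, stray whitespace or an unsupported codec name is easy to introduce. The following is a minimal sketch of a helper that normalizes and checks the value before it is placed in the producer properties. The class `CompressionConfig`, its method names, and the allowed set (`none`, `gzip`, `snappy`, `lz4`, that is, the codecs of Kafka versions earlier than 2.1.0) are illustrative assumptions, not part of the Kafka or CKafka API, and the helper does not check which codecs a particular CKafka version accepts.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Properties;
import java.util.Set;

public class CompressionConfig {

    // Assumed allowed values: "none" plus the three codecs named in this document.
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("none", "gzip", "snappy", "lz4"));

    /** Trims and lower-cases the codec name, rejecting anything outside SUPPORTED. */
    public static String normalize(String codec) {
        String value = codec.trim().toLowerCase(Locale.ROOT);
        if (!SUPPORTED.contains(value)) {
            throw new IllegalArgumentException("Unsupported compression type: " + codec);
        }
        return value;
    }

    /** Returns a copy of `base` with "compression.type" set to the normalized codec. */
    public static Properties withCompression(Properties base, String codec) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("compression.type", normalize(codec));
        return props;
    }
}
```

With such a helper, a call like `CompressionConfig.withCompression(props, " LZ4 ")` would yield properties whose `compression.type` is `lz4`, and the result can then be passed to `new KafkaProducer<>(...)`.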
In most cases, after receiving a message from the producer, the broker will retain it as-is without making any modification.
compression.codec cannot be set. InValid.