You can migrate data between different CKafka instances in the following steps. For more information, please see Migrating Data to CKafka.
Create a CKafka instance (for more information, please see Creating Instance).
Create a topic with the same name as that in the legacy CKafka instance (for more information, please see Creating Topic).
Switch the producer to the new CKafka instance so that the legacy instance will not receive new message data.
Change bootstrap.servers of the consumer to the new instance's address for consumption. Because offsets in the new instance start from zero, perform the steps below that match your business needs when switching the consumer group:
Note:
This tool can get the offset based on the file modification time. Its accuracy is relatively low, but it is compatible with production on legacy versions (Kafka v0.9 and below).
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xxx --topic xxx --time xxx
Here, time is the Unix time in milliseconds; -1 requests the latest offset, and -2 requests the earliest offset.
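As a minimal sketch of producing the --time value, the following Python snippet converts a timestamp to Unix time in milliseconds. The function name to_unix_millis and the choice to treat timestamps without a time zone as UTC are assumptions made for illustration; they are not part of the Kafka tooling.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(timestamp: str) -> int:
    """Convert an ISO 8601 timestamp to Unix time in milliseconds."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without a time zone are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    # Illustrative switch time; pass the printed value as --time.
    print(to_unix_millis("2019-03-05T00:00:00.000"))
```

If the switch time was recorded in local time, adjust the time zone handling before passing the result to the tool.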
./kafka-consumer-groups.sh --new-consumer --bootstrap-server xxx --reset-offsets --to-datetime xxx --group xxx --topic xxx --execute
Here, the --to-datetime value uses a format such as '2019-03-05T00:00:00.000'.
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xxx --topic xxx --time xxx
Here, time is the Unix time in milliseconds; -1 requests the latest offset, and -2 requests the earliest offset.
./kafka-consumer-groups.sh --new-consumer --bootstrap-server xxx --reset-offsets --from-file xxx --group xxx --execute
Here, file records the topic, partition, and offset as rows in CSV format.
Offset managed by broker in consumer group:
Set auto.offset.reset to latest during consumer migration (this is the default value, so you do not need to change anything if it was not set on the legacy version).
Offset retained by yourself:
Because the offsets of the corresponding partitions change in the new instance, you need to get the new offset information. Use the offset tool described above to pull the offset based on the approximate time of the producer switch, update the retained offset, and restart the consumer.
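The lookup-and-update step above can be sketched in Python. The sketch assumes that GetOffsetShell prints one topic:partition:offset line per partition; check the output of your Kafka version before relying on it. The names parse_offsets and to_csv_rows and the sample output are hypothetical and used only for illustration.

```python
from typing import Dict, List, Tuple

OffsetMap = Dict[Tuple[str, int], int]


def parse_offsets(output: str) -> OffsetMap:
    """Parse 'topic:partition:offset' lines into {(topic, partition): offset}."""
    offsets: OffsetMap = {}
    for line in output.splitlines():
        parts = line.strip().rsplit(":", 2)
        if len(parts) != 3 or not parts[2]:
            # Skip blank lines and partitions for which no offset was returned.
            continue
        topic, partition, offset = parts
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def to_csv_rows(offsets: OffsetMap) -> List[str]:
    """Format offsets as 'topic,partition,offset' rows, sorted for stable output."""
    return [f"{t},{p},{o}" for (t, p), o in sorted(offsets.items())]


if __name__ == "__main__":
    # Made-up tool output for a topic named "orders" with three partitions.
    sample = "orders:0:42\norders:1:57\norders:2:\n"
    new_offsets = parse_offsets(sample)
    print(new_offsets)               # use these to update self-retained offsets
    print("\n".join(to_csv_rows(new_offsets)))
```

The parsed mapping can replace the offsets you retain yourself; the CSV rows follow the topic, partition, offset layout described for the reset file.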
Offset managed by broker in consumer group:
Set auto.offset.reset to earliest and start the consumer.
Offset retained by yourself:
Note:
- You can also perform doublewrite directly through the producer (i.e., write to the new and legacy instances at the same time); however, with this approach the same message may be consumed more than once after the switch.
- A corresponding discount will be provided based on your estimated business scale.
- We recommend using the Python topic copy script to migrate the instance to the cluster on the new version (the decompressed script package includes a README document).
- Download the aforementioned open-source script here.
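The doublewrite option mentioned in the note above can be sketched as a thin wrapper around two send functions. This is only an illustration under stated assumptions: DoubleWriter and the send callables are hypothetical, and in practice each callable would wrap a real producer client connected to the legacy or the new instance. The sketch does no retrying or error handling.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: a function that sends one message to one instance.
SendFn = Callable[[str, bytes], None]


class DoubleWriter:
    """Send every message to both the legacy and the new instance."""

    def __init__(self, send_legacy: SendFn, send_new: SendFn) -> None:
        self._senders = [send_legacy, send_new]

    def send(self, topic: str, value: bytes) -> None:
        # Write to the legacy instance first, then to the new instance.
        for sender in self._senders:
            sender(topic, value)


if __name__ == "__main__":
    # In-memory stand-ins for the two instances, for demonstration only.
    legacy_log: List[Tuple[str, bytes]] = []
    new_log: List[Tuple[str, bytes]] = []
    writer = DoubleWriter(
        lambda t, v: legacy_log.append((t, v)),
        lambda t, v: new_log.append((t, v)),
    )
    writer.send("orders", b"hello")
    print(legacy_log == new_log)
```

Because both instances hold the same messages during the overlap, consumers should tolerate duplicates, as the note points out.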