Idempotence is a crucial consideration in distributed system design. If a consumer does not handle it explicitly, the same message may be consumed more than once when business processing fails or a delivery is retried, leading to unexpected business outcomes. To avoid this, message queue consumers should process messages idempotently, based on a unique business key.
What Is Message Idempotence
Definition
A consumer's processing is considered idempotent when consuming the same message multiple times produces the same result as consuming it once, and the repeated consumption has no adverse impact on the business system.
Scenario Examples
For example, in a bank payment system, the consumer consumes a deduction message and the system deducts CNY 1 for the order. Suppose the deduction message is consumed again for some reason, such as an unstable network. If only one deduction is actually made, so that the final business result is a deduction of CNY 1 and the user's deduction records show exactly one deduction entry for the order, the deduction operation meets requirements and the consumption process is idempotent.
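The deduction scenario above can be sketched in a few lines. This is a minimal in-memory illustration, not a real payment implementation: the class name DeductionAccount, its methods, and the use of the order ID as the deduplication key are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical account that applies each order's deduction at most once.
public class DeductionAccount {
    private long balanceCny;
    // Order IDs for which a deduction has already been recorded.
    private final Set<String> deductedOrders = new HashSet<>();

    public DeductionAccount(long initialBalanceCny) {
        this.balanceCny = initialBalanceCny;
    }

    /** Returns true if money was deducted, false if the order was already handled. */
    public synchronized boolean deduct(String orderId, long amountCny) {
        if (!deductedOrders.add(orderId)) {
            return false; // duplicate delivery: leave the balance untouched
        }
        balanceCny -= amountCny;
        return true;
    }

    public synchronized long balance() {
        return balanceCny;
    }

    public synchronized int deductionCount() {
        return deductedOrders.size();
    }
}
```

Calling deduct("order-1", 1) twice on an account that starts at CNY 100 leaves a balance of CNY 99 and a single deduction record, which is the idempotent outcome described in the scenario.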
Message Duplication Scenarios
Message Duplication Caused by Message Sending
After a producer sends a message, the server successfully receives and persists the message. If an exception such as a network interruption or a client restart occurs at this point and the server's response does not reach the client, the producer resends the message because it has not received an acknowledgment from the server. As a result, the consumer receives two messages with the same content but different message IDs.
Message Duplication Caused by Message Consumption
If a network exception occurs while a consumer is returning the acknowledgment for a message it has already processed, the server does not record the message as consumed. The next time the consumer fetches messages, it receives the already processed message again. In this case, the consumer receives two messages with the same content and the same message ID.
Handling Solutions
Based on the preceding two scenarios, message duplication can take the following two forms:
Messages with different message IDs may have the same content.
The same message may be delivered more than once, with the same message ID and content.
Therefore, the message ID is not recommended as the basis for idempotent processing. Use a unique business identifier instead. For example, in a payment scenario, the order number can serve as the basis: when a message is consumed, the business checks the order number to determine whether the message has already been processed.
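To make the difference concrete, the following sketch counts how many deliveries would be processed when deduplicating on the message ID versus the business key. The Msg class is a simplified stand-in invented for this example, not a type from any messaging client, and the IDs are made up.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class DedupKeyComparison {
    /** Hypothetical, simplified view of a delivered message. */
    public static final class Msg {
        public final String messageId;
        public final String orderId;

        public Msg(String messageId, String orderId) {
            this.messageId = messageId;
            this.orderId = orderId;
        }
    }

    /** Counts the deliveries that pass a "first time this key is seen" filter. */
    public static int processedCount(List<Msg> deliveries, Function<Msg, String> keyOf) {
        Set<String> seen = new HashSet<>();
        int processed = 0;
        for (Msg m : deliveries) {
            if (seen.add(keyOf.apply(m))) {
                processed++; // key not seen before, so business logic would run
            }
        }
        return processed;
    }
}
```

Suppose the deliveries are ("msg-1", "order-1"), then a producer resend ("msg-2", "order-1"), then a redelivery of ("msg-2", "order-1"). Keyed on messageId, the order is processed twice; keyed on orderId, it is processed once.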
Sample Code
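public static class Order {
    public String orderId;
    public String orderData;
    public Order() {} // no-argument constructor, typically needed for reflection-based deserialization
    public Order(String orderId, String orderData) {
        this.orderId = orderId; this.orderData = orderData;
    }
}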
Producers
Producer<Order> producer = client.newProducer(Schema.AVRO(Order.class))
        .topic("order-topic") // placeholder topic name
        .create();
producer.newMessage().value(new Order("orderid-12345678", "orderData")).send();
Consumers
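Consumer<Order> consumer = client.newConsumer(Schema.AVRO(Order.class))
        .topic("order-topic")          // placeholder topic name
        .subscriptionName("order-sub") // placeholder subscription name
        .subscribe();
Order order = consumer.receive().getValue();
String key = order.orderId; // unique business identifier used for deduplication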
After the unique business identifier orderId is obtained, message deduplication is performed based on it.
Common Deduplication Methods
Using a Database for Deduplication
For business-level idempotence, you can use a database to filter duplicates. For example, you can maintain a dedicated deduplication table, or rely on a unique index in an existing table.
For example, if you write to an order log table in the database based on order transfer messages, you can define a unique index on the order ID and the modification timestamp.
When a consumer consumes messages with the same content, it attempts to write each of them to the order log table. Because of the unique index, only the first write succeeds and every subsequent attempt fails. This ensures idempotence at the business level: even if a message is consumed multiple times, the final data is not affected.
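A rough JDBC sketch of the unique-index approach follows. The table name order_log, its columns, and the class OrderLogWriter are hypothetical; the sketch assumes a unique index on (order_id, modified_at) is the only constraint on the table, so any integrity violation is treated as a duplicate. How a driver reports the violation varies, so the check accepts either the standard exception subclass or an SQLSTATE in class 23.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import java.sql.Timestamp;

public class OrderLogWriter {
    // Hypothetical table with UNIQUE (order_id, modified_at).
    static final String INSERT_SQL =
        "INSERT INTO order_log (order_id, modified_at, order_data) VALUES (?, ?, ?)";

    /** Returns true if the row was written, false if the unique index rejected a duplicate. */
    public static boolean writeOnce(Connection conn, String orderId,
                                    long modifiedAtMillis, String orderData) {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, orderId);
            ps.setTimestamp(2, new Timestamp(modifiedAtMillis));
            ps.setString(3, orderData);
            ps.executeUpdate();
            return true;
        } catch (SQLException e) {
            if (isIntegrityViolation(e)) {
                return false; // already logged: safe to skip and acknowledge
            }
            throw new IllegalStateException("order log write failed", e);
        }
    }

    /** SQLSTATE class 23 covers integrity constraint violations in the SQL standard. */
    public static boolean isIntegrityViolation(SQLException e) {
        String state = e.getSQLState();
        return e instanceof SQLIntegrityConstraintViolationException
            || (state != null && state.startsWith("23"));
    }
}
```

A consumer would call writeOnce for each received message and acknowledge the message in both cases, since a false return only means the row already exists. Other failures are rethrown so that the message is not acknowledged and can be retried.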
Setting a Globally Unique Message ID or Task ID
A call chain ID can also serve this purpose. When messages are produced, a unique ID is attached to each message. After a message is consumed, the unique ID is stored as a key in a cache to indicate that the message has been processed. When the consumer receives a message, it checks the cache for this key to determine whether the message has already been processed.
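The cache check can be sketched as follows. This is an in-memory stand-in only: a real deployment would more likely use a shared cache with an atomic set-if-absent operation so that all consumer instances see the same keys. The class DedupCache, its method names, and the expiry window are assumptions for illustration. Note that this sketch records the ID before processing, so that two concurrent deliveries cannot both pass the check; if processing then fails, the ID is removed so that a redelivery can retry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory deduplication cache keyed by a unique message or task ID.
public class DedupCache {
    // Maps each unique ID to the time (in milliseconds) at which its record expires.
    private final ConcurrentMap<String, Long> expiryById = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public DedupCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Atomically records the ID; returns true only if no unexpired record existed. */
    public boolean markIfFirst(String uniqueId, long nowMillis) {
        boolean[] first = {false};
        expiryById.compute(uniqueId, (id, expiry) -> {
            if (expiry == null || expiry <= nowMillis) {
                first[0] = true;
                return nowMillis + ttlMillis; // start a new record
            }
            return expiry; // unexpired record: keep it, report duplicate
        });
        return first[0];
    }

    /** Removes the record, for example after business processing has failed. */
    public void forget(String uniqueId) {
        expiryById.remove(uniqueId);
    }
}
```

The expiry window bounds memory use, but it also means that a duplicate arriving after the window has passed is processed again, so the window should be longer than the longest expected redelivery delay.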