What is the Kafka Retention Period?

Written by Lovisa Johansson

Frequently Asked Apache Kafka Question: What is Apache Kafka Retention Period? This article explains what the Apache Kafka Retention Period is and how it can be adjusted.

A message sent to a Kafka cluster is appended to the end of one of the topic's partition logs. The message remains there for a configurable period of time or until a configurable size is reached, that is, until the retention configured for the topic is exceeded. The message stays in the log even after it has been consumed.

If the log retention is set to five days, a published message remains available for consumption for five days after it is published. After that, the message is discarded to free up space. Kafka's performance is effectively constant with respect to data size, so retaining a lot of data is not a problem.
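
As a rough illustration of the five-day example, the following Python sketch computes when a message becomes eligible for deletion. The function names and the sample timestamp are hypothetical and not part of Kafka. The model is deliberately simplified: a real broker deletes whole log segments rather than single messages, so a message can remain readable somewhat longer than the retention period.

```python
from datetime import datetime, timedelta, timezone


def expiry_time(published_at: datetime, retention: timedelta) -> datetime:
    """Return the earliest time at which the message may be discarded."""
    return published_at + retention


def is_retained(published_at: datetime, retention: timedelta, now: datetime) -> bool:
    """True while the message is still inside the retention window."""
    return now < expiry_time(published_at, retention)


if __name__ == "__main__":
    # Hypothetical publish time, used only for illustration.
    published = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    retention = timedelta(days=5)

    print(expiry_time(published, retention).isoformat())
    print(is_retained(published, retention, published + timedelta(days=4)))
    print(is_retained(published, retention, published + timedelta(days=6)))
```

Whether the consumer has already read the message plays no role in this calculation, which matches the behaviour described above.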

CloudKarafka allows users to configure the retention period on a per-topic basis. The time or size can be specified via the Kafka management interface for dedicated plans, or via the topics tab for the Developer Duck plan.

log.retention.hours

log.retention.hours defines how long a message is stored on a topic before old log segments are discarded to free up space.

CloudKarafka default: log.retention.hours=168
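
The default of 168 hours corresponds to seven days. The sketch below, written under simplifying assumptions, shows how time-based retention could be evaluated per log segment: a closed segment is treated as expired once its newest record is older than the retention period. The `Segment` class and `expired_segments` function are hypothetical names for illustration, not Kafka APIs.

```python
from dataclasses import dataclass

MS_PER_HOUR = 60 * 60 * 1000


@dataclass
class Segment:
    base_offset: int          # offset of the first record in the segment
    newest_timestamp_ms: int  # timestamp of the latest record in the segment


def expired_segments(segments, now_ms, retention_hours=168):
    """Return the segments whose newest record is older than the retention period."""
    retention_ms = retention_hours * MS_PER_HOUR
    return [s for s in segments if now_ms - s.newest_timestamp_ms > retention_ms]


if __name__ == "__main__":
    day = 24 * MS_PER_HOUR
    # Hypothetical segments whose newest records were written on day 1, 2 and 5.
    segments = [Segment(0, 1 * day), Segment(100, 2 * day), Segment(200, 5 * day)]

    # On day 10, the first two segments are more than seven days old.
    for seg in expired_segments(segments, now_ms=10 * day):
        print("eligible for deletion:", seg.base_offset)
```

Because the check uses the newest record in a segment, older messages in the same segment are kept until the whole segment expires.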

log.retention.bytes

log.retention.bytes is a size-based retention policy for logs, i.e. the allowed size of the log. The limit applies to each partition, not to the topic as a whole. Segments are pruned from the log as long as the remaining segments don't drop below log.retention.bytes.

CloudKarafka default: log.retention.bytes=1073741824
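
The default of 1073741824 bytes is 1 GiB. The pruning rule for size-based retention can be sketched as follows: the oldest segment is removed only while the segments that remain would still hold at least the configured number of bytes. `prune_by_size` is a hypothetical helper for illustration, not a Kafka API, and the segment sizes are made up.

```python
def prune_by_size(segment_sizes, retention_bytes=1073741824):
    """Simulate size-based retention for one partition log.

    segment_sizes lists segment sizes in bytes, oldest first.
    Returns a tuple (deleted, kept).
    """
    kept = list(segment_sizes)
    deleted = []
    if retention_bytes < 0:
        # A negative value (Kafka uses -1) means no size limit.
        return deleted, kept

    total = sum(kept)
    # Delete the oldest segment only if what remains is still >= the limit.
    while kept and total - kept[0] >= retention_bytes:
        total -= kept[0]
        deleted.append(kept.pop(0))
    return deleted, kept


if __name__ == "__main__":
    MIB = 1024 * 1024

    # Four 512 MiB segments (2 GiB in total): the two oldest are pruned.
    deleted, kept = prune_by_size([512 * MIB] * 4)
    print(len(deleted), len(kept))

    # Three 400 MiB segments (1200 MiB): nothing is pruned, because removing
    # one segment would leave less than 1 GiB.
    deleted, kept = prune_by_size([400 * MIB] * 3)
    print(len(deleted), len(kept))
```

The second case shows that a log can temporarily exceed log.retention.bytes, since only whole segments are removed.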

The shared plan Developer Duck might have other limitations.
