This Apache Kafka® cheat sheet is a quick reference for developers and platform engineers who need to manage Kafka clusters. It's designed to provide a concise, actionable guide to common CLI commands and configurations, helping you get tasks done quickly and efficiently.
With Confluent Cloud, you can stream data, scale Kafka, and build real-time applications faster and more cost-effectively than ever. See for yourself—test out this guide when you get started for free with elastic autoscaling Kafka clusters on the Confluent data streaming platform.
Kafka is a powerful distributed streaming platform used for building real-time data pipelines and streaming applications. If you're new to the platform, you might want to learn Kafka with some helpful Kafka tutorials to get a better understanding. For more in-depth information on what makes it so effective, check out the “What Is Apache Kafka” page.
This cheat sheet is structured to help you quickly find the information you need, whether you're performing a quick operational task or fine-tuning your application. It’s broken down into the following key sections:
Kafka CLI Commands Reference: This is your go-to section for day-to-day cluster administration. You'll find commands for creating, deleting, and listing topics, managing consumer groups, and inspecting broker metadata. This is useful for hands-on management and automation scripting.
Kafka Concepts Quick Reference: If you need a quick refresher on core terminology, this section provides a glossary of key concepts such as partitions, offsets, replication factor, and consumer groups. It's perfect for when you're explaining Kafka to a new team member or clarifying a detail during a discussion.
Producer and Consumer Configuration Cheat Sheet: Here you'll find essential configurations for your client applications. This section is for developers who need to optimize their producers for throughput or their consumers for reliability, including settings for acks, batching, and retries.
Broker and Topic Configuration Tips: This part of the cheat sheet focuses on server-side configurations. It's designed for platform engineers who are responsible for setting up or fine-tuning the Kafka cluster itself, covering important settings like log retention and partition counts.
Kafka Monitoring and Troubleshooting Commands: When things go wrong, this section provides commands and tips to diagnose issues. Use them to check broker health, consumer group lag, and other metrics that are crucial for maintaining a healthy cluster.
To help you get the most out of this cheat sheet and truly master Kafka, we recommend that you explore the extensive resources available on Confluent Developer, which offer excellent introductory courses, language guides, and Kafka and Flink tutorials.
This section provides a quick reference for the most common command-line tools used to manage a Kafka cluster.
kafka-topics.sh
This is the primary tool for managing Kafka topics. You can use it to create, alter, list, and delete topics.
Create a topic:
kafka-topics.sh --bootstrap-server <broker_ip>:<port> --create --topic <topic_name> \
  --partitions <num_partitions> --replication-factor <replication_factor>
Example:
kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my_new_topic \
  --partitions 3 --replication-factor 1
List all topics:
kafka-topics.sh --bootstrap-server <broker_ip>:<port> --list
Example:
kafka-topics.sh --bootstrap-server localhost:9092 --list
Describe a topic:
This command provides detailed information about a topic, including its partitions, leader, and replica assignments.
kafka-topics.sh --bootstrap-server <broker_ip>:<port> --describe --topic <topic_name>
Example:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic
Delete a topic:
kafka-topics.sh --bootstrap-server <broker_ip>:<port> --delete --topic <topic_name>
Example:
kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic my_topic
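You can also alter an existing topic, most commonly to add partitions. A quick sketch (note that a topic's partition count can only ever be increased, never decreased):
kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic my_topic --partitions 6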
kafka-console-producer.sh & kafka-console-consumer.sh
These simple command-line clients let you send and receive messages from the terminal. They're invaluable for quick testing and debugging.
Produce messages to a topic:
kafka-console-producer.sh --bootstrap-server <broker_ip>:<port> --topic <topic_name>
Example:
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my_topic
After running the command, you can type a message and press Enter to send it.
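If you want to test with keyed messages, the console producer can parse a key from each input line. A sketch assuming a colon as the separator (parse.key and key.separator are console-producer properties):
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my_topic \
  --property parse.key=true --property key.separator=:
Typing user42:hello then sends a message with key user42 and value hello.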
Consume messages from a topic:
From the beginning:
kafka-console-consumer.sh --bootstrap-server <broker_ip>:<port> --topic <topic_name> \
  --from-beginning
Example:
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic \
  --from-beginning
From a specific consumer group:
kafka-console-consumer.sh --bootstrap-server <broker_ip>:<port> --topic <topic_name> \
  --group <consumer_group_name>
Example:
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic \
  --group my_consumer_group1
These commands are used for more advanced administrative tasks, such as managing configurations, security, and rebalancing partitions.
kafka-configs.sh: Manage topic and broker configurations:
kafka-configs.sh --bootstrap-server <broker_ip>:<port> --entity-type <type> --describe
Example:
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --describe
kafka-acls.sh: Manage access control lists (ACLs) to secure your cluster:
kafka-acls.sh --bootstrap-server <broker_ip>:<port> --list --topic <topic_name>
Example:
kafka-acls.sh --bootstrap-server localhost:9092 --list --topic my_secured_topic
For more about security, see our guide on role-based access control (RBAC).
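As a sketch of the corresponding --add operation (assuming your cluster has an authorizer configured; User:alice is an illustrative principal):
kafka-acls.sh --bootstrap-server localhost:9092 --add --allow-principal User:alice \
  --operation Read --topic my_secured_topic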
kafka-reassign-partitions.sh: Manually reassign partitions for cluster balancing:
kafka-reassign-partitions.sh --bootstrap-server <broker_ip>:<port> --generate --topics-to-move-json-file reassignment.json --broker-list "1,2,3"
Examples:
Creating the input JSON file (e.g., reassignment.json), which lists the topics whose partitions should be moved (this is the format --generate expects):
{
  "version": 1,
  "topics": [
    { "topic": "my_topic" }
  ]
}
Running the reassignment command with the above input file:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --generate \
  --topics-to-move-json-file reassignment.json --broker-list "1,2,3"
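Note that --generate only prints a proposed partition assignment; it doesn't move any data. To apply the plan, save the proposed assignment JSON to a file (plan.json is an illustrative name) and rerun the tool in execute mode:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --execute \
  --reassignment-json-file plan.json
You can then track progress by replacing --execute with --verify in the same command.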
For more in-depth usage or edge cases, refer to the full syntax in the Kafka CLI docs.
This section provides a brief overview of the core concepts in Kafka. Understanding these terms is essential for effective development and administration.
Brokers are the fundamental servers that make up a Kafka cluster. They’re responsible for storing topic partitions and handling requests from producers and consumers. Developers interact with brokers by connecting to them to produce or consume messages.
A topic is a category or feed name to which messages are published. Topics are divided into a number of partitions, which are the fundamental unit of parallelism in Kafka. Developers publish messages to specific topics, and their client applications consume data from those topics.
Producers are client applications that publish (write) messages to Kafka topics. Consumers are client applications that subscribe to (read) and process messages from topics. Producers and consumers are the main components that drive the data flow in a Kafka ecosystem.
Replication is the process of copying topic partitions across multiple brokers for fault tolerance. This ensures that if a broker fails, the data is still available from another broker in the cluster. Developers can specify the replication factor when creating a topic to control data durability.
In older versions of Kafka, ZooKeeper was used to manage the cluster's metadata, such as broker information and topic configurations. Modern versions of Kafka have adopted KRaft (Kafka Raft) mode, which moves this metadata management directly into Kafka itself. This simplifies the architecture and makes the cluster more scalable and resilient.
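On a KRaft-mode cluster (Kafka 3.3+), you can inspect the metadata quorum directly from the command line; a quick sketch:
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status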
For a deeper dive into how Kafka compares to other messaging systems, check out our guide on Kafka vs other messaging systems.
Confluent streamlines this architecture, making it easier to leverage and scale with managed services and advanced tooling.
Configuring your Kafka producers and consumers correctly is essential for achieving desired performance, latency, and data integrity. This section highlights the key settings you'll need to tune for your specific use case. For more in-depth guidance about these settings, refer to our blog posts about Kafka performance tuning.
| Config | Default | Recommended | Use Case |
| --- | --- | --- | --- |
| acks | 1 (all since Kafka 3.0) | all or 0 | Controls the durability of messages. Use all for high durability (the message is written to all in-sync replicas) and 0 for low latency (no waiting for acks). |
| compression.type | none | snappy or lz4 | Determines the compression algorithm used for batching records. Use it to reduce network bandwidth and storage, especially for large messages. |
| batch.size | 16384 | Varies | The maximum size in bytes of a batch of messages to send. Adjust this to balance latency and throughput. |
| linger.ms | 0 | 5 or 10 | The time the producer will wait before sending a batch of messages. Use this to trade a small amount of latency for higher throughput. |
Misconfiguration example: Setting acks to 0 might result in high throughput, but you risk losing messages if the leader broker fails immediately after receiving a message.
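You can experiment with these trade-offs before touching application code by passing producer settings to the console producer with --producer-property. A sketch (the values shown are illustrative, not universal recommendations):
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my_topic \
  --producer-property acks=all \
  --producer-property linger.ms=5 \
  --producer-property compression.type=lz4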
| Config | Default | Recommended | Use Case |
| --- | --- | --- | --- |
| auto.offset.reset | latest | earliest or latest | Dictates what to do when there is no initial offset in Kafka or the current offset is no longer valid. Set to earliest to process all data from the beginning of the topic. |
| group.id | "" | Required | A string that uniquely identifies the consumer group. This is crucial for enabling consumer group functionality. |
| enable.auto.commit | true | false | Controls whether offsets are automatically committed. Set to false for manual control over offset commits, which is safer for many applications. |
Misconfiguration example: If a consumer is configured with enable.auto.commit=true and a processing error occurs before the auto-commit interval, it's possible to lose messages or process them twice after a rebalance.
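Similarly, you can try out consumer settings from the command line with --consumer-property; a minimal sketch (the group name is a placeholder):
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_topic \
  --group my_consumer_group \
  --consumer-property auto.offset.reset=earliest \
  --consumer-property enable.auto.commit=false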
Understanding the interplay between acks, batch.size, and linger.ms is vital for optimizing Kafka producer performance. These settings directly shape the balance between throughput, latency, and durability in your Kafka cluster.
This Kafka broker tuning cheat sheet provides an overview of configurations that affect the scalability and resilience of your entire cluster. Knowing when and why to adjust these settings is key for platform engineers. To understand how these configurations fit into the broader system, consider reviewing the core principles of Kafka architecture.
num.partitions:
Default: 1. This is the default number of partitions for new topics.
Increasing this value can increase parallelism and throughput for a topic, but a topic's partition count can never be decreased once set.
log.retention.hours:
Default: 168 (7 days). This is the default time in hours to retain log segments before they’re deleted.
Setting this lower saves disk space, but setting it higher is important for long-term data retention or reprocessing.
log.segment.bytes:
Default: 1073741824 (1 GB). This is the maximum size of a single log segment file.
Larger segments can reduce the number of files and open file handles, but smaller segments can make log cleanup faster.
retention.ms:
Default: null (inherits from broker). This is the time in milliseconds to retain messages for a specific topic.
This is a critical setting for managing data life cycle and storage costs.
cleanup.policy:
Default: delete. This policy determines what happens to log segments when retention time or size is reached. Choosing delete removes old segments, while compact retains the latest message for each key.
Choosing the right policy is vital for different use cases, such as event sourcing versus database change logs.
min.insync.replicas:
Default: 1. This is the minimum number of replicas that must acknowledge a write for it to be considered successful when acks=all.
Increasing this value improves durability and availability, but it can also increase latency and reduce throughput if a broker is slow.
When configuring your Kafka cluster, the min.insync.replicas setting is a critical factor in ensuring data durability and availability. This configuration directly influences the trade-offs between resiliency and potential latency in your Kafka deployment.
To optimize the performance and scalability of your Kafka cluster, the num.partitions setting is a key consideration. This configuration directly impacts the degree of parallelism and throughput your topics can achieve.
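Most of these settings can be overridden per topic at runtime with kafka-configs.sh. A sketch that sets a one-day retention (86400000 ms) and a stricter in-sync minimum on a single topic (values are illustrative):
kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics \
  --entity-name my_topic --add-config retention.ms=86400000,min.insync.replicas=2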
Understanding how to monitor your Kafka cluster and diagnose problems is essential for maintaining a healthy and performant system. The following commands and tips will help you quickly check the status of your cluster and identify common issues. For comprehensive guidance, refer to the dedicated documentation on Kafka monitoring.
Consumer lag is a key metric that tells you how far behind a consumer group is from the latest message in a topic.
To list all consumer groups (the first step in checking lag), use this command:
kafka-consumer-groups.sh --bootstrap-server <broker_ip>:<port> --list
To get a detailed description of a specific consumer group, including the lag for each partition:
kafka-consumer-groups.sh --bootstrap-server <broker_ip>:<port> --describe --group <group_name>
Here's a sample output structure and what the key fields represent:
GROUP   TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST   CLIENT-ID
mycg1   my_topic  0          1234            1240            6    consumer1    host1  client1
Explanation of the Fields
GROUP: The name of the consumer group being described (mycg1).
TOPIC: The topic the consumer group is consuming from (my_topic).
PARTITION: The specific partition of the topic (0). A topic is divided into partitions, and consumer groups consume from one or more partitions.
CURRENT-OFFSET: The last offset successfully committed by the consumer group for that partition.
LOG-END-OFFSET: The latest offset available in the partition's log.
LAG: The difference between LOG-END-OFFSET and CURRENT-OFFSET. This is the consumer lag for that specific partition, indicating how many messages the consumer group is behind.
CONSUMER-ID: A unique identifier for the consumer instance within the group.
HOST: The host where the consumer instance is running.
CLIENT-ID: The client ID configured for the consumer instance.
Logs:
The primary source for troubleshooting is the Kafka broker log files (server.log), which are typically located in the logs directory of your Kafka installation. These logs contain information about broker startup, client connections, errors, and warnings.
Example:
tail -f /path/to/kafka/logs/server.log
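To narrow the output to likely problems, filtering for warnings and errors is often the fastest first step; a simple sketch:
grep -iE "error|warn" /path/to/kafka/logs/server.log | tail -n 50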
Metrics:
Checking metrics in Kafka often involves using Java Management Extensions (JMX) or dedicated monitoring tools. For the command line, Kafka provides some tools to get basic insights, particularly for consumer lag.
As covered above, kafka-consumer-groups.sh is your main tool for checking consumer lag from the command line.
Key metrics to monitor include:
Latency: The time it takes for a message to travel from a producer to a broker or from a broker to a consumer.
Consumer Lag: The number of messages a consumer group is behind.
In-Sync Replicas (ISR): The number of replicas that are fully synchronized with the leader partition. A low ISR count indicates potential data loss or availability issues.
Example of checking ISR:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic
Expected output example:
Topic: my_topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2
Topic: my_topic Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0
The table below covers five of the most frequently encountered errors, their likely causes, and practical steps for diagnosis and resolution, drawing on the error messages and metrics discussed above.
| Error Message | Common Cause | How to Diagnose and Fix |
| --- | --- | --- |
| OffsetOutOfRangeException | A consumer is trying to read from an offset that no longer exists in a topic partition. This often happens if the consumer has been offline for longer than the topic's retention period. | Use the kafka-consumer-groups.sh command to reset the consumer's offset to the earliest or latest available message. |
| BrokerNotAvailable | A client (producer or consumer) is unable to connect to a broker. This could be due to a broker being down or a network issue. | Check if the broker process is running and if the client's bootstrap.servers configuration is correct. Verify network connectivity using telnet or ping. |
| NotLeaderForPartitionException | A producer or consumer is trying to interact with a partition on a broker that is no longer the leader. | This is usually a temporary issue that resolves during a leader re-election. If it persists, check the health of the cluster and broker logs. |
| RecordTooLargeException | A producer is attempting to send a message that exceeds the maximum size configured on the broker or producer. | Increase the message.max.bytes configuration on the broker and the max.request.size on the producer. |
| RequestTimedOut | A request from a client to a broker took longer than the configured timeout period. This can be caused by network issues or an overloaded broker. | Increase the request.timeout.ms setting on the client. Monitor broker metrics such as CPU and network usage to identify bottlenecks. |
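For the OffsetOutOfRangeException fix above, the reset might look like the following sketch (the group and topic names match the earlier examples; the group must be inactive while resetting). Without --execute, the command performs a dry run and only prints the planned offsets:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group mycg1 \
  --topic my_topic --reset-offsets --to-earliest --execute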
For a robust monitoring solution, consider using a managed service such as Confluent Cloud, which provides built-in dashboards and alerting for these metrics.