Apache Kafka is an open-source, distributed event streaming platform used for building real-time data pipelines and streaming applications. Originally developed at LinkedIn and later open-sourced through the Apache Software Foundation, Kafka is designed to handle high-throughput, low-latency data feeds reliably and at scale.
What is Apache Kafka?
At its core, Apache Kafka is a publish-subscribe messaging system: producers send messages (events) to topics, and consumers read those messages in real time. However, Kafka goes beyond traditional messaging systems by offering the following:
- Scalability: Kafka clusters can scale horizontally by adding brokers and partitions.
- Durability: Messages are stored on disk and replicated across multiple brokers, ensuring persistence and fault tolerance.
- High Throughput: Kafka is capable of handling millions of messages per second with minimal latency.
- Fault Tolerance: Kafka clusters can survive server failures with no data loss.
Kafka is commonly used for:
- Log aggregation
- Real-time analytics
- Event sourcing
- Data integration between systems
- Monitoring and alerting systems
Key Concepts:
- Topic: A category or feed name to which records are published.
- Producer: An application that sends data to Kafka topics.
- Consumer: An application that reads data from Kafka topics.
- Broker: A Kafka server that stores and serves data.
- Partition: A topic is split into partitions to allow parallel processing.
- Zookeeper: Used for cluster coordination (though it is being phased out in favor of KRaft, Kafka's native Raft-based solution).
Apache Kafka Tutorial for Beginners
Downloading and setting up Apache Kafka involves several steps, but each one is simple.
Prerequisites to download Apache Kafka
Before you download Apache Kafka, make sure the following prerequisites are in place.
- Java (JDK 8 or Higher)
Apache Kafka is written in Java, so you need a compatible version of the Java Development Kit (JDK), ideally 8 or later.
- Install Java on Linux (Ubuntu/Debian):
sudo apt update
sudo apt install default-jdk -y
- Install Java on macOS (via Homebrew):
brew install openjdk@11
- Verify Java Installation:
java -version
Make sure to set the JAVA_HOME environment variable to point to your Java installation (this can vary depending on your OS).
- Apache ZooKeeper (Required for Kafka Versions Below 2.8)
Apache Kafka uses ZooKeeper to manage broker metadata and cluster coordination. For versions prior to Kafka 2.8, you need to have ZooKeeper installed and running. However, later versions also support KRaft mode.
- Install ZooKeeper on Linux (Ubuntu/Debian):
sudo apt install zookeeperd
- Start ZooKeeper:
sudo systemctl start zookeeper
- Check ZooKeeper status:
sudo systemctl status zookeeper
If you’re using Kafka 2.8 or newer with KRaft mode, you can skip this step since ZooKeeper is no longer needed.
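If you do plan to use KRaft mode, the broker manages its own metadata without ZooKeeper. A minimal single-node sketch, assuming a Kafka 3.x download that ships the example config/kraft/server.properties file (run these from inside the extracted Kafka directory):
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"   # generate a cluster ID
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties   # format the storage directory
bin/kafka-server-start.sh config/kraft/server.properties   # start the broker in KRaft mode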
- Sufficient Disk Space
Apache Kafka stores data on disk, so ensure that you have enough space for its logs. The amount required depends on the volume of data you intend to handle and on your retention settings; in general, Kafka needs at least a few gigabytes. For small-scale deployments, 50 to 100 GB is usually sufficient.
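A quick way to check available space and bound Kafka's disk usage (the retention values below are only examples; adjust them in config/server.properties for your setup):
df -h   # check free space on the filesystem that will hold Kafka's log.dirs
# Example retention settings in config/server.properties:
# log.retention.hours=168           # keep data for 7 days
# log.retention.bytes=10737418240   # or cap each partition at roughly 10 GB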
- Network Configuration
Kafka uses ports for communication between clients and brokers:

- Port 9092: Default communication port for Kafka brokers.
- Port 2181: Default port for ZooKeeper (if using Kafka versions prior to 2.8).
Ensure that these ports are open and not blocked by a firewall.
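On Ubuntu you can verify that the ports are free and open them in the firewall roughly like this (ufw is assumed; use your own firewall tooling if it differs):
ss -tlnp | grep -E ':9092|:2181'   # should print nothing until Kafka/ZooKeeper are running
sudo ufw allow 9092/tcp
sudo ufw allow 2181/tcp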
- Operating System
Kafka runs best on Unix-based systems such as Linux and macOS. On Windows, use the Windows Subsystem for Linux (WSL) or Docker.
- System Requirements
For basic use cases, the following resources are usually sufficient:
- CPU: At least 2 cores (4+ recommended for production)
- RAM: 8GB or more is preferred for optimal performance
- Disk I/O: Kafka performs better with high disk throughput. SSDs are recommended in production.
How to Download and Install Apache Kafka
Step 1: Install Java
Apache Kafka requires Java 8 or later to run. First, check if Java is already installed:
java -version
If Java is not installed, you can install the default JDK:
sudo apt update
sudo apt install default-jdk -y
After installation, verify:
java -version
Set the JAVA_HOME environment variable (optional but recommended):
echo "export JAVA_HOME=$(readlink -f /usr/bin/java | sed 's:/bin/java::')" >> ~/.bashrc
source ~/.bashrc
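As a quick sanity check that the variable is set in your current shell (the exact path will differ per system):
echo "$JAVA_HOME"
"$JAVA_HOME/bin/java" -version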
Step 2: Download Apache Kafka
Visit the official Kafka website: https://kafka.apache.org/downloads
Select the latest binary version for your Scala version. Then download it using wget:
wget https://downloads.apache.org/kafka/3.6.0/kafka_2.13-3.6.0.tgz
Extract the tarball:
tar -xzf kafka_2.13-3.6.0.tgz
cd kafka_2.13-3.6.0
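The extracted directory contains everything used in the next steps; a quick look confirms the CLI scripts and default configs are in place:
ls bin/     # kafka-server-start.sh, kafka-topics.sh, zookeeper-server-start.sh, ...
ls config/  # server.properties, zookeeper.properties, ...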
Step 3: Start ZooKeeper
Start ZooKeeper using the bundled configuration:
bin/zookeeper-server-start.sh config/zookeeper.properties
Leave this terminal running or use a terminal multiplexer like tmux or screen.
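Alternatively, the same script accepts a -daemon flag if you prefer to run ZooKeeper in the background:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties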
Step 4: Start the Kafka Server
Open a new terminal window and navigate to the Kafka directory. Start the Kafka broker:
bin/kafka-server-start.sh config/server.properties
Kafka should now be running and listening on port 9092.
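To confirm the broker is reachable, you can query it with one of the bundled CLI tools; this simply lists the API versions the broker at localhost:9092 supports:
bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092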
Step 5: Create a Kafka Topic
In another terminal, create a topic using the Kafka CLI:
bin/kafka-topics.sh --create \
--topic my-first-topic \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1
To verify the topic was created:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
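You can also inspect the topic's partition and replication layout:
bin/kafka-topics.sh --describe --topic my-first-topic --bootstrap-server localhost:9092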
Step 6: Produce Messages (Producer)
Start a producer to send messages to the topic:
bin/kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092
Type your messages and press Enter. These messages will be sent to Kafka.
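If you want to send keyed messages (the key determines which partition a record lands on), the console producer can parse a key from each line. A sketch, assuming a colon as the key separator:
bin/kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092 \
--property parse.key=true \
--property key.separator=:
Then type lines such as user42:logged_in, where user42 is the key and logged_in is the value.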
Step 7: Consume Messages (Consumer)
In another terminal, start a consumer to read the messages:
bin/kafka-console-consumer.sh --topic my-first-topic --from-beginning --bootstrap-server localhost:9092
You will see the messages typed into the producer console appear here.
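To see how consumer groups track their position, start the consumer with an explicit group and then inspect its offsets (my-first-group is just an example name):
bin/kafka-console-consumer.sh --topic my-first-topic --group my-first-group --bootstrap-server localhost:9092
bin/kafka-consumer-groups.sh --describe --group my-first-group --bootstrap-server localhost:9092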
Apache Kafka Common Use Cases
Common use cases for Apache Kafka include real-time analytics and activity tracking, but it is used in many other scenarios as well.
- Log Aggregation
Apache Kafka is used to collect and centralize logs from many different services and servers, delivering them in a consistent format to analytics and storage systems.
- Real-Time Analytics
Businesses use Apache Kafka to feed real-time data into their analytics tools, enabling immediate insights such as fraud detection, customer behaviour analysis, and performance monitoring.
- Event Sourcing
Apache Kafka stores events in chronological order, making it a natural fit for event-sourced architectures and for applications that need to replay events.
- Data Integration / ETL Pipelines
Apache Kafka also acts as a central data backbone, connecting data sources such as APIs and applications with sinks such as data warehouses and cloud services.
- Microservices Communication
Apache Kafka decouples microservices by enabling asynchronous communication, which improves reliability, fault isolation, and scalability.
- IoT Data Streaming
Kafka can handle huge loads of streaming data from IoT devices, sensors, and machines. It helps you process real-time telemetry for industries like manufacturing, agriculture, and smart cities.
- Website Activity Tracking
Many large websites, such as LinkedIn and Netflix, use Apache Kafka to track user activity such as clicks, searches, views, and interactions. This data powers personalization, recommendations, and A/B testing.
Wrapping Up – Apache Kafka
Apache Kafka is an excellent tool for a wide range of use cases, but using it well matters as much as installing it. This guide has walked you through downloading, installing, and taking your first steps with Kafka.
Is Apache Kafka free to use?
Yes, Apache Kafka is open-source and free to use under the Apache 2.0 license.
What programming languages support Kafka?
Kafka supports multiple languages including Java (official client), Python, Go, C++, and more through community and official clients.
Is Apache Kafka suitable for beginners?
While Kafka has a learning curve, beginners can start with simple use cases like log aggregation or stream processing by following well-documented tutorials.