Although Apache Kafka is written in Java and Scala, Python clients are available for it. In this tutorial, let’s go through examples of Kafka Producer and Consumer clients written in Python.
Let’s consider this a “Getting Started” tutorial.
After completing this, you will be ready to proceed to more complex examples. But we need to get started some place, so let’s get started now.
After you complete this (in under two minutes?), let me know if you’d like to cover any other topics of using Python with Kafka.
Table of Contents
- Kafka Python Overview
- Python with Kafka Producer Example
- Python Consumer Example
- Python with Kafka Demo Examples
- What are available options for Python clients with Kafka?
- What are pros and cons of using Python with Kafka?
- Closing Thoughts
Kafka Python Overview
To use Python with Kafka, you will need to install a Python Kafka client library. There are several options available, but the Python client we are going to use here is pykafka. PyKafka and the others all aim to provide a high-level, easy-to-use interface for working with Kafka in Python. I’ll list a few other Python clients in a section below.
With pykafka and others, we write Python code to produce and consume messages from Kafka topics. Depending on the client, we can also perform other operations such as creating topics, modifying topic configurations, and monitoring the health of the Kafka cluster. No surprise there I hope.
Python with Kafka Producer Example
Here is a simple example of how to use pykafka to produce a message to a Kafka topic in a Python program:
    from pykafka import KafkaClient

    # Create a Kafka client
    client = KafkaClient(hosts="localhost:9092")

    # Get the 'testie' topic
    topic = client.topics['testie']

    # Create a producer
    producer = topic.get_sync_producer()

    # Send a message to the topic
    producer.produce(b"Hello Kafka! Coming to you to live from Python Kafka")
I’m going to show how to run this code later in this tutorial, so I saved this example in a file called produce.py.
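The producer above sends a hard-coded byte string. pykafka's produce() expects bytes, so structured data has to be serialized before it is sent. Here is a minimal sketch of one way to do that with JSON. The helper names encode_event and send_events are my own for illustration and are not part of pykafka, and the producer argument is assumed to be anything with a produce(bytes) method, such as the object returned by topic.get_sync_producer().

```python
import json


def encode_event(event):
    """Serialize a dict to UTF-8 JSON bytes (hypothetical helper, not part of pykafka)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def send_events(producer, events):
    """Send each event through any object exposing produce(bytes).

    Returns the number of events handed to the producer.
    """
    sent = 0
    for event in events:
        producer.produce(encode_event(event))
        sent += 1
    return sent
```

With the client from the example above, a call might look like send_events(topic.get_sync_producer(), [{"id": 1, "msg": "hello"}]).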
Python Consumer Example
And now, here is an example of using Python to consume messages from the Kafka topic used in the previous example:
    from pykafka import KafkaClient

    client = KafkaClient(hosts="localhost:9092")

    # Get the 'testie' topic
    topic = client.topics['testie']

    # Create a consumer
    consumer = topic.get_simple_consumer()

    # Iterate through messages in the topic
    for message in consumer:
        if message is not None:
            print(message.offset, message.value)
Again, I saved this file too. I named the file consumer.py, but you can name it whatever you like. You could name it shazamo.py if you’d like. Your call.
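The consumer loop above prints raw bytes and keeps running until you stop it. If you would rather collect a fixed number of messages as text, a sketch like the following may help. The read_messages helper is hypothetical, not part of pykafka. It assumes the consumer yields objects with offset and value (bytes) attributes, as in the example above, and that the values are UTF-8 text.

```python
def read_messages(consumer, limit=10):
    """Collect up to `limit` (offset, text) pairs from an iterable of messages.

    Hypothetical helper: `consumer` is assumed to yield objects with
    `.offset` and `.value` (bytes), or None when nothing was received.
    """
    results = []
    if limit <= 0:
        return results
    for message in consumer:
        if message is None:
            continue  # nothing received on this iteration
        results.append((message.offset, message.value.decode("utf-8")))
        if len(results) >= limit:
            break  # stop without pulling further messages
    return results
```

pykafka's get_simple_consumer accepts a consumer_timeout_ms argument, so a call such as read_messages(topic.get_simple_consumer(consumer_timeout_ms=5000), limit=3) should return even if fewer than three messages are waiting in the topic.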
Python with Kafka Demo Examples
Let’s run the Python producer and consumer examples. Again, to make this as simple as possible, I’m going to run it against a single node Kafka cluster running in Docker. I’ve used this example numerous times in previous Kafka tutorials on this site including the Kafka Test Data Generation tutorial, so I’m going to be quick in order to keep this whole tutorial in under two minutes.
- Make sure Docker is running
- Make sure Python is installed and available from CLI
- git clone https://github.com/conduktor/kafka-stack-docker-compose.git
- cd kafka-stack-docker-compose
- docker-compose -f zk-single-kafka-single.yml up -d
- pip3 install pykafka
- python3 produce.py
- python3 consumer.py
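If produce.py hangs or fails to connect, it can help to confirm that something is listening on the broker port before digging further. The following is a small sketch using only the standard library. The broker_reachable name is my own, and localhost:9092 is assumed because it is the address used in the examples above. Note that a successful TCP connection only shows the port is open, not that Kafka has finished starting.

```python
import socket


def broker_reachable(host="localhost", port=9092, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling print(broker_reachable()) after the docker-compose step gives a quick yes or no before you run the producer.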
For example, when I ran the Kafka producer 3 times and then the Python consumer, it looked like the following:
Ctrl-C to exit.
The source code is available from Supergloo Kafka Examples Github repo.
What are available options for Python clients with Kafka?
There are several alternatives to pykafka for working with Kafka from Python.
kafka-python: This is a pure Python client for Kafka providing an interface for producing and consuming messages. It also includes an admin client for operations such as creating topics, and it supports Kafka security features such as SSL and SASL.
confluent-kafka-python: This is a lightweight wrapper around the official Kafka C/C++ client library, librdkafka. It provides a high-level Producer, Consumer and AdminClient compatible with all Apache Kafka brokers >= version 0.8 as well as Confluent Cloud and Platform.
aiokafka: This is an asyncio-based Kafka client that is built on top of the kafka-python library. It provides support for features like automatic batching and compression.
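For comparison with the pykafka producer shown earlier, here is a rough sketch of sending one message with kafka-python instead. The wrapper function produce_with_kafka_python is my own naming for illustration. It assumes kafka-python is installed (pip3 install kafka-python) and that the broker is at localhost:9092 as in the demo.

```python
def produce_with_kafka_python(topic, payload, bootstrap_servers="localhost:9092"):
    """Send one bytes payload to `topic` using kafka-python."""
    # Imported lazily so this file still loads when kafka-python is not installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        producer.send(topic, payload)  # asynchronous; returns a future
        producer.flush()               # block until buffered messages are sent
    finally:
        producer.close()
```

A call such as produce_with_kafka_python("testie", b"Hello from kafka-python") would write to the same topic used in the demo, so the pykafka consumer should pick it up.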
Did I miss any? Let me know in the comments below.
What are pros and cons of using Python with Kafka?
Using Python with Kafka has a number of pros and cons, including:
Pros:
- Python is a popular, easy-to-learn programming language, which makes it a good choice for building Kafka-based applications.
- Python has a large and active developer community. This translates to a wide variety of available libraries, frameworks, and tools when working with Kafka in Python. This also makes it easier to find support when working with Kafka in Python.
- Python Kafka clients provide a high-level, easy-to-use interface for working with Kafka. This makes it easier to get started with Kafka in Python.
Cons:
- The usual knock on Python is that it is not as performant as some other programming languages. This means Python-based Kafka applications will likely not be as fast as those built in compiled, statically typed languages.
- Some Python Kafka clients, such as pykafka, are implemented in pure Python by default, which can make them slower than clients built on a native library such as librdkafka.
Overall, whether or not using Python with Kafka is a good fit for your application depends on your needs and requirements.
If you are building a simple, low-throughput Kafka application which doesn’t require high performance, Python can be a great choice.
Under two minutes?