Python Kafka in Two Minutes. Maybe Less.


Although Apache Kafka is written in Java, there are Python Kafka clients available for use with Kafka.  In this tutorial, let’s go through examples of Kafka with Python Producer and Consumer clients. 

Let’s consider this a “Getting Started” tutorial. 

After completing this, you will be ready to proceed to more complex examples.  But we need to start somewhere, so let’s get started now.

After you complete this (in under two minutes?), let me know if you’d like to cover any other topics of using Python with Kafka.


Kafka Python Overview

To use Python with Kafka, you will need to install a Python Kafka client library. There are several options available, but the Python client we are going to use here is pykafka. PyKafka and the others all aim to provide a high-level, easy-to-use interface for working with Kafka in Python. I’ll list a few other Python clients in a section below.

With pykafka and others, we write Python code to produce and consume messages from Kafka topics. Depending on the client, you may also be able to perform other operations such as creating topics, modifying topic configurations, and monitoring the health of the Kafka cluster. No surprise there I hope.

Kafka Python client examples

Python with Kafka Producer Example

Here is a simple example of how to use pykafka to produce a message to a Kafka topic in a Python program:

from pykafka import KafkaClient

# Create a Kafka client
client = KafkaClient(hosts="localhost:9092")

# Get the 'testie' topic
topic = client.topics['testie']

# Create a producer
producer = topic.get_sync_producer()

# Send a message to the topic
producer.produce(b"Hello Kafka!  Coming to you live from Python Kafka")

I’m going to show how to run this code later in this tutorial, so I saved the example above in a file called producer.py.
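If you want to send more than one message, here is a minimal sketch of one way to do it. It assumes the same broker address and 'testie' topic as above. The helper names encode_messages and send_all are mine, purely for illustration. Kafka payloads are bytes, so the text is encoded first, and the producer is used as a context manager so it gets stopped when the block exits.

```python
def encode_messages(lines):
    """Kafka payloads are bytes, so encode each text line as UTF-8."""
    return [line.encode("utf-8") for line in lines]


def send_all(payloads, hosts="localhost:9092", topic_name="testie"):
    """Send every payload to the topic using a synchronous producer."""
    # Imported here so encode_messages() works even without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    # The context manager stops the producer when the block exits.
    with topic.get_sync_producer() as producer:
        for payload in payloads:
            producer.produce(payload)


# Example call (needs a running broker):
# send_all(encode_messages(["first", "second", "third"]))
```

The sync producer waits for each message to be acknowledged before returning, which is simple but not the fastest option.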

Python Consumer Example

And now, here is an example of using Python to consume messages from the Kafka topic used in the previous example:

from pykafka import KafkaClient

client = KafkaClient(hosts="localhost:9092")

# Get the 'testie' topic
topic = client.topics['testie']

# Create a consumer
consumer = topic.get_simple_consumer()

# Iterate through messages in the topic
for message in consumer:
    if message is not None:
        print(message.offset, message.value)

Again, I saved this file too. I named the file consumer.py but you can name it whatever you like. You could name it shazamo.py if you’d like. Your call.
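The consumer loop above keeps waiting for new messages until you interrupt it. Here is a hedged sketch of a variant that stops on its own after a quiet period. It relies on pykafka's consumer_timeout_ms option, which, as I understand the pykafka docs, ends iteration once no message has arrived for that many milliseconds. The names format_message and drain_topic and the 5 second timeout are my own choices for illustration.

```python
def format_message(offset, value):
    """Render one consumed message; pykafka hands back the value as bytes."""
    return f"{offset}: {value.decode('utf-8', errors='replace')}"


def drain_topic(hosts="localhost:9092", topic_name="testie", timeout_ms=5000):
    """Print messages from the topic, then return once the topic goes quiet."""
    # Imported here so format_message() works even without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    # Stop iterating after timeout_ms without a new message.
    consumer = topic.get_simple_consumer(consumer_timeout_ms=timeout_ms)
    try:
        for message in consumer:
            if message is not None:
                print(format_message(message.offset, message.value))
    finally:
        consumer.stop()


# Example call (needs a running broker):
# drain_topic()
```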

Python with Kafka Demo Examples

Let’s run the Python producer and consumer examples. Again, to make this as simple as possible, I’m going to run it against a single node Kafka cluster running in Docker. I’ve used this example numerous times in previous Kafka tutorials on this site including the Kafka Test Data Generation tutorial, so I’m going to be quick in order to keep this whole tutorial in under two minutes.

  1. Make sure Docker is running and docker-compose is installed
  2. Make sure Python is installed and available from CLI
  3. git clone https://github.com/conduktor/kafka-stack-docker-compose.git
  4. cd kafka-stack-docker-compose
  5. docker-compose -f zk-single-kafka-single.yml up -d
  6. pip3 install pykafka
  7. python3 producer.py
  8. python3 consumer.py
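If the producer or consumer seems to hang, the broker may not be up yet. Here is a small sketch, using only the Python standard library, that checks whether anything is listening on the broker port. It assumes the broker is exposed on localhost:9092 as in the examples above. The function name broker_reachable is hypothetical. Note that it only confirms a TCP connection can be opened, not that Kafka is fully ready.

```python
import socket


def broker_reachable(host="localhost", port=9092, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example call:
# print("Broker reachable:", broker_reachable())
```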

For example, when I ran the Kafka producer 3 times and then the Python consumer, it looked like the following:

[Screenshot: Python with Kafka demo examples]

Ctrl-C to exit.

The source code is available from the Supergloo Kafka Examples GitHub repo.

What are available options for Python clients with Kafka?

There are several alternatives to pykafka available for working with Kafka from Python.

kafka-python: This is a pure Python client for Kafka providing an interface for producing and consuming messages. It also includes an admin client and supports security features such as SSL and SASL authentication.

confluent-kafka-python: This is a lightweight wrapper around the official Kafka C/C++ client library, librdkafka. It provides a high-level Producer, Consumer and AdminClient compatible with all Apache Kafka brokers >= version 0.8 as well as Confluent Cloud and Platform.

aiokafka: This is an asyncio-based Kafka client that is built on top of the kafka-python library. It provides support for features like automatic batching and compression.
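To give a feel for how similar these clients are, here is a rough sketch of the earlier producer example using kafka-python instead of pykafka. It assumes the same localhost:9092 broker and 'testie' topic. The helper names encode_utf8 and produce_with_kafka_python are mine. Treat it as a starting point rather than a recommended setup.

```python
def encode_utf8(text):
    """Serializer: turn a str into the bytes Kafka expects."""
    return text.encode("utf-8")


def produce_with_kafka_python(text, servers="localhost:9092", topic="testie"):
    """Send one text message with kafka-python (pip3 install kafka-python)."""
    # Imported here so encode_utf8() works even without kafka-python installed.
    from kafka import KafkaProducer

    # value_serializer is applied to every value passed to send().
    producer = KafkaProducer(bootstrap_servers=servers,
                             value_serializer=encode_utf8)
    producer.send(topic, text)
    # send() is asynchronous; flush() blocks until queued messages are sent.
    producer.flush()
    producer.close()


# Example call (needs a running broker):
# produce_with_kafka_python("Hello from kafka-python")
```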

Did I miss any? Let me know in the comments below.

What are pros and cons of using Python with Kafka?

Using Python with Kafka has a number of pros and cons including:

Pros

Python is a popular, easy-to-learn programming language which makes it a good choice for building Kafka-based applications.

Python has a large and active developer community. This translates to a wide variety of available libraries, frameworks, and tools when working with Kafka in Python. This also makes it easier to find support when working with Kafka in Python.

Python Kafka clients provide a high-level, easy-to-use interface for working with Kafka. This makes it easier to get started with Kafka in Python.

Cons

The usual knock on Python is that it is not as performant as some other programming languages. This means Python-based Kafka applications will likely not be as fast as those built in statically typed, compiled languages.

Some Python Kafka clients, such as pykafka, are implemented in pure Python by default, which can make them slower than clients that wrap a C library such as librdkafka.
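Rather than guess whether Python is fast enough for your case, you can measure it. Here is a minimal timing sketch using only the standard library. The function measure_throughput is hypothetical, and the list used as a sink is a stand-in so the sketch runs without a broker. For a real measurement you would pass a producer's produce method instead.

```python
import time


def measure_throughput(send, payload, count=1000):
    """Call send(payload) `count` times and return messages per second."""
    start = time.perf_counter()
    for _ in range(count):
        send(payload)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Stand-in sink: appending to a list instead of sending to Kafka.
sent = []
rate = measure_throughput(sent.append, b"x" * 100, count=1000)
print(f"{len(sent)} messages, about {rate:.0f} msg/s")
```

Numbers from a stand-in like this say nothing about Kafka itself. The point is only to have a repeatable way to compare clients or settings on your own hardware.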

Closing Thoughts

Overall, whether or not using Python with Kafka is a good fit for your application depends on your needs and requirements.

If you are building a simple, low-throughput Kafka application which doesn’t require high performance, Python can be a great choice.

Under two minutes?

About Todd M

Todd has held multiple software roles over his 20 year career. For the last 5 years, he has focused on helping organizations move from batch to data streaming. In addition to the free tutorials, he provides consulting, coaching for Data Engineers, Data Scientists, and Data Architects. Feel free to reach out directly or to connect on LinkedIn
