Kafka Test Data Generation Examples

Kafka Test Data Generation

After you start working with Kafka, you will soon find yourself asking the question, “how can I generate test data into my Kafka cluster?”  Well, I’m here to show you have many options for generating test data in Kafka.  In this post and demonstration video, we’ll cover a few of the ways you can generate test data into your Kafka cluster.

Now, before we begin, let’s cover a possible edge case.  If you are wondering about test data in Kafka Streams applications, you might find my previous post on testing Kafka Streams helpful. Well, also, I might find it helpful if you read it and comment on it too.

With that out of the way, let’s go through a few of your options.  I’ll cover ways to generate test data in Kafka from both Apache Kafka and Confluent Platform.

Kafka Test Data Screencast (AKA: Big Time TV Show)

Check out the screencast below to see a demo of examples using kafkacat, Kafka Connectors Datagen and Voluble and finally, ksql-datagen

Part 1 with Kafkacat

Our first example utilizes the kafkacat which is freely available at https://github.com/edenhill/kafkacat/

Here are the steps (more or less) in the above screencast

  1. Start Zookeeper and Kafka on localhost
  2. kafkacatis installed and in my path
  3. cat /var/log/system.log | kafkacat -b localhost:9092 -t syslog
  4. kafkacat -b localhost:9092 -t syslog -J
  5. curl -s “http://api.openweathermap.org/data/2.5/weather?q=Minneapolis,USA&APPID=my-key-get-your-own” |\
    kafkacat -b localhost:9092 -t minneapolis_weather -P
  6. kafkacat -b localhost:9092 -t minneapolis_weather
  7. Show other fun, good time resources such as Mockeroo and JSON-server

Test Data with Apache Kafka Connect Options

There are a couple of available Kafka Connect source connectors to assist in generating test data into Kafka.   There is the Kafka Connect Datagen connector which has been around for a while.  The Datagen connector includes two quickstart schemas to ahh, well, you know, get you started quickly.  See the Reference section below for the link.

In the screencast, I showed how both connectors are already installed.

Next, run some commands such as

  1. confluent local config datagen-pageviews — -d ./share/confluent-hub-components/confluentinc-kafka-connect-datagen/etc/connector_pageviews.config (your path might be different)
  2. kafkacat -b localhost:9092 -t pageviews

Next, we switched to another option for Kafka Connect based Kafka mock (or stub) data generation is a connector called Voluble.  I like how it integrates the Java Faker project which provides support for creating cross-topic relationships such as seen the examples

'genkp.users.with' = '#{Name.full_name}'
'genvp.users.with' = '#{Name.blood_group}'

'genkp.publications.matching' = 'users.key'
'genv.publications.title.with' = '#{Book.title}'

See how the users.keyis referenced in the above example.  Anyhow, much more documentation available from Github repo in the link below.

Steps with Voluble

  1. Listed topics kafka-topics – list – bootstrap-server localhost:9092
  2. Then, I loaded using a sample properties file found in my Github repo.  See the Resources below.
  3. confluent local load voluble-source – -d voluble-source.properties (bonus points and a chance to join me on a future Big Time TV Show if you post how to load it in vanilla Kafka in the comments below.  )
  4. kafka-topics – list – bootstrap-server localhost:9092
  5. kafkacat -b localhost:9092 -t owners

Kafka Test Data in Confluent Platform

If you are a user of the Confluent Platform, you have an easy button available from the CLI with ksql-datagentool.  It has a couple of quickstart schemas to get you rolling quickly as shown in the following screencast

Quickly, let’s run through the following commands

  1. ksql-datagen quickstart=orders format=avro topic=orders maxInterval=100
  2. confluent local consume orders – – value-format avro – from-beginning
  3. kafkacat -b localhost:9092 -t orders -s avro -r http://localhost:8081

Resources and Helpful References


Featured image credit https://pixabay.com/photos/still-life-bottles-color-838387/

Leave a Reply

Your email address will not be published. Required fields are marked *