Kafka namespaces are not directly supported in Apache Kafka, but there are two ways to implement namespace-like capability in Kafka. In this Kafka namespaces tutorial, we’ll cover both examples, history, options, why you might need namespaces, and much more. Let’s go.
A quick note on how this tutorial is configured.
In the beginning, I am going to set some context and give a brief history of Kafka namespaces, Kafka topic namespaces, and enforcing Kafka topic naming conventions.
Then, towards the end, there are two examples and a screencast demo. I’m trying something new with this demo. If you’re just looking for examples, feel free to skip to the section starting with “How To”.
Table of Contents
- Does Kafka support Namespaces?
- Kafka Namespaces History
- Why Kafka Namespaces?
- Kafka Namespacing in Multi-Tenant Kafka clusters
- Kafka Namespace for Tenant Monitoring and Possible Chargebacks
- How to implement Kafka Topic Namespaces?
- Kafka Namespaces Resources
Does Kafka support Namespaces?
Short answer: No. Longer answer: namespacing is not built into Apache Kafka, but it is possible to implement something which acts much like namespaces by using, and enforcing, a standard way to name topics.
Before getting into the details, let’s briefly cover some background and context first.
In Kafka, as you already know, data is divided and organized into topics.
But did you know you can make a namespace for a group of topics by adding a naming prefix or a suffix?
For example, if you want to namespace all the topics related to a certain project, you can use a prefix like “cdc.sqlserver0566.inventory.” or “tracking.website.clicks” or “push.mainframe.account”, etc. for all the topics related to the particular project or line of business.
Classic article on Kafka topic naming conventions https://cnr.sh/essays/how-paint-bike-shed-kafka-topic-naming-conventions
Anyhow, the point here is you can enforce topic naming conventions or “namespaces” which are shown below.
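To make the idea concrete, here is a minimal shell sketch of what a prefix-style namespace check amounts to. It only illustrates the convention; it is not how Kafka itself enforces anything, and the function name in_namespace is made up for this example.

```shell
# Hypothetical helper (not part of Kafka): succeeds when the topic name in $2
# starts with the namespace prefix in $1, and fails otherwise.
in_namespace() {
  prefix=$1
  topic=$2
  case "$topic" in
    "$prefix"*) return 0 ;;   # quoted prefix is matched literally, then anything
    *) return 1 ;;
  esac
}
```

For instance, in_namespace 'inventory-' 'inventory-example' succeeds, while in_namespace 'inventory-' 'invblablah-example' fails.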
Next, after deciding on a naming convention or namespace, we need to consider authorization and authentication.
Kafka has ACLs (Access Control Lists) and flexible rules to control who can do what with topics. Kafka ACLs let you limit access to certain topics to a certain group of users or programs. We can combine Kafka ACLs with our desired namespaces.
Kafka ACLs are covered in much more detail in the Kafka authorization tutorial, by the way.
So, to summarize the overview, even though Kafka does not have namespaces built in, you can achieve similar results by using naming conventions combined with access control lists (ACLs).
Next, let’s take a quick tour of history.
Kafka Namespaces History
From a historical perspective, there is an old KIP-37 and the associated JIRA KAFKA-2630, which describe a desired namespace implementation in Kafka.
They are both quite old and do not seem to have much chance of being implemented anytime soon. Please correct me if I’m wrong.
Of note, namespaces, which Kafka lacks, have been implemented by other streaming platforms such as Apache Pulsar. I’m not sure if that was on purpose or not, but namespaces in Pulsar are a different topic entirely.
All this means the approach shown below is most likely the best option for some time to come.
Why Kafka Namespaces?
There are many reasons you may want to use namespaces in Kafka.
- To organize topics. Namespaces may also prevent different projects or Kafka client applications from using the same names.
- Namespaces can make it easier for developers and administrators to understand and manage the Kafka environment by giving topics clear and consistent names.
- Using namespaces can also help improve Kafka’s security and access control. By putting topics in different namespaces, you can put different ACLs (Access Control Lists) on each namespace. This lets you control which users or applications can access which topics.
Overall, namespaces can help your Kafka environment be more organized, secure, and easy to manage, especially as it grows and gets more complicated. They can also help tremendously in multi-tenant Kafka clusters.
Kafka Namespacing in Multi-Tenant Kafka clusters
In a multi-tenant environment where different teams or customers share the same Kafka cluster, Kafka topic namespacing can be especially effective when implementing risk mitigation strategies.
For example, without proper authorization isolation, one tenant could accidentally, or on purpose, read messages from, or write messages to, the topics of another tenant. This could cause data to leak or become corrupted.
By using namespaces, you can clearly separate the topics and data streams of different tenants. This makes it easier to manage and keep track of their activity in the cluster. Since Kafka ACLs support prefixed resource patterns, you can put different Access Control Lists (ACLs) on each namespace topic prefix. This lets you control which tenants or users can access which topics in a consistent and easier-to-manage manner.
Also relevant here is using Kafka Quotas in multi-tenant clusters.
In multi-tenant clusters, using namespaces can make it easier to manage and fix problems in a Kafka cluster. By enforcing topics with clear and consistent names, namespaces can make it easier for developers and administrators to understand and navigate the Kafka environment, even as it gets more complicated with more tenants and topics over time.
An additional benefit of using topic namespaces in multi-tenant Kafka clusters is that it is much easier to monitor usage across applications, which will be covered next.
Kafka Namespace for Tenant Monitoring and Possible Chargebacks
Another benefit of implementing Kafka topic namespacing is how it can make monitoring and measuring of Kafka tenants much more straightforward. What do I mean?
Well, for example, you may wish to monitor the ingress and egress bytes of tenants. Or, you may wish to monitor and measure the total storage used by tenants.
By implementing Kafka topic namespacing, this kind of monitoring and measuring becomes much simpler and more achievable because you have a naming convention by which you can separate and organize.
You can determine the total storage of all topics with names starting with “cdc.database456.” for example. I’m sure you get the idea.
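As a rough sketch of that idea, suppose you already have a list of topic names and their sizes in bytes, one "topic bytes" pair per line. That input format is an assumption for illustration; how you collect the sizes (for example from the output of kafka-log-dirs.sh or from your metrics system) is up to you. The hypothetical helper below totals the bytes for one namespace prefix.

```shell
# Hypothetical helper: reads "topic bytes" lines on stdin and prints the total
# bytes for topics whose name starts with the prefix given as $1.
namespace_bytes() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 { total += $2 }   # prefix must match at position 1
    END { printf "%.0f\n", total }           # prints 0 when nothing matched
  '
}
```

With input lines such as "cdc.database456.orders 100" and "cdc.database456.users 250", calling namespace_bytes 'cdc.database456.' prints the sum for that namespace only and ignores topics in other namespaces.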
Why Monitor Kafka namespaces?
As briefly mentioned, you may wish to monitor certain attributes of your namespaces in Kafka. But, let’s name a few specific reasons why.
- You want to compare and contrast your tenants’ usage to ensure fairness, plan capacity, or monitor a certain SLA.
- You may want to implement chargebacks. Meaning, you may wish to charge back to certain tenants based on usage. This may be for technical reasons or more often, financial accounting reasons.
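For chargebacks, a per-tenant rollup could look something like the sketch below. It assumes the tenant is the first dot-separated segment of the topic name and that usage arrives as "topic bytes" lines on stdin; both are assumptions for illustration, and the function name usage_by_tenant is hypothetical.

```shell
# Hypothetical helper: reads "topic bytes" lines on stdin and prints one
# "tenant total_bytes" line per tenant, sorted by tenant name.
# Assumes the tenant is the first dot-separated segment of the topic name.
usage_by_tenant() {
  awk '
    { split($1, parts, "."); total[parts[1]] += $2 }
    END { for (t in total) printf "%s %.0f\n", t, total[t] }
  ' | sort
}
```

The resulting per-tenant totals could then be fed into whatever billing or reporting process you use.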
How to implement Kafka Topic Namespaces?
At the time of this writing, there are two methods to enforce a topic name convention, or Kafka namespaces, in Kafka.
- Use prefix ACLs as described in KIP-290
- Define a custom CreateTopicPolicy class as described in KIP-108
I’m going to describe the exact steps and provide source code for both of these examples next. But before going through all the written steps below, it may be helpful to watch this quick screencast of me demonstrating both of the examples.
Before watching, I need to ask. Do you like to have fun? I do. Let’s have fun in this screencast by playing a game I like to call “what kind of fish is that?” You’ll see what I mean.
Kafka Namespaces Demo Examples Requirements
- docker-compose
- Docker
- A Kafka distribution downloaded, so you have access to the shell scripts in the bin/ directory
- Clone the kafka-examples Github repo (link below), or the topic-namespaces/ directory from the repo specifically
- [Optional] a willingness to see pictures of me with a mystery fish (see screencast above)
Kafka Topic Namespaces with Prefix ACLs Example
This is an example leveraging KIP-290. While this example shows the process end-to-end, the key thing to watch for is the kafka-acls.sh arguments shown in step 4.
I go through all these exact steps in the screencast video afterwards.
1. Clone the kafka-examples repo (Github link below) and open a shell in the topic-namespaces/ directory
2. docker-compose -f kafka-namespaces-examples.yml up -d
3. export KAFKA_HOME=~/dev/kafka_2.13-2.8.1; export BOOT=localhost:9092 (according to your environment)
4. $KAFKA_HOME/bin/kafka-acls.sh --bootstrap-server $BOOT --add --allow-principal User:alice --operation Create --topic 'inventory-' --resource-pattern-type prefixed --force --command-config admin-client.properties
5. Try $KAFKA_HOME/bin/kafka-topics.sh --create --topic invblablah-example --bootstrap-server localhost:9092 --command-config ./alice-client.properties and it does not work
6. $KAFKA_HOME/bin/kafka-topics.sh --create --topic inventory-example --bootstrap-server localhost:9092 --command-config ./alice-client.properties and this works
7. By the way, $KAFKA_HOME/bin/kafka-topics.sh --list --bootstrap-server localhost:9092 --command-config ./alice-client.properties does not show any topics? Why not?
8. Need to allow Describe. For example, $KAFKA_HOME/bin/kafka-acls.sh --bootstrap-server $BOOT --add --allow-principal User:alice --operation Describe --topic 'inventory-' --resource-pattern-type prefixed --force --command-config admin-client.properties
9. Now the topics show up with $KAFKA_HOME/bin/kafka-topics.sh --list --bootstrap-server localhost:9092 --command-config ./alice-client.properties
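If you find yourself repeating the Create and Describe grants from the steps above for each new principal, they could be wrapped in a small helper along these lines. This is only a sketch: the function name grant_namespace is hypothetical, and it assumes KAFKA_HOME, BOOT, and admin-client.properties are set up exactly as in the steps above.

```shell
# Hypothetical helper: grant Create and Describe on every topic whose name
# starts with the given prefix. Assumes KAFKA_HOME and BOOT are exported and
# admin-client.properties is in the current directory, as in the steps above.
grant_namespace() {
  principal=$1   # e.g. User:alice
  prefix=$2      # e.g. inventory-
  for op in Create Describe; do
    "$KAFKA_HOME/bin/kafka-acls.sh" --bootstrap-server "$BOOT" --add \
      --allow-principal "$principal" --operation "$op" \
      --topic "$prefix" --resource-pattern-type prefixed \
      --force --command-config admin-client.properties || return 1
  done
}
```

Usage would then be a single call such as grant_namespace User:alice inventory- per tenant.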
By the way, there are two things to note about the Kafka cluster used in this example. If you open up the kafka-namespaces-examples.yml file, note there are two required settings:
KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "false"
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
As already mentioned, it is a requirement to have both of these config variables set to false. The names here follow the conventions of the Docker image, but they should translate easily to your environment, for example allow.everyone.if.no.acl.found and auto.create.topics.enable in the broker configuration.
Also, remember folks, lock down your Zookeeper, because it is possible to bypass all of this. For example, switching to use --zookeeper instead of --bootstrap-server will succeed:
$KAFKA_HOME/bin/kafka-topics.sh --create --topic invsadfsadfentory-example --zookeeper localhost:2181 --partitions 3 --replication-factor 1 --command-config ./alice-client.properties
Kafka Topic Naming Convention with Custom Class
Personally, I like the previous example better than this one, because I can control access by the requesting principal. But, for completeness, here is another approach, which uses a custom class to enforce Kafka topic naming conventions.
The custom class is in src/main/java/com/supergloo/ and is called CustomTopicPolicy.
I go through how to compile, set configuration, run, and verify this example in the screencast above.
If you are going to try this, you’ll need Gradle installed and be able to compile Java.
Ok, here we go.
1. Open a terminal and go to the topic-namespaces/ directory
2. gradle jar
3. Uncomment two lines from the kafka-namespaces-examples.yml compose file. The two lines are shown below.
4. Stop, if running, and then start the environment with docker-compose -f kafka-namespaces-examples.yml up -d so these config changes are picked up.
5. If not set already, export KAFKA_HOME=~/dev/kafka_2.13-2.8.1; export BOOT=localhost:9092
6. Attempt to create a topic with a name which doesn’t conform to the policy; i.e. $KAFKA_HOME/bin/kafka-topics.sh --create --topic invblablah-example --bootstrap-server localhost:9092 --command-config ./admin-client.properties
7. $KAFKA_HOME/bin/kafka-topics.sh --create --topic inventory-example --bootstrap-server localhost:9092 --command-config ./admin-client.properties
The following two lines from kafka-namespaces-examples.yml were uncommented:
KAFKA_CREATE_TOPIC_POLICY_CLASS_NAME: "com.supergloo.CustomTopicPolicy"
and
- ./app.jar:/opt/kafka/libs/app.jar
Also, notice how the last two commands were run as the admin superuser. If we were to try with alice, the principal wouldn’t have permission to create a new topic.
My container running Kafka has id 973f40a76e20 and when I look at the logs, I can see output for the CustomTopicPolicy