To implement a multi-tenant Kafka architecture, several requirements need to be addressed to improve your chances of success. In this post, we will list and describe four requirements for multi-tenant Kafka architectures, plus one optional benefit they make possible. Whether that final benefit is interesting will depend on your circumstances.
I see more and more folks moving away from big, multi-tenant Kafka clusters towards multiple smaller, isolated Kafka clusters, with replication between clusters where required. Nonetheless, there are still people working to operate their existing multi-tenant clusters effectively.
In this post, let’s list the key considerations and briefly describe each. For each, I’ll point to other tutorials on this site where you can learn more. Let me know if you have other ideas or suggestions.
Table of Contents
- Kafka Multi Tenant Architecture Overview
- Kafka Authentication
- Kafka Authorization
- Kafka Quotas
- Kafka Topic Namespaces
- Kafka Tenant Monitoring and Possible Charge Backs [Optional]
- Kafka Multi Tenant Summary
- Kafka Multi Tenant FAQs
- Kafka Multi Tenant Further Resources
Kafka Multi Tenant Architecture Overview
When operating a Kafka multi-tenant architecture, there are four implementation requirements: 1) authentication, 2) authorization, 3) quotas, and 4) topic namespaces. There is also a fifth item on my list, 5) tenant-based monitoring, and it can be argued whether that one is optional.
Let’s get into each one of these requirements and why it is important.
Kafka Authentication
Because Kafka is used to create streaming applications and real-time data pipelines from a variety of sources, it can send and receive sensitive data. Naturally, this must be protected from unwanted access. To ensure only authenticated users and apps can access the data, Kafka offers a variety of authentication and authorization mechanisms.
In Kafka, authentication is crucial for a number of reasons. First, it makes sure that only approved users and programs can access the Kafka cluster. That’s easy. Second, authentication provides accountability: once every client has a verified identity, logs such as the authorizer log can show who accessed the data and when. This aids in meeting audit and compliance standards. Third, authentication makes it possible for administrators to impose access restrictions on specific topics and other resources, enabling fine-grained access control. In other words, it is a prerequisite for the next requirement, Kafka authorization.
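To make this concrete, here is a minimal Python sketch of what per-tenant credentials might look like on the client side. It renders Java-client style properties for SASL over TLS. The property names (`security.protocol`, `sasl.mechanism`, `sasl.jaas.config`) are standard Kafka client settings, but the helper `render_client_properties`, the tenant user name, and the password are hypothetical, and a real deployment would load secrets from a secret store rather than hard-code them.

```python
def render_client_properties(username: str, password: str,
                             mechanism: str = "SCRAM-SHA-512") -> str:
    """Render Java-client properties for one tenant's SASL_SSL login."""
    # Map the SASL mechanism to the login module shipped with Apache Kafka.
    if mechanism in ("SCRAM-SHA-256", "SCRAM-SHA-512"):
        module = "org.apache.kafka.common.security.scram.ScramLoginModule"
    elif mechanism == "PLAIN":
        module = "org.apache.kafka.common.security.plain.PlainLoginModule"
    else:
        raise ValueError(f"mechanism not covered by this sketch: {mechanism}")
    # This sketch does not escape quotes, so refuse values that would need it.
    for value in (username, password):
        if '"' in value or "\\" in value:
            raise ValueError("quotes and backslashes are not handled here")
    jaas = f'{module} required username="{username}" password="{password}";'
    return "\n".join([
        "security.protocol=SASL_SSL",
        f"sasl.mechanism={mechanism}",
        f"sasl.jaas.config={jaas}",
    ]) + "\n"


# Hypothetical tenant credentials, for illustration only.
print(render_client_properties("tenant-a-app", "change-me"))
```

Giving each tenant application its own user name is what later lets ACLs and quotas be attached to that identity.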
There are a few tutorials on Kafka Authentication on this site, but let me know if you want more or have suggestions for others.
Kafka Authorization
In multi-tenant setups, effective authorization is required to ensure authenticated users and applications can access only what they need and nothing else.
The practice of granting, or refusing, access to particular topics based on preset policies is known as Kafka authorization. Kafka can offer a number of authorization mechanisms, such as:
- Access Control Lists (ACLs): A straightforward method in which administrators specify a list of users or groups and the permissions they have on particular topics and other resources.
- Role-Based Access Control (RBAC): This more involved technique lets administrators define roles and assign them to individuals or groups of people. The permissions attached to each role specify which operations are permitted on particular topics or other resources. RBAC is not offered in the community version of Apache Kafka, but proprietary options are available.
In multi-tenant Kafka clusters, authorization is essential for a number of reasons. First, it makes sure that each tenant can only view the information to which they have been granted access, shielding important data from unwanted access. Second, by limiting who has access to the data, administrators can enforce compliance and auditing requirements. Third, it provides fine-grained access control, which enables administrators to limit access to particular topics based on established criteria.
In essence, authorization is what ensures that each tenant in a multi-tenant Kafka cluster can only access the data they are authorized to view.
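As a rough illustration of the ACL approach, the Python sketch below builds the argument list for Kafka's `kafka-acls.sh` tool to allow one principal a set of operations on one topic. The flags are the tool's own, but the helper name `acl_command`, the broker address, the principal, the topic, and the `admin.properties` file are assumptions made for the example.

```python
from __future__ import annotations


def acl_command(bootstrap: str, principal: str, operations: list[str],
                topic: str, admin_config: str = "admin.properties") -> list[str]:
    """Build (but do not run) a kafka-acls.sh call granting topic access."""
    if not operations:
        raise ValueError("at least one operation is required")
    argv = [
        "kafka-acls.sh",
        "--bootstrap-server", bootstrap,
        "--command-config", admin_config,  # hypothetical admin client config
        "--add",
        "--allow-principal", f"User:{principal}",
    ]
    # The tool accepts one --operation flag per permitted operation.
    for op in operations:
        argv += ["--operation", op]
    argv += ["--topic", topic]
    return argv


# Hypothetical tenant principal and topic, for illustration only.
print(" ".join(acl_command("broker:9093", "tenant-a-app",
                           ["Read", "Describe"], "orders")))
```

Nothing is executed here. Returning a list rather than a string makes it easy to review the command first and then hand it to `subprocess.run` if it looks right.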
See the Kafka authorization tutorial for a more detailed analysis and example demonstrations.
Next up: how to ensure each tenant uses only their fair share of the shared Kafka cluster.
Kafka Quotas
Administrators can restrict how many resources a user or application can utilize in a Kafka cluster by using the quotas feature. Kafka quotas are crucial in multi-tenant Kafka clusters because they ensure equitable resource distribution across all tenants and help reduce resource contention.
Without effective resource allocation and management, a single tenant can consume a disproportionate share of a multi-tenant Kafka cluster and degrade performance for every other tenant. Kafka quotas address this by letting administrators cap the amount of resources each tenant may use.
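Kafka offers several different categories of quotas, such as producer quotas and consumer quotas, which cap the bytes per second a client may produce or fetch, and request quotas, which cap the share of broker request-handling time a client may use.
Administrators may define quotas at the user or application (client ID) level to distribute resources according to the demands of each tenant. Quotas can also be changed dynamically, without restarting brokers, enabling administrators to adapt to shifting resource needs.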
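Here is a small Python sketch of how those per-tenant limits might be expressed. It builds the argument list for Kafka's `kafka-configs.sh` tool, using the quota keys `producer_byte_rate`, `consumer_byte_rate`, and `request_percentage`. The helper name `quota_command`, the broker address, the tenant user, and the numbers are all assumptions chosen for illustration, not recommended values.

```python
from __future__ import annotations

import shlex
from typing import Optional

# Quota keys covered by this sketch (byte rates are in bytes per second).
QUOTA_KEYS = {"producer_byte_rate", "consumer_byte_rate", "request_percentage"}


def quota_command(bootstrap: str, user: str, quotas: dict[str, int],
                  client_id: Optional[str] = None) -> list[str]:
    """Build (but do not run) a kafka-configs.sh call that sets quotas."""
    unknown = set(quotas) - QUOTA_KEYS
    if unknown or not quotas:
        raise ValueError(f"unsupported or missing quota keys: {sorted(unknown)}")
    # Sort keys so the generated command is stable and easy to diff.
    add_config = ",".join(f"{k}={v}" for k, v in sorted(quotas.items()))
    argv = [
        "kafka-configs.sh", "--bootstrap-server", bootstrap,
        "--alter", "--add-config", add_config,
        "--entity-type", "users", "--entity-name", user,
    ]
    # Optionally narrow the quota to one client ID used by that user.
    if client_id is not None:
        argv += ["--entity-type", "clients", "--entity-name", client_id]
    return argv


# Hypothetical tenant and limits: 1 MiB/s produce, 2 MiB/s fetch.
print(shlex.join(quota_command(
    "broker:9093", "tenant-a",
    {"producer_byte_rate": 1_048_576, "consumer_byte_rate": 2_097_152},
)))
```

Keeping the limits in a plain dictionary per tenant also makes it simple to store them in version control and reapply them when needs change.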
In summary, quotas keep any single tenant from monopolizing a shared Kafka cluster, which keeps performance predictable for everyone else.
For a much deeper dive, see the previous Kafka Quotas What, Why and How post.
Kafka Topic Namespaces
Kafka topic namespaces permit administrators to organize topics into logical categories, or namespaces. Kafka has no built-in namespace object, so this is typically done with a topic naming convention such as a per-tenant prefix. In multi-tenant Kafka clusters, implementing topic namespacing provides a way to separate topics by tenant or use case, making it much simpler to manage and secure the Kafka-processed data.
By organizing topics into namespaces, administrators can enforce access control policies at the level of the namespace, as opposed to the level of the topic. This simplifies the management of access control policies for a large number of topics and ensures that tenants only have access to the data they are authorized to access.
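A prefix-based naming convention is one common way to get namespaces. The Python sketch below assumes a hypothetical `<tenant>.<name>` convention and shows how a namespace-level check might work. The function names, the separator, and the allowed characters are all assumptions for this example. On a real cluster the enforcement would come from the authorizer, for instance ACLs created with `--resource-pattern-type prefixed`, not from client-side code like this.

```python
import re
from typing import Optional

# Assumed convention: lowercase letters, digits, and hyphens per segment.
_SEGMENT = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def topic_name(tenant: str, name: str) -> str:
    """Build a namespaced topic name of the form '<tenant>.<name>'."""
    for part in (tenant, name):
        if not _SEGMENT.match(part):
            raise ValueError(f"invalid name segment: {part!r}")
    return f"{tenant}.{name}"


def tenant_of(topic: str) -> Optional[str]:
    """Return the tenant prefix of a topic, or None if it has no namespace."""
    tenant, sep, rest = topic.partition(".")
    return tenant if sep and rest else None


def may_access(tenant: str, topic: str) -> bool:
    """Namespace-level rule: a tenant may touch only topics under its prefix."""
    # Including the separator stops 'tenant-a' from matching 'tenant-ab.*'.
    return topic.startswith(tenant + ".")


# Hypothetical tenants and topics, for illustration only.
print(topic_name("tenant-a", "orders"))
print(may_access("tenant-a", "tenant-b.clicks"))
```

One rule per tenant prefix replaces one rule per topic, which is where the management savings come from.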
In short, Kafka topic namespacing is essential in multi-tenant Kafka clusters because it provides a way to organize and manage topics by tenant or use case. By grouping topics into namespaces, administrators can enforce access control policies, improve performance, and administer retention policies more efficiently.
More depth and examples are covered in the Kafka Namespaces Tutorial.
Kafka Tenant Monitoring and Possible Charge Backs [Optional]
Monitoring a Kafka cluster is a requirement. Hard stop, right? Not much more to describe here. If you want to ensure smooth operation of your Kafka cluster, you need to implement monitoring and, preferably, monitoring with alerting. This is true whether your cluster is single or multi-tenant.
The previously mentioned Kafka topic namespacing can help monitor tenant usage in a multi-tenant Kafka cluster. By organizing topics into logical namespaces, administrators can more easily track resource utilization and performance metrics for each tenant.
But also, and here’s what I consider the optional bit: if you implement all five of these suggestions, you are in the best position to implement tenant-based charge backs, that is, charging back or allocating service costs by tenant.
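To show how namespacing and monitoring might feed a charge back, here is a minimal Python sketch. It assumes you already collect a byte count per topic from your monitoring system and that topics follow a hypothetical `<tenant>.<name>` convention. The function names, the sample numbers, and the idea of splitting cost purely by bytes are assumptions for illustration. A real model might also weigh storage, partitions, or request load.

```python
from __future__ import annotations

from collections import defaultdict


def bytes_by_tenant(topic_bytes: dict[str, int]) -> dict[str, int]:
    """Sum per-topic byte counts into per-tenant totals using the prefix."""
    totals: dict[str, int] = defaultdict(int)
    for topic, count in topic_bytes.items():
        tenant, sep, _ = topic.partition(".")
        # Topics without a tenant prefix are tracked separately.
        totals[tenant if sep else "unattributed"] += count
    return dict(totals)


def charge_back(topic_bytes: dict[str, int], total_cost: float) -> dict[str, float]:
    """Split a total cluster cost across tenants in proportion to bytes."""
    totals = bytes_by_tenant(topic_bytes)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {tenant: 0.0 for tenant in totals}
    return {tenant: round(total_cost * count / grand_total, 2)
            for tenant, count in totals.items()}


# Made-up monthly byte counts per topic, for illustration only.
usage = {"tenant-a.orders": 600, "tenant-a.payments": 200, "tenant-b.clicks": 200}
print(charge_back(usage, total_cost=1000.0))
```

The "unattributed" bucket is worth watching on its own, since it shows how much traffic falls outside the naming convention.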
I am hearing more and more lately about how to implement charge backs in multi-tenant Kafka clusters, but I also hear about solving this another way: ditch the multi-tenant approach and segment tenants into dedicated clusters. As usual, there are pros and cons to this approach, but it is a subject best left to a different post.
Kafka Multi Tenant Summary
In summary, implementing a multi-tenant Kafka architecture requires access control mechanisms, topic namespaces, efficient resource allocation and management through quotas, and monitoring. By considering these requirements, organizations can successfully implement a multi-tenant Kafka cluster.
Kafka Multi Tenant FAQs
What are the typical challenges or concerns when implementing multi-tenancy in Kafka?
The biggest issue I see with multi-tenant Kafka clusters is known as the “noisy neighbor” challenge. A noisy neighbor is a tenant that causes issues for all tenants of the cluster by abusing shared resources. For example, a tenant application may be miscoded to open and close connections rapidly. This can place a burden on all brokers in the cluster, which leads to degraded performance for every tenant. The requirements described above address noisy neighbor concerns.
What are alternatives to multi-tenant Kafka clusters?
With the advent of managed Kafka providers, as well as tooling that replicates all or portions of Kafka topics from one Kafka cluster to another, many companies are choosing to implement more, isolated Kafka clusters focused on particular workloads. This has numerous benefits, including a smaller “blast radius” if there are problems on a Kafka cluster. See the previous answer, which describes the noisy neighbor issue.