Kafka Quotas Simplified (Why and How)


Kafka quotas provide the ability to govern and control the broker resources used by Kafka clients. More broadly, with Kafka quotas, you can limit how much of the entire Kafka cluster’s resources any given Kafka client can use.

Kafka quotas are used for primarily two reasons: 1) to prevent a misbehaving client from degrading the performance of the Kafka cluster itself, whether unintentionally or intentionally, and 2) to prevent that same misbehaving client from degrading the experience of the other clients sharing the cluster. We’ll cover both in more depth.

As we will explore in this Kafka Quotas tutorial, there are several types of quotas including: network bandwidth, request rate, and in more recent versions of Apache Kafka, connection rate.

Kafka quotas are most often used in multi-tenant Kafka clusters, but as previously mentioned, they can also be used to protect clients from themselves.

In this post, we will deep dive into the why and how of Kafka quotas and conclude with an example demonstration.

Why Kafka Quotas?

Kafka producers and consumers can send and receive large volumes of data and make a lot of requests at once. This can monopolize broker resources, overload the network, and even cause a denial of service (DoS) for other clients and the brokers themselves.

There are a few reasons why Kafka quotas are important:

Preventing resource contention: Kafka is a distributed system, so multiple clients or users may be using the same cluster, at the same time. Without quotas, it’s possible for one client to use too many resources, which could cause problems and affect how well other clients can do their jobs. This is commonly referred to as a noisy neighbor.

Fairness: Quotas can help make sure that all clients or users get an equal amount of resources, no matter how big or busy they are. This can help make sure that smaller clients can get the resources they need and that bigger clients can’t use up all the resources.

Capacity planning: Quotas can be used for establishing service level agreements (SLAs) and capacity planning by limiting how much a client or user can use of a resource. This can help ensure the cluster stays within its operational limits and prevent outages or a drop in performance caused by running out of resources.

Overall, Kafka quotas help a Kafka cluster run smoothly and efficiently by governing how resources are shared fairly, which in turn supports capacity planning.

Noisy neighbors and Kafka Quotas

Up until this point, we’ve focused on the Kafka perspective only, but this challenge, and addressing it through quotas, is definitely not unique to Kafka. The need for limiting or throttling client applications is often referred to as the noisy neighbor problem in software.

What is a noisy neighbor?

The “noisy neighbor” challenge is a term used in software architecture to describe a situation in which a single application or user uses an unfair amount of shared resources, such as CPU, memory, or network bandwidth.

In this scenario, the performance and availability of the other applications and users sharing those resources suffer.

In a shared environment such as a multi-tenant Kafka cluster, one client application using too many resources can degrade the performance of other applications or even cause them to stop working altogether. This is commonly referred to as the “noisy neighbor” problem.

Ways to solve the noisy neighbor problem

To deal with the “noisy neighbor” problem, techniques such as resource isolation, resource allocation, and resource quotas are implemented to ensure each application uses only its designated share of the shared resource.

This can protect all applications from being affected by one or multiple noisy neighbors.

Types of Kafka Quotas

There are three types of Kafka quotas:

  • Network bandwidth quotas defined by byte-rate thresholds (since 0.9)
  • Request rate quotas defined by CPU utilization thresholds as a percentage of network and I/O threads (since 0.11)
  • Connection rate quotas defined by per-IP connection rate (since 2.8)

Network bandwidth quotas for produce and/or consume are created as the byte rate threshold for each group of clients.

Network Bandwidth Quota Examples

Let’s take a look at an example of creating a Kafka Bandwidth quota.

In versions before Kafka 2.6:

kafka-configs.sh --zookeeper $ZOO --alter --add-config 'producer_byte_rate=1024' --entity-type users --entity-name <authenticated-principal>

And in Kafka 2.6 or greater, as you may expect, we can use the bootstrap-server option instead of the zookeeper option:

kafka-configs.sh --bootstrap-server $BOOT --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=1024' --entity-type users --entity-name <authenticated-principal> --command-config /tmp/client.properties

This last example is a bit more complicated, as it also shows how to use kafka-configs.sh when you need to authenticate to run the command; i.e. --command-config. See the previous Kafka Authentication tutorial if interested in learning more.

Request rate quotas are defined as the percentage of time a client can utilize on both the request handler I/O threads and the network threads of each broker within a time window. What this “window” is will be covered later in the tutorial.

Here, a quota of n% represents n% of one thread. A request rate quota is out of a total capacity of ((num.io.threads + num.network.threads) * 100)% where num.io.threads and num.network.threads are configuration values on the Kafka broker.
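To make the arithmetic above concrete, here is a small sketch. It assumes the Kafka default values of 8 for num.io.threads and 3 for num.network.threads; the function name is made up for illustration, so substitute the values from your own broker configuration.

```shell
#!/bin/sh
# Sketch: how a request_percentage value relates to broker thread capacity.
# 8 and 3 are the Kafka defaults for num.io.threads and num.network.threads;
# replace them with the values from your own server.properties.

# Hypothetical helper: total request quota capacity of one broker, in percent.
total_request_capacity() {
  io_threads="$1"
  network_threads="$2"
  echo $(( (io_threads + network_threads) * 100 ))
}

total=$(total_request_capacity 8 3)
echo "Total capacity per broker: ${total}%"    # prints 1100%

# A quota of 150 allows up to one and a half threads' worth of time
# out of the 11 threads available on each broker.
quota=150
echo "A quota of ${quota}% is about $(( quota * 100 / total ))% of that capacity (rounded down)"
```

With the defaults, a request_percentage of 150 works out to roughly 13% of the thread time on each broker.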

Request Rate Quota Example

Let’s take a look at another example where we set the request rate quota.

kafka-configs.sh --bootstrap-server $BOOT --alter --add-config 'request_percentage=150' --entity-type users --entity-name big-time-tv-show-host --command-config /tmp/client.properties
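For completeness, the third quota type from the list above, connection rate, can be set in a similar way. The following is a minimal sketch, assuming Kafka 2.8 or later, kafka-configs.sh on the PATH, and BOOT set as in the earlier examples. The function name, IP address, and rate are illustrative values and not from the demo.

```shell
#!/bin/sh
# Sketch: per-IP connection rate quota (Kafka 2.8+, KIP-612).
# Assumes kafka-configs.sh is on PATH and BOOT holds the bootstrap server.
# The helper name, IP address, and rate are made-up examples.

set_ip_connection_quota() {
  ip="$1"      # client IP address to limit
  rate="$2"    # maximum new connections per second accepted from that IP
  shift 2      # any remaining arguments are passed through unchanged
  kafka-configs.sh --bootstrap-server "${BOOT:-localhost:9092}" --alter \
    --add-config "connection_creation_rate=${rate}" \
    --entity-type ips --entity-name "$ip" "$@"
}

# Usage (requires a running cluster):
#   set_ip_connection_quota 203.0.113.7 10
#   set_ip_connection_quota 203.0.113.7 10 --command-config /tmp/client.properties
```

Unlike the bandwidth and request rate quotas, this quota is keyed by IP address (entity type ips) and not by user or client-id.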

How to set Kafka Quotas

As we’ve seen in the previous examples, Kafka quotas may be configured using the kafka-configs.sh CLI tool. See the previous Kafka config with kafka-configs.sh tutorial if you want a closer look at kafka-configs.sh.

Quota configuration may be defined for user and client-id groups.

The order of precedence for quota configuration is:

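  1. /config/users/<user>/clients/<client-id>
  2. /config/users/<user>/clients/<default>
  3. /config/users/<user>
  4. /config/users/<default>/clients/<client-id>
  5. /config/users/<default>/clients/<default>
  6. /config/users/<default>
  7. /config/clients/<client-id>
  8. /config/clients/<default>

The most specific match wins: a quota for a (user, client-id) pair overrides a quota for the user alone, which in turn overrides a quota for the client-id alone.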

Kafka clients can be identified by 2 metadata values:

  • client-id: this is an optional configuration property set in the client; client.id
  • user: the authenticated principal

Both metadata values can be used when identifying a Kafka client application. It’s easier to manage using an authenticated principal, because it is likely required to connect to the Kafka cluster in the first place.

Using client-id is more difficult because there is no straightforward way to enforce it in clients.
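To illustrate how the user and client-id identities map onto kafka-configs.sh options, here is a sketch that applies the same byte-rate quota at three different levels. It assumes kafka-configs.sh is on the PATH and BOOT is set; the function names, user1, clientA, and the quota values are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch: the same byte-rate quota applied at three different levels.
# Assumes kafka-configs.sh is on PATH and BOOT is set.
# Function names, user1, and clientA are placeholders.

QUOTA='producer_byte_rate=1024,consumer_byte_rate=2048'

# Helper so each function only spells out the entity it targets.
alter_quota() {
  kafka-configs.sh --bootstrap-server "${BOOT:-localhost:9092}" --alter \
    --add-config "$QUOTA" "$@"
}

# Most specific: one client-id used by one authenticated user.
quota_for_user_and_client() {
  alter_quota --entity-type users --entity-name "$1" \
              --entity-type clients --entity-name "$2"
}

# Every connection authenticated as the given user.
quota_for_user() {
  alter_quota --entity-type users --entity-name "$1"
}

# Fallback for any user that has no more specific quota.
quota_for_default_user() {
  alter_quota --entity-type users --entity-default
}

# Usage (requires a running cluster):
#   quota_for_user_and_client user1 clientA
#   quota_for_user user1
#   quota_for_default_user
```

A default user quota is a handy safety net in a multi-tenant cluster, since it applies to any principal you have not configured explicitly.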

I will use an authenticated principal in the Kafka quota demo below, but first let’s describe how Kafka quotas are enforced and what the clients will experience.

How are Kafka Quotas Enforced?

When a broker notices a quota violation, it calculates an estimate of the amount of delay required to bring the infringing client back under its quota.

At that point, the broker will do two things: mute the socket channel and respond back to the client with a delay request.

Depending on the client, the client may either respect this delay or ignore it.

In either case, when quotas are reached the client is throttled, but does not fail. Whether or not the client will fail is a common concern/ask in my experience.

A demonstration of Kafka throttling is shown below.

How do Kafka clients address being throttled by a Kafka quota?

Kafka clients that respect the delay request sent from the broker will not send any more requests to the broker until the delay has passed.

But, just in case the client doesn’t listen, the broker also mutes the channel until the delay is over.

This means even older Kafka clients, which do not adhere to the delay request, are still effectively throttled anyhow.

How? Because the broker does not read requests from a muted channel, the client’s requests wait on the server side even when the client itself does not back off.

How are Quotas Calculated and Measured?

To identify quota violations, byte-rate and thread utilization are measured throughout a number of brief intervals, such as 30 windows of one second each.
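As a rough illustration of how the delay estimate described earlier relates to these windows, the sketch below uses a simplified model: if O is the observed rate, T is the quota, and W is the length of the measurement window, the delay is about (O - T) / T * W. This is my understanding of the broker's calculation and should be treated as an approximation. The function name and all numbers are made up. On the broker, the number and size of the windows are controlled by the quota.window.num and quota.window.size.seconds settings.

```shell
#!/bin/sh
# Sketch: estimate the throttle delay for a client that exceeds its quota.
# Simplified model (an assumption): delay = (observed - quota) / quota * window
# All numbers below are invented for illustration.

# Hypothetical helper: prints the estimated delay in milliseconds.
throttle_delay_ms() {
  observed="$1"     # observed rate over the window, bytes/sec
  quota="$2"        # configured quota, bytes/sec
  window_ms="$3"    # total length of the measurement window, milliseconds
  if [ "$observed" -le "$quota" ]; then
    echo 0          # at or under quota: no delay needed
  else
    echo $(( (observed - quota) * window_ms / quota ))
  fi
}

# 30 windows of one second each, as in the example above.
window_ms=$(( 30 * 1 * 1000 ))

# A client measured at 6 MB/sec against a 5 MB/sec quota.
throttle_delay_ms 6000000 5000000 "$window_ms"    # prints 6000
```

The check is simple: 6 MB/sec sustained for 30 seconds, averaged over 30 + 6 seconds, comes out to the 5 MB/sec quota.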

Kafka Quotas Demo

Let’s run through a few examples of Kafka quotas. All the files used in the demo are available from the GitHub repo linked in the resources section at the end of this post.

This environment setup is very similar to what was used in the previous Kafka authorization examples.

Kafka Quotas Demo Requirements

You can adjust the demo below as necessary for your environment. For the demo I’m running, these are required:

  • Apache Kafka downloaded and extracted, so you can use the CLI scripts in the bin/ directory
  • The kafka-examples repo cloned and a terminal open in the quotas/ directory
  • Docker running and docker-compose available

Kafka Quotas Demo Steps

  1. Start the containers; i.e. docker-compose -f kafka-quotas-example.yml up -d
  2. Set the KAFKA_HOME and BOOT environment variables so the next steps will work. For example, export KAFKA_HOME=~/dev/kafka_2.13-2.8.1; export BOOT=localhost:9092
  3. Run effects.sh to show the performance characteristics of principal alice before implementing a quota: ./effects.sh ./alice-client.properties. (Note: you probably need to make it executable first; i.e. chmod 755 effects.sh)
  4. Set produce and consume quota for alice: $KAFKA_HOME/bin/kafka-configs.sh --bootstrap-server $BOOT --alter --add-config 'producer_byte_rate=5000000,consumer_byte_rate=5000000' --entity-type users --entity-name alice --command-config ./admin-client.properties
  5. Re-run effects.sh to show alice being throttled by the quota.
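If you want to confirm the quota from step 4 was stored, or remove it once you are done, something like the following sketch should work. It assumes KAFKA_HOME and BOOT are exported as in step 2 and reuses ./admin-client.properties from the demo. The function names are my own and are not part of the demo repo.

```shell
#!/bin/sh
# Sketch: inspect and later remove the quota set for alice in step 4.
# Assumes KAFKA_HOME and BOOT are exported as in step 2 and that
# ./admin-client.properties is the admin config used in the demo.
# The function names are hypothetical helpers.

kafka_configs() {
  "$KAFKA_HOME/bin/kafka-configs.sh" --bootstrap-server "$BOOT" \
    --command-config ./admin-client.properties "$@"
}

# Show the quota overrides currently stored for a user.
describe_user_quota() {
  kafka_configs --describe --entity-type users --entity-name "$1"
}

# Remove the byte-rate quotas so the user is no longer throttled.
delete_user_byte_rate_quota() {
  kafka_configs --alter \
    --delete-config 'producer_byte_rate,consumer_byte_rate' \
    --entity-type users --entity-name "$1"
}

# Usage (requires the demo containers to be running):
#   describe_user_quota alice
#   delete_user_byte_rate_quota alice
```

After deleting the quota, re-running effects.sh should show throughput return to roughly the level seen before the quota was applied.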

Kafka Quota Demo Video

Kafka Quota Demo Screenshots

In case you don’t feel like watching the previous video, here are the key screenshots.

Here’s a screenshot before the produce quota, which shows roughly 20-22 MB/sec:

[Screenshot: before Kafka quota]

And here’s one after the quota is in effect, which shows less than 6 MB/sec:

[Screenshot: after Kafka quota]

Further Kafka Quota Resources

The Kafka documentation has a section on quotas:

https://kafka.apache.org/documentation/#design_quotas

But, at the time of this writing, the Kafka documentation doesn’t include newer quotas such as the connection creation rate limit introduced in KIP-612:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers

All source used in the demo: https://github.com/supergloo/kafka-examples/tree/master/quotas

Monitoring Kafka Quotas

I don’t show it in this tutorial, but let me know if you’d like to see something on monitoring quotas in JMX. I’ve been meaning to write a Kafka monitoring tutorial anyhow.

Hope this helps. Let me know if you have any questions, concerns, or suggestions for improvement.

About Todd M

Todd has held multiple software roles over his 20 year career. For the last 5 years, he has focused on helping organizations move from batch to data streaming. In addition to the free tutorials, he provides consulting, coaching for Data Engineers, Data Scientists, and Data Architects. Feel free to reach out directly or to connect on LinkedIn
