Apache Kafka and Amazon Kinesis – How do they compare?

Apache Kafka vs. Amazon Kinesis

Like many of the offerings from Amazon Web Services, Amazon Kinesis is modeled after an existing open source system. In this case, that system is Apache Kafka.

Kinesis is a fully managed service with a reputation for being fast, reliable, and easy to operate. As with Kafka, clients are available for many languages, including Java, Scala, Ruby, and JavaScript (Node.js).

Amazon Kinesis has built-in cross-replication, while Kafka requires you to configure it yourself, for example with Kafka's MirrorMaker tool. Cross-replication is the idea of syncing data across logical or physical data centers. It is not mandatory, so set it up only if you need it.

Engineers sold on the value proposition of Kafka and Software-as-a-Service, or more specifically Platform-as-a-Service, have options besides Kinesis and Amazon Web Services. Keep an eye on http://confluent.io.

When to use Kafka or Kinesis?

Kafka and Kinesis are often chosen as an integration system in enterprise environments, much like traditional message brokers such as ActiveMQ or RabbitMQ. Integration between systems is assisted by Kafka clients in a variety of languages, including Java, Scala, Ruby, Python, Go, Rust, and Node.js.

Another common use case is website activity tracking, where the events are either processed in real time or loaded into Hadoop or analytic data warehousing systems for offline processing and reporting.

An interesting aspect of Kafka and Kinesis lately is their use in stream processing. More and more applications and enterprises are building architectures that include processing pipelines consisting of multiple stages. For example, in stage 1 of a multi-stage design, raw input data is consumed from Kafka topics. In stage 2, that data is aggregated, enriched, or otherwise transformed. Then, in stage 3, the results are published to new topics for further consumption or follow-up processing in a later stage.
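To make the three stages concrete, here is a minimal sketch in Python. It does not talk to a real broker: the `topics` dictionary is an in-memory stand-in for Kafka topics, and the topic names (`raw-events`, `page-view-counts`), the event shape, and the helper functions are all hypothetical names chosen for illustration. In a real pipeline, `consume` and `publish` would be replaced by calls to a Kafka or Kinesis client.

```python
from collections import defaultdict

# In-memory stand-in for a broker: topic name -> ordered list of records.
# This is an assumption for illustration; a real pipeline would read from
# and write to Kafka topics (or Kinesis streams) through a client library.
topics = defaultdict(list)


def publish(topic, record):
    """Append a record to the end of a topic."""
    topics[topic].append(record)


def consume(topic):
    """Return every record currently in a topic, in publish order."""
    return list(topics[topic])


def aggregate_views(events):
    """Stage 2: transform raw events into a page -> view count mapping."""
    counts = defaultdict(int)
    for event in events:
        counts[event["page"]] += 1
    return dict(counts)


def run_pipeline(raw_topic="raw-events", out_topic="page-view-counts"):
    """Run the three stages once over whatever is in the raw topic."""
    raw_events = consume(raw_topic)        # stage 1: consume raw input
    counts = aggregate_views(raw_events)   # stage 2: aggregate
    for page, views in sorted(counts.items()):
        # stage 3: publish results to a new topic for later stages
        publish(out_topic, {"page": page, "views": views})
    return consume(out_topic)
```

Publishing a few page-view events to `raw-events` and then calling `run_pipeline()` leaves one aggregated record per page in `page-view-counts`, which a later stage could consume in turn. Keeping the stage 2 transformation as a plain function, separate from the consume and publish steps, means the same logic can be tested without a broker and reused with whichever client the pipeline ends up using.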

Conclusion

Keep an eye on supergloo.com for more articles and tutorials on Kafka, Kinesis, and other stacks used in stream-based data processing and pipelines.

Featured image credit https://flic.kr/p/7XWaia
