Scaling Apache Kafka Clusters on Confluent Cloud ft. Ajit Yagaty and Aashish Kohli

How much can Apache Kafka® scale horizontally, and how can you automatically balance, or rebalance data to ensure optimal performance?You may require the flexibility to scale or shrink your Kafka clusters based on demand. With experience engineering cluster elasticity and capacity management features for cloud-native Kafka, Ajit Yagaty (Confluent Cloud Control Plane Engineering) and Aashish Kohli (Confluent Cloud Product Management) join Kris Jenkins in this episode to explain how the architecture of Confluent Cloud supports elasticity. Kris suggests that optimal elasticity is like water from a faucet—you should be able to quickly obtain as many resources as you need, but at the same time you don't want the slightest amount to go wasted. But how do you specify the amount of capacity by which to adjust, and how do you know when it's necessary?Aashish begins by explaining how elasticity on Confluent Cloud has come a long way since the early days of scaling via support tickets. It's now self-serve and can be accomplished by dialing up or down a desired number of CKUs, or Confluent Units of Kafka. A CKU corresponds to a specific amount of Kafka resources and has been made to be consistent across all three major clouds. You can specify the number of CKUs you need via API, CLI or Confluent Cloud UI. Ajit explains in detail how, once your request has been made, cluster resizing is a two-step process. First, capacity is added, and then your data is rebalanced. Rebalancing data on the cluster is critical to ensuring that optimal performance is derived from the available capacity. The amount of time it takes to resize a Kafka cluster depends on the number of CKUs being added or removed, as well as the amount of data to be rebalanced. Of course, to request more or fewer CKUs in the first place, you have to know when it's necessary for your Kafka cluster(s). This can be challenging as clusters emit a large variety of metrics. Fortunately, there is a single composite metric that you can monitor to help you decide, as Ajit imparts on the episode.  Other topics covered by the trio include an in-depth explanation of how Confluent Cloud achieves elasticity under the hood (separate control and data planes, along with some Kafka dogfooding), future plans for autoscaling elasticity, scenarios where elasticity is critical, and much more.EPISODE LINKSHow to Elastically Scale Apache Kafka Clusters on Confluent CloudShrink a Dedicated Kafka Cluster in Confluent CloudElastic Apache Kafka Clusters in Confluent CloudWatch the video version of this podcastKris Jenkins’ TwitterStreaming Audio Playlist Join the Confluent CommunityLearn more with Kafka tutorials, resources, and guides at Confluent DeveloperLive demo: Intro to Event-Driven Microservices with ConfluentUse PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)   

Om Podcasten

Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®️, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.