Tuning Kafka Pipelines

Learn from real-world examples of performance tuning for truly global Kafka pipelines.

About This Session

Kafka is a high-throughput, fault-tolerant, scalable platform for building high-volume, near-real-time data pipelines. This presentation is about tuning Kafka pipelines for high performance.

The presenter will begin by introducing the basic concepts of Kafka, including the producer, the consumer, and the broker. Kafka pipelines will be discussed next, along with the role of MirrorMaker in creating global data pipelines that span multiple datacenters. The presenter will then discuss select configuration parameters and deployment topologies essential to achieving high throughput and low latency across the pipeline.

In the second part, the presenter will share anecdotes based on real-world experience running Kafka at LinkedIn, specifically lessons learned in troubleshooting and optimizing a truly global data pipeline that replicates 100 GB of data in under 25 minutes.

No prior knowledge of Kafka is necessary; however, familiarity with publish-subscribe, big data technologies, and sharding/partitioning will come in handy.

Time: 4:00 PM Saturday

Room: Fireside A

The Speaker(s)

Sumant Tambe

Senior Software Engineer, LinkedIn

Kafka-Dev, Microsoft MVP, Open-Source Contributor, Blogger, Author, Father, and Gamer
