

Showing posts from June, 2020

Kafka: Introduction to core concepts

Apache Kafka was originally developed at LinkedIn and later donated to the Apache Software Foundation. It is a distributed streaming platform that can handle a high volume of data.

Pull or Push?

I initially misunderstood Kafka as a push-based messaging system. However, Kafka uses the traditional pull approach on the consumer side: data is pushed to the broker by producers and pulled from the broker by consumers.

IMAGE1

Why Kafka?

Kafka is a reliable messaging system that is fast and durable. Its benefits are:

Scalable - Kafka's partition model allows data to be distributed across multiple servers, making it highly scalable.
Durable - Kafka's data is written to disk, making it highly durable against server failures.
Multiple producers - Kafka can handle multiple producers that publish to the same topic.
Multiple consumers - Kafka is designed so that multiple consumers can read messages without interfering with each other.
High performance - Together, these features allow high-performance distributed messaging.
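To make the push and pull flow more concrete, here is a minimal in-memory sketch in Python. It is not the Kafka client API and does not talk to a real broker; `ToyBroker`, `ToyConsumer`, `push`, `pull` and `poll` are hypothetical names I made up for illustration, and the CRC32-based partition choice is only a stand-in for a real partitioner. It only shows the idea that producers push into partitioned logs while each consumer pulls at its own pace by tracking its own offsets.

```python
import zlib
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: one append-only log per partition."""

    def __init__(self, num_partitions=3):
        self.num_partitions = num_partitions
        # logs[topic][partition] is a list of message values, in arrival order
        self.logs = defaultdict(lambda: [[] for _ in range(num_partitions)])

    def push(self, topic, key, value):
        """Producer side: append a message; the same key always maps to the same partition."""
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self.logs[topic][partition]
        log.append(value)
        return partition, len(log) - 1  # (partition, offset of the new message)

    def pull(self, topic, partition, offset, max_messages=10):
        """Consumer side: read messages starting at an offset; nothing is removed."""
        log = self.logs[topic][partition]
        return log[offset:offset + max_messages]


class ToyConsumer:
    """Hypothetical consumer that remembers how far it has read in each partition."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offsets = defaultdict(int)  # partition -> next offset to read

    def poll(self):
        received = []
        for partition in range(self.broker.num_partitions):
            batch = self.broker.pull(self.topic, partition, self.offsets[partition])
            self.offsets[partition] += len(batch)
            received.extend(batch)
        return received


broker = ToyBroker(num_partitions=2)
for i in range(4):
    broker.push("orders", key=f"user-{i}", value=f"order-{i}")

consumer_a = ToyConsumer(broker, "orders")
consumer_b = ToyConsumer(broker, "orders")

first_a = consumer_a.poll()   # consumer A pulls everything published so far
second_a = consumer_a.poll()  # nothing new since A's last poll
first_b = consumer_b.poll()   # consumer B still sees every message
```

Because the broker never deletes a message when it is read, and each consumer keeps its own offsets, `consumer_b` is unaffected by what `consumer_a` has already pulled. That is the sense in which multiple consumers do not interfere with each other in this toy model.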

Getting started with Kafka

This is a quick guide to setting up a Kafka environment for local development and for learning Kafka. I used a Kafka Udemy course and the documentation for this. Setting up and configuring the Kafka ecosystem can be a tedious task. However, with Docker and Landoop (now Lenses), it is as easy as running a docker command.

Note: You need Docker installed to follow this post.

Get landoop docker image

Once you have set up Docker in your environment, you can pull the landoop docker image.

$ docker pull landoop/fast-data-dev

Start the Kafka broker

I'm going to run the docker container in interactive mode.

$ docker container run --rm --name my-kafka-broker -it \
    -p 2181:2181 -p 3030:3030 -p 8081:8081 \
    -p 8082:8082 -p 8083:8083 -p 9092:9092 \
    -e ADV_HOST= \
    landoop/fast-data-dev

This will bring up all the tools necessary to work with Kafka. After about 1 minute you can access landoop's UI console from . If you scroll down, you can see r