Tame Kafka with Clojure

@andreacrotti

Created: 2018-04-02 Mon 15:17

TOC

  • What is Kafka
  • Why Kafka
  • Kafka and Clojure
  • Demo time
  • Real world scenario

Kafka

  • distributed append only immutable log
  • extremely scalable (performance effectively constant with respect to data size)
  • >250k SLOC (Java/Scala)
  • ZooKeeper for cluster coordination
  • open sourced by LinkedIn in 2011
  • reached 1.0 only recently (November 2017)

Components

  • Producer API
  • Consumer API
  • Streams API
  • Connector API

Log anatomy

This is an example of what partitions look like: each partition is simply an ordered, immutable sequence of records that is continually appended to.

Each record is assigned an offset, a sequential id that uniquely identifies it within its partition.

An important thing to keep in mind is that Kafka keeps every record, whether or not it has been consumed, but only until a configurable retention period expires. The retention period is the length of time a record remains available to be consumed; after that it is discarded. Kafka's performance is effectively constant with respect to the amount of data currently stored, so retaining data for a long time is not a problem.
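
The offset and retention behaviour described above can be sketched with a toy in-memory model. This is only an illustration, not Kafka's implementation or client API: the Partition class, its method names, and the retention_seconds parameter are all hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class Partition:
    """Toy stand-in for one partition (hypothetical, not the Kafka client API)."""
    retention_seconds: float
    base_offset: int = 0  # offset of the oldest record still retained
    records: List[Tuple[float, Any]] = field(default_factory=list)  # (timestamp, value)

    def append(self, value, now):
        """Append a record and return its offset; existing records are never modified."""
        offset = self.base_offset + len(self.records)
        self.records.append((now, value))
        return offset

    def read(self, offset):
        """Look up a record by offset using plain list indexing."""
        index = offset - self.base_offset
        if index < 0 or index >= len(self.records):
            raise KeyError(f"offset {offset} is not available")
        return self.records[index][1]

    def enforce_retention(self, now):
        """Drop records older than the retention period, oldest first."""
        expired = 0
        while (expired < len(self.records)
               and now - self.records[expired][0] > self.retention_seconds):
            expired += 1
        del self.records[:expired]
        self.base_offset += expired  # surviving records keep their original offsets
```

For example, with retention_seconds=60, a record appended at time 0 is gone after enforce_retention(now=100), while a record appended at time 50 is still readable at the offset it was originally given. Offsets are never reused or renumbered, which is what lets a consumer remember its position as a single number.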


Log producer

Log consumer

Streaming

Why Kafka

  • Scalability
  • Data integrity
  • Auditing
  • Proper Microservices

Data dichotomy