Talk abstract: One-by-one Is No Fun: Lessons learned writing Kafka ETL jobs
I’ve been writing ETL jobs using Kafka for a couple of years now. In that time, I’ve done just about everything wrong before figuring out what does work. This talk will cover:
-What Kafka is
-What the major frameworks are, and how they steer you towards one-by-one message processing
-Why you shouldn’t do that, including performance measurements for different methods of loading data into a Postgres data warehouse
-How to avoid one-by-one processing
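As a rough illustration of the last point (a sketch, not the code from the talk): the usual way out of one-by-one processing is to group consumed messages into batches and hand each batch to the sink in a single call. Everything named here is hypothetical. `batched`, `CountingSink`, `load_one_by_one` and `load_in_batches` are made up for this example, and the sink only counts calls in place of a real Postgres client. No Kafka library is used; any iterable of messages works.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(messages: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to batch_size messages from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(messages)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


class CountingSink:
    """Hypothetical stand-in for a warehouse client: it only counts calls."""

    def __init__(self) -> None:
        self.calls = 0
        self.rows = 0

    def write(self, rows: List[object]) -> None:
        # A real sink would issue one multi-row INSERT or COPY here (assumption).
        self.calls += 1
        self.rows += len(rows)


def load_one_by_one(messages: Iterable[object], sink: CountingSink) -> None:
    """The pattern to avoid: one sink call per message."""
    for message in messages:
        sink.write([message])


def load_in_batches(
    messages: Iterable[object], sink: CountingSink, batch_size: int = 500
) -> None:
    """One sink call per batch of messages."""
    for batch in batched(messages, batch_size):
        sink.write(batch)
```

Loading 1,000 messages with `load_one_by_one` makes 1,000 sink calls; `load_in_batches` with `batch_size=250` makes 4. How much that matters in practice depends on the sink and the network, which is what the measurements in the talk are about.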
Bio: Dean is a Data Engineer originally from Vancouver, now living in San Francisco. His general fussiness and paranoia make him suited to the hairball that is the data world. He’s especially interested in writing metadata-driven ETL systems. He uses much of his spare time to rock climb, and is planning to sneak away to the Peak District while in the UK, so if you have any good info on the area, track him down!