Data Streaming: Analyze Your Data in Real-Time With Flink

Conference Center ADN, room 14

In this one-day workshop you will learn how to build streaming analytics applications that continuously deliver results on data-intensive streams. You will discover how to configure streaming pipelines, transformations, aggregations and triggers using SQL and Python in a user-friendly development environment built on open-source tools: Apache Flink, Apache Kafka and Getindata OSS projects.

We will also teach you how to incorporate good engineering practices such as version control, testing and monitoring into your applications. We have prepared an environment that covers the analyst's workflow, from designing an application to deploying it to production, and that does not require you to be a software engineer. We will work through typical streaming problems you can encounter on the way to delivering fresh and reliable data, and show how modern tooling can help solve them. All hands-on exercises will be carried out in a public cloud environment (e.g. GCP or AWS), with all tools already installed and remotely accessible.

Agenda

Session #1 - Introduction to Apache Kafka

Session #2 - Introduction to Apache Flink

Key concepts behind stream processing
Building a streaming pipeline with Flink SQL
Hands-on exercises

Session #3 - Timely Stream Processing

Flink’s notions of time, windowing and aggregations
Joining multiple data streams or data sets
Hands-on exercises

Session #4 - Pattern Matching

Matching patterns with the MATCH_RECOGNIZE clause
Hands-on exercises

Session #5 - Productization

Deploying Flink jobs to Production
Hands-on exercises

Session #6 - Flink advanced concepts (theory only)

High-level Python Flink Table API
Low-level Python Flink DataStream API
Temporal Table Functions

BIG DATA TECHNOLOGY
WARSAW SUMMIT
