Real-Time Stream Processing

 

In this one-day workshop you will learn how to process unbounded streams of data in real time using popular open-source frameworks. We focus mostly on Apache Flink and Apache Kafka, two of the most promising open-source stream processing technologies, which are increasingly used in production.

During the course we simulate a real-world end-to-end scenario: processing logs generated by users interacting with a mobile application in real time. The technologies that we use include Kafka and Flink. All exercises will be done either in a local Docker environment or within your IDE.

 

    Target Audience

Data engineers who are interested in leveraging large-scale and distributed tools to process streams of data in real-time.

 

    Requirements

Some experience coding in Java or Scala and basic familiarity with Big Data tools (HDFS, YARN).

 

    Participant’s ROI

  • Concise and practical knowledge of applying stream processing to solve business problems.
  • Hands-on coding experience under the supervision of experienced Flink engineers.
  • Tips about real-world applications and best practices.

 

    Training Materials

All participants will receive training materials as PDF files: slides covering the theory and an exercise manual with a detailed description of every exercise. During the workshop, the exercises can be done either in a local Docker environment or within your IDE.

 

    Time Box

This is a one-day event with breaks between sessions.

 

    Agenda

Session #1 - Introduction to Apache Kafka

Session #2 - Introduction to Apache Flink

  • Key concepts behind stream processing
  • Building a streaming pipeline with Flink DataStream API
  • Hands-on exercises

Session #3 - Timely Stream Processing

  • Flink’s notions of time, windowing and aggregations
  • Hands-on exercises

Session #4 - Connecting to the external world

  • Integration with Apache Kafka
  • Hands-on exercises

Session #5 - Stateful Stream Processing

  • Fault tolerance
  • Advanced time handling
  • Stateful operations
  • Hands-on exercises

Session #6 - Summary and comparison with other stream processing engines

 

Keywords: Kafka, Flink, Real Time Processing, Low Latency Stream Processing

 

    Session leader:

Data Engineer
GetInData
Software developer
GetInData

BIG DATA TECHNOLOGY
WARSAW SUMMIT

ORGANIZER

Evention sp. z o.o.

Rondo ONZ 1 Str,

Warsaw, Poland

www.evention.pl

CONTACT

Weronika Warpas