Agenda 2020

February 27, 2020


We are doing our best to finalise the agenda as soon as possible. We want to make it relevant to every participant, so we will run 5 simultaneous tracks:

8.00 - 9.00

Registration and welcome coffee

9.00 - 9.15

Conference opening

Przemysław Gamdzyk

CEO & Meeting Designer, Evention

Adam Kawa

CEO and Co-founder, GetInData

9.15 - 10.45

Plenary Session

10.45 - 11.15

Coffee break

11.15 – 15.30 Simultaneous sessions

Architecture, Operations and Cloud

This track is dedicated to architects, administrators and experts with DevOps skills who are interested in technologies, techniques and best practices for planning, building, installing, managing, containerising and securing their Big Data infrastructure in enterprise environments – both on-premise and in the cloud.

Data Engineering



This track is the place for engineers to learn about tools, techniques and battle-proven solutions to collect, store and process large amounts of data. It covers topics like data collection and ingestion, ETL, job scheduling, metadata and schema management, distributed processing engines, distributed datastores and more.

Streaming and Real-Time Analytics


This track covers technologies, strategies and valid use-cases for building streaming systems and implementing real-time applications that enable actionable insights and interactions not previously possible with classic batch systems. This includes solutions for data stream ingestion and applying various real-time algorithms and machine learning models to process events coming from IoT sensors, devices, front-end applications and users.

Artificial Intelligence and Data Science

This track includes real-world case studies demonstrating how data & technology are used together to address a wide range of complex problems in the domain of machine learning, artificial intelligence and data science. It also covers topics related to prototyping and operationalizing ML/AI models in production, data visualisation and A/B tests.

Data Strategy and ROI


This track is quite new at the Big Data Technology Warsaw Summit. It is aimed at data and business professionals who are interested in learning how data and analytics can be used to generate growth, added value and a positive financial impact. It will contain presentations about real-world use cases covering useful data-focused solutions, new business models and various data monetization strategies. Since most Big Data projects are difficult to complete on time and on budget, presentations will also explain the technical, cultural and leadership aspects that are key to successful Big Data initiatives at enterprises – including proper preparation and ensuring the longer-term sustainability of a project – avoiding wasted money and achieving a positive return on investment (ROI).

11.15 - 11.45

To be confirmed

11.15 - 11.45

To be confirmed

11.15 - 11.45

Creating an extensible Big Data Platform for advanced analytics - 100s of PetaBytes with Realtime access

Reza Shiftehfar

Engineering Management & Leadership, Uber

11.15 - 11.45

Building Recommendation Platform for ESPN+ and Disney+. Lessons Learned

Keywords: #recommendersystems  #ML #cloud #experimentation

Grzegorz Puchawski

Data Science and Recommendation, Disney Streaming Services

11.15 - 11.45

To be confirmed

11.45 - 11.50

Technical break

11.50 - 12.20

Replication Is Not Enough for 450 PB: Try an Extra DC and a Cold Store

Keywords: #Hadoop #datasecurity #resilience #in-house #storage

Stuart Pook

Senior Site Reliability Engineer, Criteo

11.50 - 12.20

Data Platform at Bolt: challenges of scaling data infrastructure in a hyper growth startup

Keywords:  #aws #datalake #datawarehouse #preprocessing #machinelearning

Łukasz Grądzki

Engineering Manager, Bolt

11.50 - 12.20

Interactive Analytics at Alibaba

Yuan Jiang

Senior Staff Engineer, Alibaba

11.50 - 12.20

Building a Factory for Machine Learning at Spotify

Josh Baer

Product Lead, Machine Learning Platform, Spotify

11.50 - 12.20

The Big Data Bento: Diversified yet Unified

Keywords: #bigdatabento #cloud #unifiedanalyticsplatform #unifieddataanalyticsplatform #spark

Paulo Gutierrez

Solutions Architect, Databricks

12.20 - 12.25

Technical break

12.25 - 12.55

To be confirmed

12.25 - 12.55

To be confirmed

12.25 - 12.55

Adventure in Complex Event Processing at telco


Jakub Błachucki

Big Data Engineer, Orange

Maciej Czyżowicz

Technical Leader for Analytics Stream, Orange

Paweł Pinkos

Big Data Engineer, Orange

12.25 - 12.55

To be confirmed

12.25 - 12.55

To be confirmed

12.55 - 13.50


13.50 - 14.20

To be confirmed

13.50 - 14.20

Presto @ Zalando: A cloud journey for Europe’s leading online retailer

Keywords:  #CloudAnalytics #Presto #DataVirtualization #SQL-on-Hadoop #DWH

Wojciech Biela

Co-founder & Senior Director of Engineering, Starburst

Max Schultze

Data Engineer, Zalando SE

13.50 - 14.20

To be confirmed

13.50 - 14.20

To be confirmed

13.50 - 14.20

Omnichannel Personalization as example of creating data ROI - from separate use cases to operational complete data ecosystem

Keywords:  #ROI #real-timeomnichannelpersonalization #scalingdataecosystem #businessengagement #harvesting

Tomasz Burzyński

Business Insights Director, Orange

Mateusz Krawczyk

Personalization Solutions Product Owner, Orange

14.20 - 14.25


14.25 - 14.55

DevOps best practices in AWS cloud (Big Data stack)

The presentation will concentrate on three topics: distributed data processing, cost optimisation and security. In particular, we are going to introduce EMR as a big data service, SageMaker for model training and Lake Formation for managing granular access to data. All of this is developed as infrastructure as code in a secure environment, automated with DevOps best practices, while keeping costs within reason in the AWS cloud. You will learn how to control expenses and pay only for well-utilised services in a serverless environment.

Keywords:  #aws_cloud #devops #best_practices #infrastructure_as_a_code

Adam Kurowski

Senior DevOps, StepStone Services

Kamil Szkoda

DevOps Team Leader and Product Owner, StepStone Services

14.25 - 14.55

To be confirmed

14.25 - 14.55

Monitoring & Analysing Communication and Trade Events as Graphs

Keywords:  #graphAnalytics #transactionProcessing #FlinkGelly #Elasticsearch #Kibana

Christos Hadjinikolis

Senior Consultant, Lead ML Engineer, Data Reply UK

14.25 - 14.55

Neural Machine Translation: achievements, challenges and the way forward

Keywords:  #machinetranslation #deeplearning #adversarialexamples #datascience

Katarzyna Pakulska

Data Science Technology Leader, Findwise

Barbara Rychalska

Senior Data Scientist and Data Science Section Leader, Findwise

14.25 - 14.55

Data Science @ PMI – Journey from business problem to the data product industrialization

Keywords:  #UseCase #CI/CD #BestPracticesForDataScience #DataProduct #ReproducibleResearch

Michał Dyrda

Senior Enterprise Data Scientist, Philip Morris International

Maciej Marek

Enterprise Data Scientist, Philip Morris International

14.55 - 15.00


15.00 - 15.30

How to send 16,000 servers to the cloud in 8 months?

Keywords:   #Openx #gcp #scale #adtech #migration

Marcin Mierzejewski

Engineering Director, OpenX

Radek Stankiewicz

Strategic Cloud Engineer, Google Cloud

15.00 - 15.30

Optimize your Data Pipeline without Rewriting it

Keywords:  #data-driven #optimize #data-pipeline #operation #improvement

Magnus Runesson

Senior Data Engineer, Tink

15.00 - 15.30

Flink on a trip - a real-time car insurance system in a nut(shell)

Wojciech Indyk

Streaming Analytics and All Things Data Black Belt Ninja,

15.00 - 15.30

Reliability in ML - how to manage changes in data science projects?

Keywords: #datascience #datamanagement #revisioncontrol #datapipeline

Kornel Skałkowski

Senior AI Engineer, Consonance Solutions

15.00 - 15.30

Using data to build Products

It’s quite challenging to get ideas for new products and build them from scratch. I will share my experiences of how data and machine learning helped us in finding what to build from scratch to solve a user problem with efficiency and scalability.

Ketan Gupta

Product Leader,

15.30 - 16.00

Coffee break

16.00 – 17.25 Roundtable sessions

16.00 - 16.05


Parallel roundtable discussions are the part of the conference that engages all participants. They serve a few purposes. First of all, participants have the opportunity to exchange opinions and experiences about a specific issue that is important to their group. Secondly, participants can meet and talk with the leader/host of each roundtable discussion – selected professionals with vast knowledge and experience.

There will be 2 rounds of discussion, so every conference participant can take part in 2 discussions.


16.05 – 16.45    1st round

16.50 – 17.30    2nd round

Snorkel Beambell – Real-time Weak Supervision on Apache Beam

Suneel Marthi

Principal Technologist - AI/ML, Amazon Web Services

Real-life machine learning at scale using Kubernetes and Kubeflow

How to build a machine learning pipeline to process 1500 TB data daily in a fast and cost-effective way on Google Cloud Platform using Kubeflow? How to serve TensorFlow model with almost 1M requests per second and latency < 10ms on Kubernetes? Is Kubernetes and Kubeflow ready to serve data scientists?

Michał Bryś

Data scientist, OpenX

Michał Żyliński

Customer Engineer, Google

Managing workflows at scale

How to build and maintain thousands of pipelines in the organisation? What are the biggest pain points in orchestrating hundreds of ETLs? What open source and managed solutions are available?

Paweł Kupidura

Data Engineer, Bolt

Practical application of AI

Industry 4.0 and AI – are we ready for the 4th industrial transformation? Who should be the beneficiary of the Industry 4.0? Key barriers to the implementation of AI projects in organizations. Real cases of AI in Industry.

Natalia Szóstak

Head of R&D, TIDK

The need for explainable AI

With the spread of AI-based solutions, more and more organizations would like to understand the reasons behind system decisions. This is especially important in regulated industries. The session will cover so-called white-box methods, as well as modern approaches to AI explainability that make more complex models understandable.

Kacper Łukawski

Data Engineer and Tech Lead, Codete

A new reference point for natural language processing

Sebastiano Galazzo


Deploying ML models in real-time using stream processors

Andrea Spina

Team Lead Data Engineer, Radicalbit

Data discovery – building trust around your data


  • Challenges of building a modern & future-proof data processing platform
  • AI, Machine Learning and Big Data in the enterprise
  • Data visualization – how to visualize large, complex and dirty data and what tools to use
  • Choosing a right BI solution for a large data and a quick response time
  • Analytics and Customer Experience Management on top of Big Data
  • IoT in production – use-cases, data, tools and challenges
  • Stream processing engines – features, performance, comparison
  • Being an efficient Data Engineer. Tools, skills and ways of learning
  • Data privacy, personal integrity and GDPR
  • Data ingestion technologies, techniques and challenges
  • Big Data – the cloud way
  • From on-premise to cloud: an end to end cloud migration journey
  • Big Data on Kubernetes

17.30 - 17.45

Coffee break

17.45 - 18.15

Panel discussion


Adam Kawa

CEO and Co-founder, GetInData

18.15 - 18.30

Closing & Summary

Przemysław Gamdzyk

CEO & Meeting Designer, Evention

19.00 - 22.00

Networking party for all participants and speakers

At the end of the conference we would like to invite all attendees to an informal evening meeting at the “Dekada” Club, located at Grójecka 19/25, 02-021 Warszawa.