Modern data pipelines with dbt

 

In this one-day workshop you will learn how to create modern data transformation pipelines managed by dbt. Discover how to improve the quality of your pipelines and the workflow of your data team with a tool that standardizes good practices across the team: version control, testing, monitoring, snapshotting, and a consistent way of writing SQL transformation scripts. We will work through typical data transformation problems you may encounter on the way to delivering fresh and reliable data, and show how dbt can help solve them.

   Target Audience

Data analysts, analytics engineers and data engineers who want to learn to build and deploy data transformation workflows faster. Data analysts who would like to leverage their SQL skills and start working on data transformation tasks.

 

    Requirements

  1. SQL fluency: the ability to write queries that transform data.
  2. Basic understanding of ETL processes.
  3. Basic experience with the command line.
  4. Laptop with a stable internet connection (participants will connect to Jupyter Notebooks pre-created on Google Cloud Platform).

 

    Participant’s ROI

  • Concise and practical knowledge of applying dbt to solve typical data pipeline problems in a modern way: managing run sequence, handling data quality issues, monitoring, and switching smoothly between environments.
  • Hands-on coding experience under the supervision of Data Engineers experienced in maintaining dbt pipelines.
  • Tips about real-world applications and best practices.

 

    Training Materials

All participants will receive training materials as PDF files: slides covering the theory and an exercise manual with a detailed description of every exercise. During the workshop, participants will follow a shared step-by-step guide that shows how the dbt tool augments the Data Team workflow. A Jupyter Notebook environment will be provided for each participant.

 

    Time Box

This is a one-day event (9:00-16:00) with breaks between sessions.

 

    Agenda

Session #1 - Introduction to dbt

  • Framework overview
  • Typical use cases
  • Impact on data transformation development

Session #2 - Core concepts of dbt

  • Data models
  • Seeds, sources
  • Tests
  • Documentation, maintenance and data lineage
  • Hands-on exercises

Session #3 - Advanced dbt features

  • Macros & hooks
  • Snapshots
  • Extensions
  • Other tools to integrate with (overview only)
  • Hands-on exercises

Session #4 - Scheduling, deployment, workflow

  • GID DataOps Data Platform
  • Airflow
  • dbt Cloud
  • Hands-on exercises

 

   Keywords: data warehouse, data analytics, dbt, ETL, ELT, data transformation, SQL

 

    Session leader:

Software/Data Engineer
GetInData

BIG DATA TECHNOLOGY
WARSAW SUMMIT 2022

April 26th-28th, 2022
Let's go virtual!

ORGANIZER

Evention sp. z o.o
Rondo ONZ 1 Str,
Warsaw, Poland
www.evention.pl

CONTACT

Weronika Warpas
m: +48 570 611 811
e: weronika.warpas@evention.pl
