Discover the latest achievements and current challenges in Big Data!
Check out the recordings of the BDTWS webinars with top experts!
One does not simply upgrade Airflow. 1.10 -> 2.4 case study
Apache Airflow is the core of many data warehouses. It's a very mature technology, proven to be stable and flexible, a Swiss Army knife for data engineers. In December 2020 the project bumped its major version from 1.X to 2.X, modifying all the components and making the upgrade process far from trivial, especially if company-critical processes run there. And that's exactly the challenge we faced recently with one of the major Getindata clients. The legacy Airflow 1.10.5, extended with many plugins and custom operators, was managed by Ansible and ran on VMs. We not only bumped the version, but also moved the components to run entirely on Kubernetes. During the webinar I'd like to share with you the challenges we faced, what we did to mitigate the risks, and what didn't go exactly as planned. If you're still on Airflow 1.X, the content will help you transition smoothly. But even if you're not an Airflow 1.X user, I will show you why upgrading components as often as possible is the key to success.
Analyze your data at the speed of light with Polars and Kedro
The pandas library is one of the key factors that enabled the growth of Python in the Data Science industry and continues to help data scientists thrive almost 15 years after its creation. Because of this success, nowadays several open-source projects claim to improve pandas in various ways. Polars is one of those new dataframe libraries: it’s backed by Arrow and Rust, and offers an expressive API for dataframe manipulation with excellent performance. In this webinar I will show you how to combine Polars for your data manipulation needs with Kedro, a data science framework that will help you write more maintainable code.