The Data Flowcast Podcasts

Welcome to The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI, the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io ...
 
Contributing to open-source projects can be daunting, but it can also unlock unexpected innovation. This episode showcases how one engineer’s journey with Apache Airflow led to impactful UI enhancements and infrastructure solutions at scale. Shubham Raj, Software Engineer II at Cloudera, shares how his contributions helped shape Airflow 3.0, includ…
 
Managing data pipelines at scale is not just a technical challenge. It is also an organizational one. At Lyft, success means empowering dozens of teams to build with autonomy while enforcing governance and best practices across thousands of workflows. In this episode, we speak with Yunhao Qing, Software Engineer at Lyft, about building a governed d…
 
Understanding the complexities of Apache Airflow can be daunting for newcomers and seasoned data engineers alike. But with the right guidance, mastering the tool becomes an achievable milestone. In this episode, Marc Lamberti, Head of Customer Education at Astronomer, joins us to share his journey from Udemy instructor to driving education at Astronomer,…
 
The flexibility of Airflow plays a pivotal role in enabling decentralized data architectures and empowering cross-functional teams. In this episode, we speak with Alberto Crespi, Data Architect at lastminute.com, who shares how his team scales Airflow across 12 teams while supporting both vertical and horizontal structures under a data mesh approac…
 
Innovation in orchestration is redefining how engineers approach both traditional ETL pipelines and emerging AI workloads. Understanding how to harness Airflow’s flexibility and observability is essential for teams navigating today’s evolving data landscape. In this episode, Anu Pabla, Principal Engineer at The ODP Corporation, joins us to discuss …
 
The orchestration layer is foundational to building robust AI- and ML-powered data pipelines, especially in complex hybrid enterprise environments. IBM’s partnership with Astronomer reflects a strategic alignment to simplify and scale Airflow-based workflows across industries. In this episode, we’re joined by IBM’s Senior Product Manager, BJ Adesoj…
 
Efficient orchestration and maintainability are crucial for data engineering at scale. Gil Reich, Data Developer for Data Science at Wix, shares how his team reduced code duplication, standardized pipelines, and improved Airflow task orchestration using a Python-based framework built within the data science team. In this episode, Gil explains how t…
 
Legacy architecture and AI workloads pose unique challenges at scale, especially in a global enterprise with complex data systems. In this episode, we explore strategies to proactively monitor and optimize pipelines while minimizing downstream failures. Adonis Castillo Cordero, Senior Automation Manager at Procter & Gamble, joins us to share action…
 
Building reliable data pipelines starts with maintaining strong data quality standards and creating efficient systems for auditing, publishing and monitoring. In this episode, we explore the real-world patterns and best practices for ensuring data pipelines stay accurate, scalable and trustworthy. Joseph Machado, Senior Data Engineer at Netflix, jo…
 
Creating consistency across data pipelines is critical for scaling engineering teams and ensuring long-term maintainability. In this episode, Snir Israeli, Senior Data Engineer at Next Insurance, shares how enforcing coding standards and investing in developer experience transformed their approach to data engineering. He explains how implementing a…
 
Airflow’s adaptability is driving Tekmetric’s ability to unify complex data workflows, deliver accurate insights and support both internal operations and customer-facing services — all within a rapidly growing startup environment. In this episode, Ipsa Trivedi, Lead Data Engineer at Tekmetric, shares how her team is standardizing pipelines while su…
 
The Airflow 3.0 release marks a significant leap forward in modern data orchestration, introducing architectural upgrades that improve scalability, flexibility and long-term maintainability. In this episode, we welcome Vikram Koka, Chief Strategy Officer at Astronomer, and Jed Cunningham, Principal Software Engineer at Astronomer, to discuss the ar…
 
The evolution of data orchestration at Instacart highlights the journey from fragmented systems to robust, standardized infrastructure. This transformation has enabled scalability, reliability and democratization of tools for diverse user personas. In this episode, we’re joined by Anant Agarwal, Software Engineer at Instacart, who shares insights i…
 
Data orchestration at scale presents unique challenges, especially when aiming for flexibility and efficiency across cloud environments. Choosing the right tools and frameworks can make all the difference. In this episode, Raviteja Tholupunoori, Senior Engineer at Deloitte Digital, joins us to explore how Airflow enhances orchestration, scalability…
 
The 2025 State of Airflow report sheds light on how global users are adopting, evolving and innovating with Apache Airflow. With over 5,000 responses from 116 countries, the survey reveals critical insights into Airflow’s role in business operations, new use cases and what’s ahead for the community. In this episode, Tamara Fingerlin, Developer Advo…
 
The orchestration layer is evolving into a critical component of the modern data stack. Understanding its role in DataOps is key to optimizing workflows, improving reliability and reducing complexity. In this episode, Andy Byron, CEO at Astronomer, discusses the rapid growth of Apache Airflow, the increasing importance of orchestration and how Astr…
 
The security of open-source software is a growing concern, especially as dependencies and regulations become more complex, making it essential to understand how to manage software supply chains effectively. In this episode, we sit down with Michael Winser, Co-Founder at Alpha-Omega and Security Strategy Ambassador at Eclipse Foundation, and Jarek P…
 
Machine learning is changing fast, and companies need better tools to handle AI workloads. The right infrastructure helps data scientists focus on solving problems instead of managing complex systems. In this episode, we talk with Savin Goyal, Co-Founder and CTO at Outerbounds, about building ML infrastructure, how orchestration makes workflows eas…
 
Keeping data pipelines reliable at scale requires more than just the right tools — it demands constant innovation. In this episode, Nick Bilozerov, Senior Data Engineer at Stripe, and Sharadh Krishnamurthy, Engineering Manager at Stripe, discuss how Stripe customizes Airflow for its needs, the evolution of its data orchestration framework and the t…
 
Turning complex datasets into meaningful analysis requires robust data infrastructure and seamless orchestration. In this episode, we’re joined by Jennifer Melot, Technical Lead at the Center for Security and Emerging Technology (CSET) at Georgetown University, to explore how Airflow powers data-driven insights in technology policy research. Jennif…
 
Data orchestration is evolving rapidly, with dynamic workflows becoming the cornerstone of modern data engineering. In this episode, we are joined by Samyak Jain, Senior Software Engineer - Big Data at 99acres.com. Samyak shares insights from his journey with Apache Airflow, exploring how his team built a self-service platform that enables non-tech…
 
Testing autonomous vehicles demands precision, scalability and powerful orchestration tools — enter Apache Airflow, a key component of Bosch’s cutting-edge testing framework. In this episode, we sit down with Jens Scheffler, Test Execution Cluster Technical Architect, and Christian Schilling, Product Owner Open Loop Testing Automated Driving, both …
 
Scaling a data orchestration platform to manage thousands of tasks daily demands innovative solutions and strategic problem-solving. In this episode, we explore the complexities of scaling Airflow and the challenges of orchestrating thousands of tasks in dynamic data environments. Jonathan Rainer, Former Platform Engineer at Monzo Bank, joins us to…
 
The future of data engineering lies in seamless orchestration and automation. In this episode, Arjun Anandkumar, Data Engineer at Telia, shares how his team uses Airflow to drive analytics and AI workflows. He highlights the challenges of scaling data platforms and how adopting best practices can simplify complex processes for teams across the orga…
 
Transforming bottlenecked finance processes into streamlined, automated systems requires the right tools and a forward-thinking approach. In this episode, Mihir Samant, Senior Data Analyst at Etraveli Group, joins us to share how his team leverages Airflow to revolutionize finance automation. With extensive experience in data workflows and a passio…