From "YAML Mess" to Unified Control: How Otto Transformed Their Data Platform with Dagster

July 28, 2025

Otto's data teams were drowning in "YAML Hell" with Argo Workflows while their Airflow users couldn't even test locally without spinning up production environments. Here's how Germany's largest e-commerce company unified 6 teams under one platform and transformed their entire approach to data pipeline development.

Key results:

  • 6 teams migrated from multiple legacy orchestrators to Dagster
  • Improved data pipeline stability and monitoring
  • Reduced time to identify and resolve failures
  • Monolithic pipelines transformed into reusable, maintainable components
  • Better “community feeling” and increased collaboration between teams

Challenges When Operating at Scale

As one of Germany’s largest e-commerce companies, Otto operates at massive scale. Their data teams are responsible for processing millions of transactions and managing complex forecasting models that predict everything from customer demand to the timing of warehouse inventory arrivals. As the company grew, however, Otto’s data pipeline infrastructure evolved into a patchwork, with multiple teams across the organization each using their own tools and approaches.

Some of the company's data teams were using Argo Workflows and found themselves trapped in endless configuration files that grew painfully cumbersome as their pipelines became more sophisticated. "Argo is not written in a programming language — you use it by configuring YAMLs," explains Heiner Hippke, a senior machine learning specialist who leads Otto's forecasting team. "So we had these huge Argo YAMLs that comprised complete pipelines, which we referred to as 'YAML hell.' Argo was just not flexible enough for our needs."

Meanwhile, other teams using Apache Airflow (via Google Cloud Composer) faced different but equally challenging data pipeline issues. The infrastructure overhead was significant, and there was no way to develop or test locally. "When there was a problem we had to spin up the production environment to see if it was even running or not, which took a lot of time," says Christian Kalla, a platform engineer at Otto. The lack of local development capabilities meant slower iteration cycles and higher costs, while teams wasted valuable time managing operational overhead and struggling to maintain complex data pipelines. These inefficiencies weren't just inconvenient for the technical teams; they were also limiting the company's ability to respond quickly to market changes.

The deciding factor came when Otto’s existing patchwork data architecture couldn’t handle the complex ML orchestration patterns that drive the company’s business-critical forecasting models, which are retrained every day. “The pain must be really huge to get everyone together to make the decision, let's go for a new data pipeline tool,” Heiner says. 

The Results

Otto's migration to Dagster has transformed both technical capabilities and team dynamics across the company's six data teams. The shift away from highly manual, configuration-heavy tools has delivered immediate productivity gains, thanks in no small part to local development and testing. "Debugging in case of failure is much easier now," says Heiner. He also praises Dagster's easy integration with monitoring systems and with Slack and Teams for issue alerting, and points to Python-native development as another major improvement to the developer experience. Ultimately, he says, Otto's teams can now iterate faster and resolve issues more quickly.

Perhaps even more importantly, Dagster's data-centric architecture lets teams center their work on actual data assets rather than tasks, breaking down silos between data engineers and data scientists at Otto. "Assets are the mind-changing abstraction," says control systems engineer Tobias Krauss. "Two worlds together, stakeholders and engineers meet in the middle." Teams can now share and reuse assets across the organization while maintaining platform-wide visibility through Dagster's unified catalog and lineage capabilities, resulting in what Tobias describes as "better community feeling and collaborative energy between teams."
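The asset-centric model Tobias describes can be sketched in a few lines of framework-free Python. This is a hypothetical illustration of the idea: declare named data assets and their upstream dependencies, and let a resolver work out execution order. It is not Otto's code, and it only gestures at Dagster's actual API (Dagster's real entry point is its `@asset` decorator, where parameter names likewise declare upstream assets).

```python
# Minimal, framework-free sketch of asset-centric orchestration.
# All asset names below are illustrative, not Otto's real pipelines.

ASSETS = {}

def asset(fn):
    """Register a function as a named data asset; its parameter names
    are the names of the upstream assets it depends on."""
    ASSETS[fn.__name__] = fn
    return fn

def materialize(name, cache=None):
    """Materialize an asset, recursively materializing its upstreams first."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        deps = fn.__code__.co_varnames[:fn.__code__.co_argcount]
        cache[name] = fn(*(materialize(d, cache) for d in deps))
    return cache[name]

@asset
def raw_orders():
    # Stand-in for an extract from a source system.
    return [{"sku": "A", "qty": 3}, {"sku": "A", "qty": 2}]

@asset
def demand_per_sku(raw_orders):
    # Downstream asset: depends on raw_orders by naming it as a parameter.
    totals = {}
    for row in raw_orders:
        totals[row["sku"]] = totals.get(row["sku"], 0) + row["qty"]
    return totals
```

Calling `materialize("demand_per_sku")` pulls `raw_orders` automatically. The point of the abstraction is that stakeholders and engineers talk about the same named data products, while the orchestrator owns the task graph.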

The teams have also achieved significant architectural improvements, moving from monolithic pipeline configurations to modular, reusable components. Otto has built sophisticated “asset factories” for common patterns like data transfers from Oracle to Google Cloud Storage to BigQuery, dramatically reducing redundant development work across teams. 
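The factory pattern described above can be sketched as a single parameterized function that stamps out the same transfer steps for any table. Everything here is a hypothetical stand-in: the names, paths, and steps are illustrative, and Otto's real factories would emit Dagster assets rather than plain functions.

```python
# Illustrative "asset factory" sketch: one function generates the
# Oracle -> GCS -> BigQuery transfer chain for a given table, so teams
# reuse one tested pattern instead of copy-pasting configuration.

def make_transfer_pipeline(table: str):
    """Return a runnable pipeline for one source table (hypothetical)."""

    def extract_from_oracle() -> str:
        return f"oracle://{table}"              # stand-in for a real extract

    def load_to_gcs(source: str) -> str:
        # A real implementation would upload `source` data to the bucket.
        return f"gs://staging/{table}.parquet"  # stand-in for a GCS upload

    def load_to_bigquery(gcs_path: str) -> str:
        # A real implementation would run a load job from `gcs_path`.
        return f"bq://warehouse.{table}"        # stand-in for a BigQuery load

    def run() -> str:
        return load_to_bigquery(load_to_gcs(extract_from_oracle()))

    return run

# Stamp out one pipeline per table from the shared factory.
pipelines = {t: make_transfer_pipeline(t) for t in ["orders", "inventory"]}
```

The design choice is the same one the Otto teams describe: the common pattern lives in one place, and each team supplies only the parameters that differ.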

Looking ahead

Otto's success with Dagster has created momentum for broader organizational adoption. Currently, Dagster usage is concentrated within their business intelligence teams, but the platform engineering team sees significant opportunity to expand across the entire organization. 

The team is also focused on standardizing data pipelines organization-wide by making Dagster the default choice for all new data projects. "I'd love to reduce redundancy with Dagster, with more teams using shared asset factories and even reducing the need to transfer data oftentimes," says Heiner Hippke. With their foundational infrastructure now in place, Otto is positioned to scale their data platform efficiently while maintaining the developer-friendly experience that made their initial adoption successful.

Key takeaways

  • Improved developer experience: Eliminated "YAML Hell" with Python-native development, enabling local testing and faster debugging
  • Cross-functional collaboration: Data scientists can now contribute directly to data pipelines alongside data engineers
  • Modular architecture: Broke down monolithic pipelines into reusable, maintainable components using asset factories
  • Faster time to production: Reduced development cycles through local testing capabilities and a unified control plane for data
  • Organizational scalability: Successfully onboarded multiple teams with an embedded enablement approach, creating a foundation for company-wide adoption
  • Infrastructure standardization: Built reusable Terraform modules and shared patterns that teams can adopt quickly
  • Increased pipeline reliability: Improved monitoring and alerting capabilities with seamless integration to Teams/Slack notifications
