From "YAML Mess" to Unified Control: How Otto Transformed Their Data Platform with Dagster

July 28, 2025

Otto's data teams were drowning in "YAML Hell" with Argo Workflows while their Airflow users couldn't even test locally without spinning up production environments. Here's how Germany's largest e-commerce company unified 6 teams under one platform and transformed their entire approach to data pipeline development.

Key results:

  • 6 teams migrated from multiple legacy orchestrators to Dagster
  • Improved data pipeline stability and monitoring
  • Reduced time to identify and resolve failures
  • Monolithic pipelines transformed into reusable, maintainable components
  • Better “community feeling” and increased collaboration between teams

Challenges When Operating at Scale

As one of Germany’s largest e-commerce companies, Otto operates at massive scale. Their data teams are responsible for processing millions of transactions and managing complex forecasting models that predict everything from customer demand to the timing of warehouse inventory arrivals. As the company grew, however, Otto’s data pipeline infrastructure evolved into a patchwork, with multiple teams across the organization each using their own tools and approaches.

Some of the company's data teams were using Argo Workflows and found themselves trapped in endless configuration files that grew painfully cumbersome as their pipelines became more sophisticated. "Argo is not written in a programming language — you use it by configuring YAMLs," explains Heiner Hippke, a senior machine learning specialist who leads Otto's forecasting team. "So we had these huge Argo YAMLs that comprised complete pipelines, which we referred to as 'YAML hell.' Argo was just not flexible enough for our needs."

Meanwhile, other teams using Apache Airflow (via Google Cloud Composer) faced different but equally challenging data pipeline issues. The infrastructure overhead was significant, and there was no way to develop or test locally. "When there was a problem we had to spin up the production environment to see if it was even running or not, which took a lot of time," says Christian Kalla, a platform engineer at Otto. The lack of local development capabilities meant slower iteration cycles and higher costs, while teams wasted valuable time managing operational overhead and struggling to maintain complex data pipelines. These inefficiencies weren't just inconvenient for the technical teams; they were also limiting the company's ability to respond quickly to market changes.

The deciding factor came when Otto’s existing patchwork data architecture couldn’t handle the complex ML orchestration patterns that drive the company’s business-critical forecasting models, which are retrained every day. “The pain must be really huge to get everyone together to make the decision, let's go for a new data pipeline tool,” Heiner says. 

The Results

Otto's migration to Dagster has transformed both technical capabilities and team dynamics across the company's six data teams. The shift away from highly manual, configuration-heavy tools has delivered immediate productivity gains, thanks in no small part to local development and testing. "Debugging in case of failure is much easier now," says Heiner. He also praises Dagster's easy integration with monitoring systems and with Slack and Teams for issue alerting, and points to Python-native development as another major improvement to the developer experience. Ultimately, he says, Otto's teams can now iterate faster and resolve issues more quickly.

Perhaps even more importantly, Dagster's data-centric architecture lets teams center their work on actual data assets rather than tasks, breaking down silos between data engineers and data scientists at Otto. "Assets are the mind-changing abstraction," says control systems engineer Tobias Krauss. "Two worlds together, stakeholders and engineers meet in the middle." Teams can now share and reuse assets across the organization while maintaining platform-wide visibility through Dagster's unified catalog and lineage capabilities, resulting in what Tobias describes as "better community feeling and collaborative energy between teams."
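The asset-centric model Tobias describes can be sketched in a few lines of framework-free Python. This is a hypothetical illustration of the idea: declare named data assets and their upstream dependencies, and let a resolver work out execution order. It is not Otto's code, and it only gestures at Dagster's actual API (Dagster's real entry point is its `@asset` decorator, where parameter names likewise declare upstream assets).

```python
# Minimal, framework-free sketch of asset-centric orchestration.
# All asset names below are illustrative, not Otto's real pipelines.

ASSETS = {}

def asset(fn):
    """Register a function as a named data asset; its parameter names
    are the names of the upstream assets it depends on."""
    ASSETS[fn.__name__] = fn
    return fn

def materialize(name, cache=None):
    """Materialize an asset, recursively materializing its upstreams first."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        deps = fn.__code__.co_varnames[:fn.__code__.co_argcount]
        cache[name] = fn(*(materialize(d, cache) for d in deps))
    return cache[name]

@asset
def raw_orders():
    # Stand-in for an extract from a source system.
    return [{"sku": "A", "qty": 3}, {"sku": "A", "qty": 2}]

@asset
def demand_per_sku(raw_orders):
    # Downstream asset: depends on raw_orders by naming it as a parameter.
    totals = {}
    for row in raw_orders:
        totals[row["sku"]] = totals.get(row["sku"], 0) + row["qty"]
    return totals
```

Calling `materialize("demand_per_sku")` pulls `raw_orders` automatically. The point of the abstraction is that stakeholders and engineers talk about the same named data products, while the orchestrator owns the task graph.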

The teams have also achieved significant architectural improvements, moving from monolithic pipeline configurations to modular, reusable components. Otto has built sophisticated “asset factories” for common patterns like data transfers from Oracle to Google Cloud Storage to BigQuery, dramatically reducing redundant development work across teams. 
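The factory pattern described above can be sketched as a single parameterized function that stamps out the same transfer steps for any table. Everything here is a hypothetical stand-in: the names, paths, and steps are illustrative, and Otto's real factories would emit Dagster assets rather than plain functions.

```python
# Illustrative "asset factory" sketch: one function generates the
# Oracle -> GCS -> BigQuery transfer chain for a given table, so teams
# reuse one tested pattern instead of copy-pasting configuration.

def make_transfer_pipeline(table: str):
    """Return a runnable pipeline for one source table (hypothetical)."""

    def extract_from_oracle() -> str:
        return f"oracle://{table}"              # stand-in for a real extract

    def load_to_gcs(source: str) -> str:
        # A real implementation would upload `source` data to the bucket.
        return f"gs://staging/{table}.parquet"  # stand-in for a GCS upload

    def load_to_bigquery(gcs_path: str) -> str:
        # A real implementation would run a load job from `gcs_path`.
        return f"bq://warehouse.{table}"        # stand-in for a BigQuery load

    def run() -> str:
        return load_to_bigquery(load_to_gcs(extract_from_oracle()))

    return run

# Stamp out one pipeline per table from the shared factory.
pipelines = {t: make_transfer_pipeline(t) for t in ["orders", "inventory"]}
```

The design choice is the same one the Otto teams describe: the common pattern lives in one place, and each team supplies only the parameters that differ.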

Looking ahead

Otto's success with Dagster has created momentum for broader organizational adoption. Currently, Dagster usage is concentrated within their business intelligence teams, but the platform engineering team sees significant opportunity to expand across the entire organization. 

The team is also focused on standardizing data pipelines organization-wide by making Dagster the default choice for all new data projects. "I'd love to reduce redundancy with Dagster, with more teams using shared asset factories and even reducing the need to transfer data oftentimes," says Heiner Hippke. With their foundational infrastructure now in place, Otto is positioned to scale their data platform efficiently while maintaining the developer-friendly experience that made their initial adoption successful.

Key takeaways

  • Improved developer experience: Eliminated "YAML Hell" with Python-native development, enabling local testing and faster debugging
  • Cross-functional collaboration: Data scientists can now contribute directly to data pipelines alongside data engineers
  • Modular architecture: Broke down monolithic pipelines into reusable, maintainable components using asset factories
  • Faster time to production: Reduced development cycles through local testing capabilities and a unified control plane for data
  • Organizational scalability: Successfully onboarded multiple teams with an embedded enablement approach, creating a foundation for company-wide adoption
  • Infrastructure standardization: Built reusable Terraform modules and shared patterns that teams can adopt quickly
  • Increased pipeline reliability: Improved monitoring and alerting capabilities with seamless integration to Teams/Slack notifications
