StashAway Unified Their Global Data Platform with Dagster

Our team is lean, but Dagster lets us handle thousands of assets at high velocity—we couldn’t have done that with any other orchestration platform.

StashAway is a digital wealth management platform guiding investment strategies for customers across five global markets. They rely on Dagster to orchestrate thousands of data assets across a complex, multi-region Kubernetes environment, enabling them to scale operations efficiently without increasing overhead.

Dagster dramatically increased our team's velocity. Tasks that previously took us one or two days in Airflow now take about an hour, enabling our team to effectively manage a complex, multi-region infrastructure. - Jonathan Phoon, Lead Data Engineer

Key Results

50% increase in SLA compliance (from unpredictable daily misses to consistent compliance)
50% data warehouse savings (migration from Redshift to Databricks, enabled by Dagster)
95% faster pipeline creation (from 1–2 days to ~1 hour with custom tools and Dagster)
Fragmented views across 5 regions → single pane of glass for end-to-end lineage, complete observability, and simplified troubleshooting

The StashAway Data Engineering Team

Jonathan Phoon, Lead Data Engineer at StashAway

Jonathan Phoon is a Lead Data Engineer with a passion for building robust data pipelines. A self-taught software engineer turned data specialist, he now leverages modern tools like Dagster to streamline workflows and enhance data reliability. His journey reflects a deep commitment to continuous learning and innovation in the tech space.

Bolin Zhu is a Data Engineer who thrives on turning chaos into clean, reliable pipelines. He brings clarity to complex systems—and is statistically likely to be the best badminton player in any room full of data engineers.

Improving data reliability and developer velocity

StashAway migrated from Airflow to Dagster in 2022, and they’ve been a fast adopter of Dagster’s platform innovations ever since.

From adopting software-defined assets, to automating pipeline creation, to centralizing data operations across 5 regional instances through a unified data control plane, StashAway relies on Dagster to keep its fast-growing data platform reliable and scalable without growing the team.

Before Dagster, StashAway’s Airflow setup lacked basic software practices like testing environments, unit tests, or version control. With pipelines running independently in five separate global regions, troubleshooting relied entirely on manual Slack alerts, creating frequent delays and confusion about issue severity and urgency.

Phasing out inefficiencies by scaling up innovation

StashAway first chose Dagster for its unified dashboard and sophisticated alerting capabilities. Since then, they’ve relied on Dagster as a critical component in scaling their data platform to meet growing demand without growing the team.

Adopting software-defined assets

The StashAway team initially used Dagster with dbt Cloud for data orchestration, but workloads were decoupled and required manual scheduling with unpredictable results.

When Dagster first introduced software-defined assets, StashAway eagerly adopted them to improve developer velocity and reliability and to simplify their tooling.

Software-defined assets allow StashAway to:

Cut costs by migrating off dbt Cloud and leveraging Dagster’s native dbt integration with auto-materialization to eliminate scheduling gaps.
Improve reliability and stability with auto-materialization and extra assurance from asset-based freshness checks, enabling them to meet their SLA targets.
Adopt sound testing practices by treating their data workflows like software development workflows.
Accelerate developer onboarding by enabling junior team members to contribute immediately, thanks to Dagster’s clear documentation, structured asset definitions, and community resources.
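The freshness checks mentioned above come down to comparing an asset's last materialization time with an allowed lag. The sketch below illustrates that idea in plain Python. It is not Dagster's API, and the names `check_freshness` and `stale_assets`, along with the asset keys in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FreshnessResult:
    asset_key: str
    passed: bool
    lag: timedelta


def check_freshness(
    asset_key: str,
    last_materialized: datetime,
    max_lag: timedelta,
    now: Optional[datetime] = None,
) -> FreshnessResult:
    """Pass if the asset was materialized within max_lag of now."""
    now = now or datetime.now(timezone.utc)
    lag = now - last_materialized
    return FreshnessResult(asset_key=asset_key, passed=lag <= max_lag, lag=lag)


def stale_assets(
    last_materialized_by_asset: Dict[str, datetime],
    max_lag: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the keys of assets that miss the freshness target, sorted."""
    now = now or datetime.now(timezone.utc)
    return sorted(
        key
        for key, ts in last_materialized_by_asset.items()
        if not check_freshness(key, ts, max_lag, now).passed
    )
```

Calling `stale_assets({"daily_positions": ts_a, "fx_rates": ts_b}, timedelta(hours=24))` with timezone-aware timestamps would list whichever of those assets has gone more than a day without a refresh, which is the kind of signal an SLA alert can be built on.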

Accelerating development with CLI Automation

Lead Data Engineer Jonathan Phoon quickly took advantage of the efficiencies available with software-defined assets, building a custom CLI scaffolding utility to help the team onboard new data sources. The utility automates new data pipeline creation, providing interactive database selection, schema discovery, and deployment to Dagster. The CLI reduced setup time from days to roughly one hour, significantly accelerating new integrations and developer onboarding.

"Previously setting up a new data pipeline would take a day or two—now it takes about an hour." - Jonathan Phoon, Lead Data Engineer
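StashAway's CLI is not public, so the following is only a minimal sketch of what pipeline scaffolding can look like: it renders a Python module of Dagster `@asset` stubs from a source name and a list of table names. The function names (`render_module`, `main`), the template, and the example source and tables are all hypothetical. The sketch takes table names as arguments rather than performing interactive database selection or schema discovery.

```python
import argparse
import keyword
from string import Template
from typing import List, Optional

HEADER = "from dagster import asset\n"

# One stub per table; the generated file is meant to be edited by hand afterwards.
ASSET_STUB = Template('''

@asset(group_name="$source", key_prefix=["$source"])
def $table() -> None:
    """Load $source.$table. Generated stub: add the extraction logic here."""
    raise NotImplementedError
''')


def render_module(source: str, tables: List[str]) -> str:
    """Return Python source code defining one asset stub per table."""
    for name in [source, *tables]:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"not a usable Python identifier: {name!r}")
    stubs = (ASSET_STUB.substitute(source=source, table=t) for t in tables)
    return HEADER + "".join(stubs)


def main(argv: Optional[List[str]] = None) -> str:
    """Parse CLI arguments, print the generated module, and return it."""
    parser = argparse.ArgumentParser(description="Scaffold asset stubs for a new source")
    parser.add_argument("source", help="name of the source system")
    parser.add_argument("tables", nargs="+", help="tables to create assets for")
    args = parser.parse_args(argv)
    code = render_module(args.source, args.tables)
    print(code)
    return code
```

For example, `main(["billing", "invoices", "payments"])` prints a module with two asset stubs grouped under `billing`. A real tool would write the file into the project and hand off to the deployment pipeline.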

Achieving end-to-end lineage across 5 Kubernetes clusters

StashAway operates 5 different Kubernetes clusters across the globe. As such, pipelines were monitored through five separate regional instances, causing significant friction during troubleshooting and a lack of end-to-end lineage.

Bolin Zhu, Data Engineer, described the setup:

“We were inspired by the concept of Dagster agents. We used that concept to build our own custom run launcher. We have a centralized Dagster web server that talks to five regional agents. The agent launches an in-memory instance that kicks off runs within those clusters.”

With this setup, the team achieved end-to-end lineage for all pipelines across 5 regions.

"When something went wrong, we had to open five different tabs. It was unnecessary friction. Now we have true end-to-end lineage and a single pane of glass to manage all pipelines." - Bolin Zhu, Data Engineer
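The routing idea Bolin describes, one central entry point forwarding each run to the agent in the right cluster, can be sketched in a few lines. This is an illustration only and does not use Dagster's run launcher interface. `RegionalAgent`, `RegionRoutingLauncher`, the `region` tag, and the region names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class RegionalAgent:
    """Stand-in for an agent running inside one regional cluster."""

    region: str
    launched: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)

    def launch(self, run_id: str, asset_keys: List[str]) -> str:
        # A real agent would start the run inside its own cluster;
        # this stand-in only records what it was asked to launch.
        self.launched.append((run_id, tuple(asset_keys)))
        return f"{self.region}:{run_id}"


class RegionRoutingLauncher:
    """Central launcher that forwards each run to the agent for its region."""

    def __init__(self, agents: Iterable[RegionalAgent]) -> None:
        self._agents: Dict[str, RegionalAgent] = {a.region: a for a in agents}

    def launch_run(self, run_id: str, asset_keys: List[str], tags: Dict[str, str]) -> str:
        region = tags.get("region")
        if region not in self._agents:
            raise KeyError(f"no agent registered for region {region!r}")
        return self._agents[region].launch(run_id, asset_keys)
```

Because every run passes through a single launcher, the central web server can show all regions in one place while the work itself still executes inside each regional cluster.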

Adoption of Software Engineering Best Practices

Despite the growing complexity of their data platform, StashAway’s team remains lean and efficient by aligning their data workflows with software development best practices, which Dagster makes possible.

The team now tests at every step, running unit tests and validating pipelines locally with Dagster before pushing changes to staging and then to production.

They’ve also standardized their development environments, using tools like Nix for dependency management, Poetry for Python packages, and devenv for setup. When a new engineer joins the team, they can quickly spin up a reliable workspace without guessing or troubleshooting.
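One common way to make pipelines unit-testable is to keep the transformation logic in plain functions that an asset then calls. The sketch below assumes that pattern. The `dedupe_latest` function and its row shape are hypothetical and not taken from StashAway's codebase.

```python
import unittest
from typing import Dict, List


def dedupe_latest(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the most recent row per account_id, ordered by account_id.

    updated_at is assumed to be an ISO 8601 date string, so plain string
    comparison orders it chronologically.
    """
    latest: Dict[str, Dict[str, str]] = {}
    for row in rows:
        current = latest.get(row["account_id"])
        if current is None or row["updated_at"] > current["updated_at"]:
            latest[row["account_id"]] = row
    return [latest[key] for key in sorted(latest)]


class DedupeLatestTest(unittest.TestCase):
    def test_keeps_newest_row_per_account(self) -> None:
        rows = [
            {"account_id": "b", "updated_at": "2024-01-01"},
            {"account_id": "a", "updated_at": "2024-01-02"},
            {"account_id": "a", "updated_at": "2024-01-05"},
        ]
        self.assertEqual(
            dedupe_latest(rows),
            [
                {"account_id": "a", "updated_at": "2024-01-05"},
                {"account_id": "b", "updated_at": "2024-01-01"},
            ],
        )

    def test_empty_input(self) -> None:
        self.assertEqual(dedupe_latest([]), [])
```

Tests like these run in seconds on a laptop with `python -m unittest`, without a warehouse connection or a scheduler.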

Saving on data warehouse costs

Dagster’s clear lineage and granular orchestration allowed StashAway to identify which workloads needed to run, significantly cutting compute waste. This directly enabled a smooth migration from Redshift to Databricks, resulting in a 50% reduction in warehouse spend.
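To make the link between lineage and compute savings concrete, here is a minimal sketch of selecting only the assets affected by a change, given a dependency graph. It is a plain breadth-first traversal, not Dagster's asset selection API. The function name `downstream_of` and the example asset names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def downstream_of(changed: Iterable[str], upstreams: Dict[str, List[str]]) -> Set[str]:
    """Return the changed assets plus everything that depends on them.

    upstreams maps each asset to the list of assets it reads from.
    """
    # Invert the graph so we can walk from a source towards its consumers.
    children: Dict[str, Set[str]] = {}
    for asset, deps in upstreams.items():
        for dep in deps:
            children.setdefault(dep, set()).add(asset)

    seen: Set[str] = set(changed)
    queue = deque(seen)
    while queue:
        for child in children.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


lineage = {
    "positions": ["raw_trades"],
    "fx_rates": ["raw_fx"],
    "daily_report": ["positions", "fx_rates"],
    "user_dim": ["raw_users"],
}
to_run = downstream_of(["raw_trades"], lineage)
```

Here `to_run` contains `raw_trades`, `positions`, and `daily_report`, while `fx_rates` and `user_dim` are left alone. Skipping unaffected assets like these is what avoids unnecessary warehouse compute.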

"We dropped about 50% in warehouse costs when migrating from Redshift to Databricks, enabled by Dagster." - Jonathan Phoon, Lead Data Engineer

An ever-evolving data stack with Dagster at the center

StashAway implemented Dagster over time, progressively introducing software-defined assets, automating pipeline creation, and ultimately consolidating regional instances into a single, unified control plane. Jonathan summarized the current state of their experience:

"Dagster allows our lean team to manage our multi-region infrastructure with ease. Its intuitive logging, debugging, and retry capabilities have drastically simplified operations, freeing us to focus on higher leverage work."

Highlights for the team include:

Multi-region visibility, consolidating 5 separate Kubernetes clusters into one centralized Dagster dashboard for unified observability and end-to-end lineage.
95% faster pipeline onboarding, reducing setup from days to approximately an hour with automated pipeline scaffolding tools.
Reliable SLA compliance, ensuring daily data readiness through proactive data quality checks, automatic retries, and improved troubleshooting capabilities.
Simplified issue identification and remediation, using Dagster’s intuitive UI and clear dependency visualization to quickly isolate failures.
Enjoyable developer workflows, with clear reproducible environments and thorough testing at all stages.
Improved team productivity and onboarding, enabling junior engineers to immediately contribute through streamlined workflows and comprehensive documentation.
Reduced tool spend, eliminating dependencies on external tools and allowing migration to lower-cost options through Dagster’s comprehensive functionality.

Dagster powering StashAway’s long-term data strategy

Dagster has become foundational to StashAway’s data strategy, establishing trust with data consumers through consistency and reliability. Planned future integrations include Sling and further exploration of SQLMesh.

"Our team is lean, but Dagster lets us handle thousands of assets at high velocity—we couldn’t have done that with any other orchestration platform." - Jonathan Phoon, Lead Data Engineer
