Escaping the Modern Data Trap

September 28, 2023
Launch Week kicks off October 9th with new functionality being shared each day. Our theme: Escaping the Modern Data Trap!

Dagster Launch Week starts October 9th. Each day from October 9th to the 13th, we’ll be announcing a new feature, project, or capability that we think users will love.

The theme of Launch Week is “Escaping the Modern Data Trap.” We’ll get to what that means in a moment. But first, I want to recap how we got to this point and why we are doing a Launch Week in the first place.

We entered 2023 at an interesting stage in the company’s lifecycle. We had released Dagster 1.0 and Dagster Cloud about five months prior, and I was just a few weeks into my new gig as CEO.

On the one hand, we had recently become the fastest-growing data orchestrator and were exceeding our growth goals. On the other, we were kicking off our series B during the worst fundraising environment of my career (fortunately, this ended up working out).

These events triggered serious discussions about where we were going to take the product and the business over the next year. We talked to lots of users, and summarized our thinking earlier this year in the Dagster Master Plan.

Throughout the course of our discussions with data and machine learning engineers, negative feedback about the Modern Data Stack kept coming up. Internally, we started referring to this theme as the Modern Data Trap.

>    Nick Schrock (CTO and founder of Dagster Labs) and Pete Hunt (CEO) review the current state of data engineering and explain how the Modern Data Stack is failing to deliver on its original premise and how to get back on track.  

The Modern Data Trap

Our users are data engineers and data platform engineers at a wide variety of company sizes, industries, and levels of seniority.

When we talked to them, we heard a number of positive things about the Modern Data Stack:

  • Raising a big tent. Tools like dbt™ Core brought new stakeholders into the software engineering process.
  • Fewer home-grown tools. Companies stopped building their own version of tools like SQL templating, data catalogs, and SaaS ELT connectors in favor of pulling tools off the shelf.
  • Cloud adoption. The Modern Data Stack accelerated the adoption of cloud technologies, which reduced fixed costs and improved developer happiness.

We also heard about many critical problems with the Modern Data Stack:

  • Assumed homogeneity. Many tools in the Modern Data Stack assume you live in an idealized world where there is a single, modern data warehouse. In reality, most businesses have hundreds of different legacy data sources that must be integrated together.
  • Too many disconnected tools. Everyone has seen the ridiculous market maps of the Modern Data Stack. Data teams are managing dozens - if not hundreds - of open-source, on-prem, and cloud tools, which adds enormous complexity to the stack.
  • Too expensive. The explosion of tools comes with an explosion in SaaS fees. This is compounded by the challenging macroeconomic environment of the past two years.
  • Inflexibility. Because so many Modern Data Stack tools jumped on the low- and no-code bandwagon, they are now very hard to customize. If your requirements slightly deviate from your tool’s view of the world, you may need to start from scratch.
  • Ignorance of software engineering best practices. Again, low- and no-code tools try to avoid software engineering rather than embrace it. This means that software engineering best practices like testing, continuous delivery, and observability are not ubiquitously adopted.

Take, for example, data movement tools like Fivetran or Stitch. On the surface, they are simple: they move data from some external system into your data warehouse. While you’ve solved one problem, you’ve created several new ones:

  • How do you transform the data? Now you need dbt Core (recently bundled with Fivetran), PySpark, or another transformation tool.
  • What if you need to integrate with internal services, existing codebases or another data warehouse? Now you need an orchestrator like Dagster.
  • How do you ensure your data is correct? Now you need to adopt a data quality tool.
  • How do you activate the data? You need a data activation or reverse ETL tool.
  • Now you have data spread across many different systems. How do you keep track of it? You need a data catalog and governance suite.

This situation is OK if you are a small business, have simple requirements, and cannot afford to hire data engineers. However, as your needs get more demanding, Big Complexity rears its ugly head. We believe that software engineering best practices are the only way to tame Big Complexity.

Escaping the Trap: From Orchestrator to Data Control Plane

We realized that we could solve these problems - and more - not by delivering yet another standalone product, but by evolving the definition of what a “data orchestrator” is. Rather than simply a “service that schedules compute,” we believe the orchestrator should be a true control plane for the Modern Data Stack.

Specifically, we believe that a large segment of Modern Data Stack tools:

  1. Are overkill for data engineers. They are either expensive, expansive SaaS services or complex, multi-service OSS tools that include lots of heavyweight bells and whistles that many data engineers don’t need.
  2. Would benefit from deep orchestrator integration. Many tools in the Modern Data Stack either participate directly in the execution of data pipelines or rely on data where the orchestrator is the source of truth. Deep orchestrator integration can make these tools simpler, more robust, and more powerful.

Some customers truly need all the bells and whistles of a heavyweight external tool, and for those, we’ll continue to focus on high-quality integrations. However, we think that many users would be better served by lightweight features directly integrated into the orchestrator that are specifically targeted at data engineers.

Launch Week Agenda

At Launch Week, we’ll be showing off a number of these new bundled features and deep integrations to empower data engineers to have a greater impact and a radically improved developer experience.

Here’s the agenda.

  • Friday, October 6: Join us for a pre-Launch Week conversation about the state of the Modern Data Stack.
  • Monday, October 9: Sandy Ryza will be talking about data quality.
  • Tuesday, October 10: Jarred Colli will help data teams reduce their spend on Modern Data Stack tools with our new Dagster Insights feature.
  • Wednesday, October 11: Erin Cochran will discuss our recent investments in developer education and announce two very special new projects.
  • Thursday, October 12: Pedram Navid will show Dagster users how to reduce operational headaches and cash burn by changing how they do ELT.
  • Friday, October 13: Nick Schrock will talk about One More Thing.

We’re looking forward to seeing you there!

Pete and Nick

Have feedback or questions? Start a discussion in Slack or GitHub.
