The Data Engineering Impedance Mismatch

April 10, 2024

A case for asset-oriented over workflow-oriented in data orchestration.

> Pete Hunt explains the impedance mismatch in data platforms today during Data Council 2024.

The Primary Concern of Data Engineering

The primary concern of data engineering is building and maintaining data assets such as dbt models, data warehouse tables, and even dashboards and ML models.

Data assets are tangible, have meaning, and are ultimately what our data work revolves around. Naturally, most tools in the data engineering tech stack take an asset-oriented approach.  Whether it’s Snowflake, Monte Carlo, Fivetran, dbt, or Airbyte, all modern data stack tools from ingestion to visualization adopt the data asset (by one name or another) as their central concept.
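The asset-oriented style can be sketched in a few lines of plain Python. This is a toy registry, not Dagster's actual API, and the asset names are made up: each asset is a named, declarative definition, and its dependencies are other assets rather than an opaque task ordering.

```python
# Illustrative sketch (not Dagster's real API): a tiny registry where each
# asset declares its upstream dependencies via its function parameters.
import inspect

ASSETS = {}  # asset name -> producing function

def asset(fn):
    """Register a function as a data asset; its parameter names are the
    names of the upstream assets it depends on."""
    ASSETS[fn.__name__] = fn
    return fn

@asset
def raw_orders():
    return [{"id": 1, "amount": 30}, {"id": 2, "amount": 70}]

@asset
def order_totals(raw_orders):
    return sum(row["amount"] for row in raw_orders)

def materialize(name, cache=None):
    """Materialize an asset by first materializing its dependencies."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        deps = [materialize(p, cache) for p in inspect.signature(fn).parameters]
        cache[name] = fn(*deps)
    return cache[name]

print(materialize("order_totals"))  # → 100
```

Note that the dependency graph falls out of the definitions themselves; nobody wires up an execution order by hand.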

The Orchestration Impedance Mismatch

And then there are orchestration tools.

In contrast to the rest of the (asset-oriented) data platform, most orchestration tools are workflow-oriented. They concern themselves with executing a series of black-box tasks.

Examples of these workflow-oriented tools include Airflow, Prefect, and GitHub Actions. They are Swiss Army knives: they can orchestrate data pipelines, deploy microservices, compile code, and more. On the surface, these seem like great, flexible tools.

The problem arises when using these Jack-of-all-trades orchestrators for asset-oriented work like building data pipelines. You get an impedance mismatch: they don’t work well with the rest of the stack. The mismatch results in all sorts of headaches: clunky integrations, fragmented dataflows, and limited observability and understanding of how your data flows from point A to point B.
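For contrast, here is a workflow-oriented runner reduced to its essence (hypothetical tasks, not any real orchestrator's API): an ordered list of black boxes. The runner can execute and report on tasks, but the assets those tasks produce never appear in its data model, which is where the translation burden comes from.

```python
# A workflow-oriented runner reduced to its essence: an ordered list of
# black-box tasks (hypothetical examples, not any real orchestrator's API).
def extract():
    return "rows pulled from source"

def transform():
    return "warehouse table built"

def publish():
    return "dashboard refreshed"

workflow = [extract, transform, publish]  # ordering is all the runner knows
log = [task() for task in workflow]
print(log[-1])
# The runner can say "transform failed," but it cannot answer asset-level
# questions like "which tables are stale?" or "what feeds this dashboard?"
```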

"An impedance mismatch refers to the discrepancy between two systems' data representations or communication protocols, causing inefficiencies or errors in data transfer or integration."

Placing a workflow-oriented orchestrator at the center of your otherwise asset-oriented data platform creates as many problems as it solves, forcing developers to continuously translate between the workflow-oriented world of the orchestrator and the asset-oriented world of the rest of the data platform.

> Why place a workflow-oriented orchestrator at the center of your asset-oriented data platform?

What Asset-Orientation Enables

We built Dagster to be an asset-oriented data orchestrator and fix this impedance mismatch. This immediately delivered several benefits to our customers.

First, asset-orientation enabled a superior developer experience. Platform owners and pipeline builders were able to write code and visualize their pipelines in a natural style. Tasks that had previously been a huge hassle became fun.

Second, asset-orientation enabled the dream of the decentralized data platform. With workflow-oriented orchestrators, large organizations were unable to adopt data mesh best practices because the tangled web of dependencies was difficult to manage. By overlaying a single, global asset graph over the whole organization, as an asset-oriented orchestrator does, large organizations could empower individual teams to work autonomously while still letting those teams depend on one another.
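The single global asset graph can be sketched with the standard library (the team and asset names here are made up): each team declares only its own slice of the graph, and the orchestrator merges the slices and derives a valid materialization order across team boundaries.

```python
# Sketch of a single global asset graph spanning two teams
# (hypothetical asset names; format is {asset: set of upstream assets}).
from graphlib import TopologicalSorter

# Team A (ingestion) and Team B (analytics) each own part of the graph.
team_a = {"raw_events": set(), "cleaned_events": {"raw_events"}}
team_b = {"daily_revenue": {"cleaned_events"}, "exec_dashboard": {"daily_revenue"}}

# The orchestrator overlays one graph across the whole organization.
global_graph = {**team_a, **team_b}

order = list(TopologicalSorter(global_graph).static_order())
print(order)
# → ['raw_events', 'cleaned_events', 'daily_revenue', 'exec_dashboard']
```

Teams work autonomously, yet the cross-team dependency (`daily_revenue` on `cleaned_events`) is explicit, and the platform team sees one end-to-end lineage graph.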

Furthermore, this approach enabled centralized data platform teams to maintain a “single pane of glass” over the whole platform.

However, as we worked with customers, we realized that the asset-oriented approach enabled us to deliver value beyond the scope of traditional orchestrators.

Dagster+: Moving Beyond Traditional Orchestration

Dagster’s asset-oriented fundamentals lay the foundation for additional capabilities that leverage the lineage and metadata of the assets. Dagster+ brings new capabilities to the category, built on this core asset-oriented data model.

Data Reliability, Quality and Freshness

The first capability we are building into the orchestration layer is data reliability: a suite of features that encompasses quality and freshness checks. Colocated with the definitions of our assets, we can state our expectations for completeness, pattern matching (e.g., non-null values, valid phone numbers), shape, and update frequency of a data asset. This makes Dagster the single pane of glass for both operational and quality information, and also enables Dagster to make orchestration decisions based on the results of reliability checks.
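The idea can be sketched with hand-rolled check functions (not Dagster's actual asset-check API; the data and thresholds are made up): expectations live next to the asset, and the orchestrator can gate downstream materializations on their results.

```python
# Hypothetical sketch: checks declared next to the asset they guard, so the
# orchestrator can gate downstream work on their results.
from datetime import datetime, timedelta, timezone

def check_non_null(rows, column):
    """Quality check: every row must have a value in the given column."""
    return all(row.get(column) is not None for row in rows)

def check_fresh(last_updated, max_age):
    """Freshness check: the asset must have been updated recently enough."""
    return datetime.now(timezone.utc) - last_updated <= max_age

rows = [{"phone": "555-0100"}, {"phone": "555-0199"}]
last_updated = datetime.now(timezone.utc) - timedelta(hours=1)

checks_passed = (
    check_non_null(rows, "phone")
    and check_fresh(last_updated, max_age=timedelta(hours=24))
)

# Orchestration decision based on check results: only materialize
# downstream assets when the upstream asset is complete and fresh.
if checks_passed:
    print("materializing downstream assets")
else:
    print("halting: upstream asset failed its checks")
```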

Dagster's Asset Checks continue to evolve.

Operational Insights

An asset-oriented orchestrator is the source of truth for both the asset graph and the underlying compute that materializes assets. This makes it the natural place to aggregate, visualize, alert on, and optimize data pipeline cloud spend.
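As a sketch (the run records and costs here are invented, with costs in integer cents): because every materialization flows through the orchestrator, per-asset spend is a simple aggregation over data it already has.

```python
# Hypothetical sketch: the orchestrator runs the compute that materializes
# each asset, so it can attribute runtime and spend per asset directly.
from collections import defaultdict

# Per-run records the orchestrator already has: (asset, seconds, cost in cents).
runs = [
    ("raw_events", 120, 40),
    ("daily_revenue", 300, 110),
    ("raw_events", 115, 38),
]

spend_cents = defaultdict(int)
for asset_name, _seconds, cost in runs:
    spend_cents[asset_name] += cost

print(dict(spend_cents))  # → {'raw_events': 78, 'daily_revenue': 110}
```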

A screenshot of Dagster+ operational insights capabilities.

Data Cataloging

A final high-value-add capability of an asset-oriented system is that it already collects all the critical metadata related to your data assets, and therefore is uniquely placed to maintain an always-up-to-date data catalog. With Dagster+ we are launching a brand new data catalog experience that includes advanced features like column-level lineage. This catalog serves every data practitioner: from those owning the platform, to those building data pipelines, and data consumers leveraging the outputs of those data pipelines.
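A toy illustration (the asset and column names are hypothetical, and this is not the Dagster+ catalog's real data model): the catalog is just a query surface over metadata the orchestrator already records, including column-level lineage.

```python
# Hypothetical sketch: catalog entries derived from metadata an
# asset-oriented orchestrator already tracks for each asset.
catalog = {
    "daily_revenue": {
        "owner": "analytics-team",
        "columns": {
            # Column-level lineage: output column -> upstream columns.
            "revenue": ["cleaned_events.amount"],
            "day": ["cleaned_events.event_ts"],
        },
    },
}

def upstream_columns(asset_name, column):
    """Answer a catalog question: where does this column come from?"""
    return catalog[asset_name]["columns"][column]

print(upstream_columns("daily_revenue", "revenue"))  # → ['cleaned_events.amount']
```

Because the orchestrator produces this metadata as a byproduct of running the pipelines, the catalog never drifts out of date the way a separately maintained one can.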

A screenshot of the data cataloging feature in Dagster+.

Looking Ahead

With Dagster+, we envision a future where data orchestration is fully integrated into the data platform, reducing complexity, enhancing transparency, and fostering collaboration. The asset-oriented approach addresses the technical impedance mismatch and aligns with the organizational and operational needs of modern data teams. By focusing on assets, Dagster+ not only streamlines the development and operation of data pipelines but also paves the way for innovative features and capabilities that leverage the full potential of the data stack.

> On April 17th, we released Dagster+, a new hosted solution from Dagster Labs that challenges the limitations of traditional orchestration tools.

