MS Fabric vs. Dagster: Why Your Architecture Choices Matter

April 8, 2025

The fundamental challenge facing data teams today is building scalable platforms that enable self-service for data consumers.

A post recently hit the r/dataengineering subreddit that caught many data engineers' attention: MS Fabric destroyed 3 months of work. The author described how connecting their Fabric workspace resulted in all their artifacts being wiped irreversibly. When they reached out to Microsoft support, they were told it was "a known issue," but the documentation for that issue had conveniently been uploaded the same day.

And it's not just a one-off incident. A few weeks prior to that, another post titled Considering resigning because of Fabric rocketed to the top of the subreddit with over 500 upvotes. The story is painfully familiar: leadership decided to rip out Databricks, Snowflake, and Collibra to implement Fabric with Purview after Microsoft convinced executives it would be "cheaper and safer" from a governance perspective.

A year later, they've blown through their budget and migration timeline with no end in sight:

"We are now well over a year in. The funding has all been used up a long time ago. We are not remotely done and nobody is happy. We have used the budget for last year and this year on the migration which was supposed to be used on replatforming some of our apps."

This is the reality of betting your data platform on immature tools.

When "Enterprise Ready" Isn't

Microsoft Fabric is Microsoft's latest attempt to unify their data offerings. It's been out for about a year and a half, and despite heavy marketing to the enterprise, it's effectively still in beta. The comments on these Reddit posts tell the story:

"Pretty sure this is how most Microsoft tool lifecycles go. It's released to the public broken, users essentially opt in to a beta release that's not labelled as such find their bugs, in a few years the product will be ready to handle large production systems and then they'll drop all support for it with the newest shiny tool."

"I've worked some with fabric and it just all round seems unfit for production systems. Give it a few years and it'll get there but it's just not a mature product."

Data platform engineers who evaluate Fabric keep reaching the same conclusion: it's not ready for prime time. One commenter noted that Fabric "doesn't have the width of Databricks but also not the simplicity of Snowflake. It simply doesn't do anything it's claiming better than any other vendor."

Yet Microsoft continues to aggressively promote it, even converting Power BI Premium licenses to Fabric licenses this summer to force adoption.

Why does this happen? As one commenter put it eloquently:

"Most decision making in those realities is pretty much just a constant attempt to push responsibility outside of the company. Plus you know, Microsoft is fairly good at selling stuff to execs. They know the language, so to speak."

The Data Platform Engineer's Dilemma

As data systems grow in complexity, the fundamental challenge facing data teams today isn't just building individual pipelines—it's building scalable platforms that enable self-service for data consumers. This shift from pipeline builders to platform enablers represents the evolution of the data engineering role.

This is where architectural choices matter. When evaluating orchestration tools like Dagster versus Fabric, organizations need to consider not just what solves their immediate problem, but what enables them to scale effectively over time.

The Hidden Problems of Fabric

While Microsoft's vision for Fabric isn't inherently flawed, their execution is. The basic issues plaguing Fabric today:

  1. Immaturity: Critical features like source control integration are broken in fundamental ways. One commenter noted, "Fabric is completely useless for serious development, it's not fully compatible with git or CI/CD. It doesn't even offer proper work isolation."
  2. No recovery options: When things break, there's often no way back, as the original poster who lost three months of work discovered.
  3. Locked into the Microsoft ecosystem: Limited integration with the broader data ecosystem.
  4. Built for analysts first: The developer experience is an afterthought.
  5. Cost disconnects: The "we'll save money" promise rarely materializes. As the person considering resignation put it: "even if the final platform ends up costing what MSFT claims (which I do not believe) we will not break even before another 6 years due to the costs of the migration."

The reality on the ground is even more dire. In practitioners' own words:

"For spark and the semantic model crashed so much I couldn't complete the final assessment. Useless."

"The spark side of things is pure trash."

"It's hard to fathom how it is even possible to build a worse product than Synapse."

One user summed it up: "If a bug takes down something important, it's better that it's on our end than the vendor's. We can fix our bugs, we can't wait weeks or months hoping for them to fix something theirs."

Dagster: The Unified Control Plane for Modern Data Teams

Dagster takes a fundamentally different approach to data orchestration. Instead of focusing on tasks, as Airflow does, or trying to be an all-in-one platform, as Fabric does, Dagster centers on data assets: the actual tables, files, models, and notebooks that make up your data platform.

Dagster's design philosophy aligns with what many forward-thinking platform engineers have been arguing for years: engineers should build platforms, services, and frameworks, not just ETL pipelines.

Key differentiators that make Dagster stand out:

  1. Data-centric architecture: Organize around your actual data assets, not just tasks.
  2. End-to-end lineage: Track data across your entire platform, down to the column level.
  3. Developer experience: Local testing, branch deployments, and a modern Python-based framework.
  4. Built-in data quality: Asset checks and testing are baked into the platform.
  5. Unified control plane: A single interface for all your data pipelines.

Being Effective, Not Just Right

Being right about technology choices isn't enough—organizations need to be effective. This applies perfectly to platform architecture decisions:

Being right means you technically can orchestrate data pipelines. Being effective means your team can build, scale, and observe those pipelines confidently.

Effectiveness for a modern data platform means:

  • Developer productivity: Can your team iterate quickly, or are they fighting their tools?
  • Observability: How fast can you diagnose and fix issues?
  • Self-service capability: Can data consumers build their own pipelines without deep engineering knowledge?
  • Resource optimization: Are you getting the most from your cloud spend?
  • Data quality assurance: Do you have built-in checks and balances?

The Real Cost Calculation

When evaluating platforms, the true cost goes beyond licensing and infrastructure. The Reddit posts about Fabric destroying work and derailing careers illustrate the hidden costs of choosing the wrong platform. What's the total impact of:

  • Lost engineering time due to platform limitations or failures
  • Delayed business initiatives waiting on data
  • Reduced team morale from fighting with tooling
  • Technical debt accumulation from workarounds
  • Opportunity cost of delayed innovation
  • Career impact on your engineering team

One architect described how their role "has diminished because they took away the main component which is architectural advice and now we are in a perpetual state of emergency because nothing works and everything seems to be built like a jenga where a random piece gets pulled every hour."

Platform decisions have cascade effects throughout an organization. A platform that destroys work, lacks fundamental safeguards like recovery points, or requires constant babysitting doesn't just cost money—it costs competitive advantage.

The most successful data teams today recognize that platform reliability is a business differentiator. When data pipelines consistently deliver quality results on time, the business can confidently make decisions. When they don't, the entire organization pays the price through delayed insights, missed opportunities, and eroded trust in data.

Dagster was designed from the ground up to address these reliability and visibility concerns. The asset-based approach means teams can:

  • Understand exactly what data exists and how it relates to the rest of the platform
  • Troubleshoot issues with complete end-to-end visibility
  • Build reusable components that accelerate development
  • Implement quality checks that prevent bad data from propagating
  • Scale their platform without scaling their headcount

Make Decisions That Scale

Data teams need to be more thoughtful about allocating their limited engineering resources. Organizations should consider whether to invest in a platform that's still figuring out the basics, or one that's defining the future of data orchestration.

If you're choosing an orchestrator today, you have two paths:

  1. Go with the legacy vendor that's still struggling with fundamental features like source control
  2. Choose the orchestrator that's designed from first principles to address the actual needs of modern data platforms

As one data engineer facing a Fabric migration put it:

"The salary and benefits were pretty crazy, but not enough for me to lose my soul."

Your architecture choices matter. Choose wisely.
