Case Study: BenchSci - A Leap Forward with Dagster

February 20, 2024

Learn about how BenchSci uses Dagster in their journey to expedite drug development.

Speed is a given in the pharmaceutical industry: the race to market depends not only on how quickly an organization moves but also on its ability to innovate, understand, and adapt. Bringing groundbreaking medical treatments from concept to reality presents a significant challenge: managing the tide of data from countless research avenues and clinical trials.

BenchSci stands out in this race, employing cutting-edge AI technology to rapidly bring life-saving medicines to market. Their ASCEND technology is a science-first disease biology GenAI platform. It acts as an AI assistant for preclinical research and development scientists, helping them discover and evaluate experimental data, optimize research strategies, and mitigate potential risks so that scientific experimentation is efficient and effective.

In a deep-dive conversation with the BenchSci analytics team, we learned how they're transforming the industry, with Dagster playing a pivotal role in their journey to expedite drug development.

The Analytics Team

With a diverse team of six, comprising product analysts and data analysts, BenchSci's analytics unit goes beyond mere number-crunching and reporting. They provide critical data insights into the performance of BenchSci’s core product suite. Their toolset, which includes Dagster, forms the backbone of a system designed to distill vast data sets into actionable insights.

From Major Challenges at BenchSci to Adopting Dagster

The Pre-Dagster Challenges

Before building on Dagster, BenchSci faced several key challenges that impeded its analytics data operations:

  • A Growing Set of Heterogeneous Tools: While the team's toolset was still relatively straightforward, it was clear that, as the team built out their platform, the existing orchestration approach would become a bottleneck.
  • A Lack of Observability: To properly manage the expanding number of data sources and growing sophistication of their processes, the team needed a single pane of glass that gave them precise insights into the current state of their analytics data pipelines.
  • Cost Management: The team was aware that many processes could be better optimized for spend, but this required both observability and the right framework for pinpoint control and conditional execution.

In response to these challenges, BenchSci sought a solution to bring order and efficiency to their data management processes.

Choosing Dagster for Advanced Data Orchestration

BenchSci ultimately selected Dagster because of its ability to offer:

  • Effective Data Management: Dagster promised to manage the data load while clarifying and directing BenchSci's analytics workflows.
  • Isolated Development Environments: With Dagster Cloud’s CI process, each team member can seamlessly spin up their own ephemeral deployments for sandboxing and testing, merging changes to production with confidence.
  • Seamless Integration: It could integrate seamlessly with their existing and future tools, providing a single source of truth for their data.
  • Event-Driven Automation: Dagster's core Software-Defined Asset abstraction, which models each data product (whether a database table, report, or ML model) as an asset, was crucial for managing modern data workflows. It was particularly appealing because it aligned with BenchSci's vision for efficient and responsive data handling.
  • Ease of Deployment: Compared to other solutions, deploying Dagster was much easier and resulted in a more stable and easier-to-manage setup. Dependencies in other solutions they evaluated caused major headaches and lost time for the team.

Each team member has their own dev environment within which they work. With Dagster’s setup, they can just clone tables from production into their dev datasets. This saves the team a lot of time and computing costs.
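The production-to-dev cloning described above could be sketched as follows. This is an illustration under stated assumptions, not BenchSci's actual code: it assumes a warehouse that accepts a `CREATE TABLE ... CLONE ...` statement (exact syntax varies by warehouse), and the `build_clone_statements` helper and all dataset and table names are hypothetical.

```python
def build_clone_statements(tables, prod_dataset, dev_dataset):
    """Build one clone statement per table, copying prod into a dev dataset.

    Assumption: the warehouse supports `CREATE TABLE ... CLONE ...`.
    The function only builds SQL strings; it does not execute anything.
    """
    statements = []
    for table in tables:
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {dev_dataset}.{table} "
            f"CLONE {prod_dataset}.{table}"
        )
    return statements


# Hypothetical names: a shared production dataset and one analyst's sandbox.
for statement in build_clone_statements(
    ["events", "users"], "analytics_prod", "dev_alice"
):
    print(statement)
```

Cloning rather than recomputing is what saves time and compute here: the dev dataset starts from tables that already exist in production instead of rebuilding them from raw sources.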

If an error does occur upstream, the team now notifies the stakeholders of the impact on the dashboards.
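Working out which dashboards an upstream error touches is a lineage question. The sketch below shows one way to answer it with a breadth-first walk over a dependency graph. It is a library-independent illustration, not Dagster's API; the `downstream_dashboards` function, the lineage map, and every asset name are hypothetical.

```python
from collections import deque


def downstream_dashboards(failed_asset, downstream_of, dashboards):
    """Return the sorted dashboards reachable downstream of a failed asset.

    `downstream_of` maps each asset to the assets that consume it directly.
    """
    seen = {failed_asset}
    queue = deque([failed_asset])
    impacted = set()
    while queue:
        current = queue.popleft()
        for child in downstream_of.get(current, []):
            if child in seen:
                continue  # already visited via another path
            seen.add(child)
            if child in dashboards:
                impacted.add(child)
            queue.append(child)
    return sorted(impacted)


# Hypothetical lineage: raw tables feed models, which feed dashboards.
LINEAGE = {
    "raw_events": ["sessions", "searches"],
    "sessions": ["usage_dashboard"],
    "searches": ["usage_dashboard", "search_quality_dashboard"],
    "crm_accounts": ["revenue_dashboard"],
}
DASHBOARDS = {"usage_dashboard", "search_quality_dashboard", "revenue_dashboard"}

print(downstream_dashboards("raw_events", LINEAGE, DASHBOARDS))
```

The returned list is what a notification to stakeholders would need: the dashboards whose numbers may be stale or wrong until the upstream issue is fixed.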

Strategic Decision: Dagster Over Other Options

The decision to choose Dagster over alternatives was a strategic one. Specifically, BenchSci valued Dagster's approach to data assets and its suitability for event-driven automation, logging, and retries in data-rich environments. This approach allowed BenchSci to pinpoint areas of redundant compute and eliminate the associated costs.

Dagster's Impact on BenchSci

The deployment of Dagster revolutionized BenchSci's data analytics operations, streamlining complex workflows and enhancing data reliability.

"Dagster acts as a traffic controller," said the team, emphasizing its role in connecting disparate data sources and analytics tools.

Because the framework encourages the optimization of data pipelines and provides the right observability, running on Dagster allowed the team to achieve a marked reduction in computational costs and data errors. They achieved this by materializing assets (i.e., running compute) only when there is a clear benefit in doing so. Further savings came from reducing test runs, full-batch replication, and processing of poor-quality data.
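The idea of running compute only when it pays off can be illustrated with a small, library-independent sketch. It is not Dagster's API and not BenchSci's implementation; it assumes that "a clear benefit" means the upstream inputs changed since the last run, and the `MaterializationCache` class and asset names are hypothetical.

```python
import hashlib
import json


def fingerprint(inputs):
    """Stable hash of an asset's upstream inputs (must be JSON-serializable)."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class MaterializationCache:
    """Remembers the input fingerprint of each asset's last materialization."""

    def __init__(self):
        self._last_fingerprint = {}
        self.compute_runs = 0

    def materialize_if_changed(self, asset_name, inputs, compute):
        """Run `compute(inputs)` only if the inputs differ from the last run.

        Returns True when compute ran and False when it was skipped.
        """
        current = fingerprint(inputs)
        if self._last_fingerprint.get(asset_name) == current:
            return False  # nothing changed upstream, so skip the compute
        compute(inputs)
        self._last_fingerprint[asset_name] = current
        self.compute_runs += 1
        return True


cache = MaterializationCache()
rows = {"rows": [1, 2, 3]}
cache.materialize_if_changed("daily_usage", rows, lambda data: None)  # runs
cache.materialize_if_changed("daily_usage", rows, lambda data: None)  # skipped
```

Skipping unchanged work is one simple form of conditional execution; an orchestrator can apply other conditions as well, such as data-quality checks that stop poor-quality data from being processed further.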

Insights into how the data platform was performing further enabled BenchSci to tackle the complexity of its data ecosystem head-on.

By integrating and managing data from diverse sources through a coherent orchestration platform, Dagster provided BenchSci with a comprehensive, cataloged view of its data assets.

Analytics As A Product

Dagster's robust data orchestration greatly supported BenchSci's transition to viewing analytics as a product. It enabled the analytics team to derive actionable insights from platform usage and business decisions.

“We’ve been focused in the last year on ‘analytics as a product’. We can serve the teams with a hands-off approach, where stakeholders can help themselves. We break away from handling request after request. Now we can just let Dagster run, and only step in when we need to,” says the team.

With plans to integrate Dagster's capabilities further into its analytics framework, the team's vision for a data-centric, agile, and interconnected future is well on its way.

Conclusion

BenchSci's deployment of Dagster highlights an essential trend in modern business: the critical role of advanced data orchestration in meeting today's complex demands. Their success story exemplifies how Dagster enables organizations to swiftly understand and trust their data, adapt to new challenges, and monetize their data assets. Additionally, BenchSci's story demonstrates the growing importance of sophisticated data management tools in an era where the strategic utilization of data is critical to business success.
