Routing LLM prompts with Dagster and Not Diamond

February 14, 2025

Learn how LLM routing with Not Diamond can improve accuracy and reduce costs in your AI workflows

When developing an AI application, you've likely encountered these common challenges:

  • Which model is best suited for my specific needs?
  • How can I be certain my application is functioning as intended?

The inherent non-determinism of Large Language Models (LLMs) and the numerous options available make it essential to thoroughly evaluate and test each model to ensure production-ready performance. However, even the most effective model overall might struggle with certain edge cases, while other models might offer significant cost savings without compromising performance for most inputs.

A promising solution to this dilemma is to employ an LLM router, which can learn to dynamically select the optimal LLM for each query, thereby enhancing accuracy by up to 25% and reducing both inference costs and latency by a factor of 10.

How model routing works

When analyzing any dataset distribution, it's uncommon to find a single model which consistently outperforms all others across every possible query. Model routing addresses this by creating a "meta-model" that integrates multiple models and intelligently determines when to utilize each LLM. This approach not only surpasses the performance of any individual model but also reduces costs and latency by strategically employing smaller, more economical models when performance won't be compromised.
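To make the "meta-model" idea concrete, here is a minimal, self-contained sketch of score-based routing. This is an illustration of the general technique, not Not Diamond's actual algorithm: the quality predictor, cost figures, and threshold below are all hypothetical.

```python
# Illustrative sketch of cost-aware model routing (hypothetical scores and
# prices; not Not Diamond's actual algorithm).

# Hypothetical cost per 1M input tokens, used only to rank candidates.
COSTS = {"openai/gpt-4o": 2.50, "openai/gpt-4o-mini": 0.15}


def predicted_quality(model: str, query: str) -> float:
    """Stand-in for a learned quality predictor: assume the smaller model
    only falls behind on long, complex queries."""
    base = {"openai/gpt-4o": 0.9, "openai/gpt-4o-mini": 0.8}[model]
    penalty = 0.2 if (model == "openai/gpt-4o-mini" and len(query) > 500) else 0.0
    return base - penalty


def route(query: str, candidates: list[str], quality_floor: float = 0.75) -> str:
    """Pick the cheapest candidate whose predicted quality clears the floor;
    fall back to the highest-quality model if none does."""
    viable = [m for m in candidates if predicted_quality(m, query) >= quality_floor]
    if viable:
        return min(viable, key=COSTS.__getitem__)
    return max(candidates, key=lambda m: predicted_quality(m, query))


print(route("Summarize this review.", ["openai/gpt-4o", "openai/gpt-4o-mini"]))
# Short query -> cheapest viable model: openai/gpt-4o-mini
```

The key design point is that the router only "spends" on the larger model when the predicted quality gap justifies it, which is where the cost and latency savings come from.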

Routing in a Dagster pipeline

You can now add LLM routing to your Dagster pipelines with Not Diamond by instantiating a NotDiamondResource. Specifically, you:

  • create an asset with your desired routing options and the prompt in question,
  • pass the recommended model to the LLM provider of your choosing, and
  • materialize that asset as needed.

Let's walk through a simple pipeline example.

First, we'll define an asset representing a hypothetical dataset called book_review_data providing a list of book reviews for our favorite book, "Cat's Cradle" by Kurt Vonnegut.

import time

import dagster as dg
import dagster_notdiamond as nd
import dagster_openai as oai


@dg.asset(kinds={"python"})
def book_review_data(context: dg.AssetExecutionContext) -> dict:
    data = {
        "title": "Cat's Cradle",
        "author": "Kurt Vonnegut",
        "genre": "Science Fiction",
        "publicationYear": 1963,
        "reviews": [
            {
                "reviewer": "John Doe",
                "rating": 4.5,
                "content": "A thought-provoking satire on science and religion. Vonnegut's wit shines through.",
            },
            {
                "reviewer": "Jane Smith",
                "rating": 5,
                "content": "An imaginative and darkly humorous exploration of humanity's follies. A must-read!",
            },
            {
                "reviewer": "Alice Johnson",
                "rating": 3.5,
                "content": "Intriguing premise but felt a bit disjointed at times. Still enjoyable.",
            },
        ],
    }
    context.add_output_metadata(metadata={"num_reviews": len(data.get("reviews", []))})
    return data

Our goal is to generate a summary of these reviews in a cost-effective way through LLM routing.

So we'll define another asset, book_reviews_summary, which uses the NotDiamondResource for model routing and the OpenAIResource for completion. We invoke the model_select method from our Not Diamond resource, passing in our summarization prompt, the subset of models we want to consider, and the tradeoff="cost" parameter to optimize for cost savings.

This call returns a session_id along with a best_llm recommendation, whose model we can then pass to OpenAI in our usage of the chat.completions.create method.

@dg.asset(
    kinds={"openai", "notdiamond"}, automation_condition=dg.AutomationCondition.eager()
)
def book_reviews_summary(
    context: dg.AssetExecutionContext,
    notdiamond: nd.NotDiamondResource,
    openai: oai.OpenAIResource,
    book_review_data: dict,
) -> dg.MaterializeResult:
    prompt = f"""
    Given the book reviews for {book_review_data['title']}, provide a detailed summary:

    {'|'.join(r['content'] for r in book_review_data['reviews'])}
    """

    with notdiamond.get_client(context) as client:
        start = time.time()
        session_id, best_llm = client.model_select(
            model=["openai/gpt-4o", "openai/gpt-4o-mini"],
            tradeoff="cost",
            messages=[
                {"role": "system", "content": "You are an expert in literature"},
                {"role": "user", "content": prompt},
            ],
        )
        duration = time.time() - start

    with openai.get_client(context) as client:
        chat_completion = client.chat.completions.create(
            model=best_llm.model,
            messages=[{"role": "user", "content": prompt}],
        )

    summary = chat_completion.choices[0].message.content or ""

    return dg.MaterializeResult(
        metadata={
            "nd_session_id": session_id,
            "nd_best_llm_model": best_llm.model,
            "nd_best_llm_provider": best_llm.provider,
            "nd_routing_latency": duration,
            "summary": dg.MetadataValue.md(summary),
        }
    )

Finally, we return a MaterializeResult with the metadata from our call to both Not Diamond and OpenAI.
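For the assets to materialize, the two resources need to be wired up in a Definitions object. A minimal configuration sketch, assuming the API keys are supplied via environment variables (the variable names here are our choice, not prescribed by either library):

```python
import dagster as dg
import dagster_notdiamond as nd
import dagster_openai as oai

defs = dg.Definitions(
    assets=[book_review_data, book_reviews_summary],
    resources={
        # Resource keys must match the asset parameter names above.
        "notdiamond": nd.NotDiamondResource(api_key=dg.EnvVar("NOTDIAMOND_API_KEY")),
        "openai": oai.OpenAIResource(api_key=dg.EnvVar("OPENAI_API_KEY")),
    },
)
```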

Once we materialize these assets, we can see the summary of our book reviews, along with the other associated metadata:

  • session_id, which we can use in subsequent routing requests,
  • best_llm, as recommended by Not Diamond, and
  • routing_latency, the time (in seconds) taken to fulfill the routing request.

Dagster global asset lineage

Here we demonstrated how to perform model routing, but it's worth noting that Not Diamond also supports automatic routing through their model gateway! You can find an example of that in our recent deep dive presentation and in the community-integrations repository.

Routing prompts in complex workflows

You can expand this pipeline in various ways:

  • Try submitting dynamic prompts to Not Diamond from previous pipeline nodes or even your data,
  • Explore the model gateway to automatically route across LLM providers: OpenAI, Anthropic, Gemini, and more!
  • Combine both of the above with Dagster Pipes to build a fully-automated workflow which leverages generative AI for your custom business logic.

Conclusion

To try out this example or add Not Diamond’s state-of-the-art routing to your Dagster pipelines, sign up to Not Diamond to get your API key and read the docs to learn more about adding LLM routing to your AI workflows.

Have feedback or questions? Start a discussion in Slack or GitHub.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.
