Blog
New Dagster Integration: Include OpenAI Calls Into Your Data Pipelines

New Dagster Integration: Include OpenAI Calls Into Your Data Pipelines

March 11, 2024
New Dagster Integration: Include OpenAI Calls Into Your Data Pipelines
New Dagster Integration: Include OpenAI Calls Into Your Data Pipelines

The new dagster-openai integration lets you tap into the power of LLMs in a cost-efficient way.

We are pleased to announce a new integration that will allow data practitioners to easily include OpenAI API calls as part of their data pipelines.  Besides generating AI responses, the integration provides insights that let you optimize your API calls and credit consumption.

Building Generative AI Steps into Your Data Pipelines

There are many potential uses of an OpenAI API call in a data pipeline.  Allow me to quote an expert on the matter: ChatGPT:

OpenAI's API, including services powered by GPT (like ChatGPT), Codex, and DALL·E, has found a wide range of applications across various industries. Below are some of the most popular use cases adopted by large corporations:

Top Use Cases of OpenAI's API:

  1. Customer Service Automation
    – Deploying AI chatbots to handle customer queries, provide 24/7 support, and reduce response times.
  2. Content Generation
    – Writing articles, social media posts, product descriptions, and marketing copy at scale.
  3. Data Analysis and Insights
    – Processing large datasets, extracting key insights, and generating summaries or reports.
  4. Language Translation and Localization
    – Translating content and adapting it to different languages and cultural contexts.
  5. Personalized Recommendations
    – Suggesting products, content, or services based on user preferences and behavior.
  6. Educational Tools and Tutoring
    – Assisting learners with explanations, practice problems, and personalized study help.
  7. Document Automation and Summarization
    – Automating repetitive document tasks and summarizing long-form content quickly.
  8. Creative Design and Art
    – Using tools like DALL·E to generate images, design concepts, and visual assets.
  9. Financial Analysis and Forecasting
    – Assisting with interpreting financial data, generating forecasts, and risk assessment.

While we can't yet picture all of the possible use cases of a generative AI step in a data pipeline, here are some scenarios that seem valuable:

  • Submit a large document to OpenAI's API and request a summary of the document.
  • Submit a customer testimonial and request a standardized classification for sentiment analysis
  • Submit foreign language text and request a local translation

Here at Dagster Labs, we've used dagster-openai to summarize the category and generate learning summaries from GitHub issues and discussions.  Our pipeline handles complex support requests. It provides a first-stab answer to user questions (speeding up our support team's response times), auto-categorizes the issue, and generates learning summaries on a weekly basis.

Keeping Your Costs Under Control

While generative AI offers a broad spectrum of capabilities, managing costs is essential. Dagster Labs is committed to providing the necessary tools to build your pipeline for optimal cost efficiency and performance. To this end, we introduce the OpenAIResource alongside the with_usage_metadata function from our library, ensuring uniform resource utilization across our platform.

Both Dagster Cloud and open-source users can take advantage of these features to monitor and optimize their data pipelines. For Dagster Cloud users, this functionality is seamlessly integrated with Dagster Insights, providing an enhanced experience with additional analytical capabilities. Meanwhile, open-source users can also leverage these tools and log their metadata, which they can then visualize as a metadata plot directly in the UI.  

This unified approach ensures all users can effectively control their costs while maximizing the benefits of generative AI.

A screenshot of the Dagster UI showing the OpenAI API consumption graph.
   A screenshot of the Dagster Cloud "Insights" UI showing the OpenAI API consumption graph.  

           Explore the guide here        

Have feedback or questions? Start a discussion in Slack or Github.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.

Dagster Newsletter

Get updates delivered to your inbox

Latest writings

The latest news, technologies, and resources from our team.

Multi-Tenancy for Modern Data Platforms
Webinar

April 7, 2026

Multi-Tenancy for Modern Data Platforms

Learn the patterns, trade-offs, and production-tested strategies for building multi-tenant data platforms with Dagster.

Deep Dive: Building a Cross-Workspace Control Plane for Databricks
Webinar

March 24, 2026

Deep Dive: Building a Cross-Workspace Control Plane for Databricks

Learn how to build a cross-workspace control plane for Databricks using Dagster — connecting multiple workspaces, dbt, and Fivetran into a single observable asset graph with zero code changes to get started.

Dagster Running Dagster: How We Use Compass for AI Analytics
Webinar

February 17, 2026

Dagster Running Dagster: How We Use Compass for AI Analytics

In this Deep Dive, we're joined by Dagster Analytics Lead Anil Maharjan, who demonstrates how our internal team utilizes Compass to drive AI-driven analysis throughout the company.

Monorepos, the hub-and-spoke model, and Copybara
Monorepos, the hub-and-spoke model, and Copybara
Blog

April 3, 2026

Monorepos, the hub-and-spoke model, and Copybara

How we configure Copybara for bi-directional syncing to enable a hub-and-spoke model for Git repositories

Making Dagster Easier to Contribute to in an AI-Driven World
Making Dagster Easier to Contribute to in an AI-Driven World
Blog

April 1, 2026

Making Dagster Easier to Contribute to in an AI-Driven World

AI has made contributing to open source easier but reviewing contributions is still hard. At Dagster, we’re improving the contributor experience with smarter review tooling, clearer guidelines, and a focus on contributions that are easier to evaluate, merge, and maintain.

DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform
DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform
Blog

March 17, 2026

DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform

DataOps is about building a system that provides visibility into what's happening and control over how it behaves

How Magenta Telekom Built the Unsinkable Data Platform
Case study

February 25, 2026

How Magenta Telekom Built the Unsinkable Data Platform

Magenta Telekom rebuilt its data infrastructure from the ground up with Dagster, cutting developer onboarding from months to a single day and eliminating the shadow IT and manual workflows that had long slowed the business down.

Scaling FinTech: How smava achieved zero downtime with Dagster
Case study

November 25, 2025

Scaling FinTech: How smava achieved zero downtime with Dagster

smava achieved zero downtime and automated the generation of over 1,000 dbt models by migrating to Dagster's, eliminating maintenance overhead and reducing developer onboarding from weeks to 15 minutes.

Zero Incidents, Maximum Velocity: How HIVED achieved 99.9% pipeline reliability with Dagster
Case study

November 18, 2025

Zero Incidents, Maximum Velocity: How HIVED achieved 99.9% pipeline reliability with Dagster

UK logistics company HIVED achieved 99.9% pipeline reliability with zero data incidents over three years by replacing cron-based workflows with Dagster's unified orchestration platform.

Modernize Your Data Platform for the Age of AI
Guide

January 15, 2026

Modernize Your Data Platform for the Age of AI

While 75% of enterprises experiment with AI, traditional data platforms are becoming the biggest bottleneck. Learn how to build a unified control plane that enables AI-driven development, reduces pipeline failures, and cuts complexity.

Download the eBook on how to scale data teams
Guide

November 5, 2025

Download the eBook on how to scale data teams

From a solo data practitioner to an enterprise-wide platform, learn how to build systems that scale with clarity, reliability, and confidence.

Download the e-book primer on how to build data platforms
Guide

February 21, 2025

Download the e-book primer on how to build data platforms

Learn the fundamental concepts to build a data platform in your organization; covering common design patterns for data ingestion and transformation, data modeling strategies, and data quality tips.