Untangling Python Packages Part 2

August 7, 2025
A deep dive into how Dagster leverages pyproject.toml for modern Python packaging, from project metadata and dependencies to build systems and development tooling.

In the previous blog post, we explored the role of `pyproject.toml` in Python packaging and why it serves as the primary interface for modern Python projects. Now, we’ll take a closer look at the `pyproject.toml` generated by `uvx create-dagster` when scaffolding a new Dagster project. We’ll walk through our design decisions and show how we use this file to streamline Python development.

As a quick refresher, `pyproject.toml` defines the build system for your package. It provides a universal interface that enables a range of Python ecosystem tools to work together seamlessly. By examining its various sections, you’ll gain insight into how to configure your Dagster project for your specific needs.

Project Metadata and Dependencies

[project]
name = "my_project"
requires-python = ">=3.9,<3.13"
version = "0.1.0"
dependencies = [
    "dagster==1.11.3",
]

This core section, defined by PEP 621, contains key metadata about your project. It includes essentials like the project’s name, version, and the Python versions it supports, as well as the core dependencies your project requires.

For a Dagster project, it makes sense to include `dagster` here. You can also list additional dependencies, such as Dagster-specific libraries (e.g. `dagster-duckdb` ) or any libraries your code needs (e.g. `pandas` ).

dependencies = [
    "dagster==1.11.3",
    "dagster-duckdb",
    "pandas<=2.3",
]

You can choose whether to pin specific dependency versions. While pinning is generally a good practice for reproducibility, it’s not strictly required within `pyproject.toml`. If you’re using a package manager such as uv, you can resolve and lock dependencies by running:

uv lock

This generates a `uv.lock` file with the exact versions in use. And while we prefer `uv`, the same setup works with `pip` as well.

Named Dependencies

[dependency-groups]
dev = [
    "dagster-webserver",
    "dagster-dg-cli[local]",
]

Not all dependencies need to be listed in the main `dependencies` section. PEP 735 introduces a standardized way to define named dependency groups: collections of packages that aren’t part of the built distribution, mapped to specific keys.

Before `pyproject.toml`, Python projects often used separate `requirements.txt` files for different dependency sets, such as `requirements-dev.txt` for development. With `dependency-groups`, you can keep all dependencies (core and optional) in one place, making it easier to resolve and understand your project’s complete dependency map.

In a newly scaffolded Dagster project, you’ll find two `dev` dependencies by default:

  • `dagster-webserver`  – Runs the Dagster UI so you can explore and test pipelines interactively.
  • `dagster-dg-cli` – Provides the full `dg` CLI, making it easy to scaffold, navigate, and manage your project.

These tools are invaluable during development but unnecessary when uploading code locations for production use.

Named dependencies are also a good place to put other development tools like linters and test frameworks:

[dependency-groups]
dev = [
    "dagster-webserver",
    "dagster-dg-cli[local]",
]
testing = [
    "pytest",
    "ruff",
]

By default, `uv sync` installs your core dependencies along with the `dev` dependency group. To install an additional group, use the `--group` flag:

uv sync --group testing

To skip the `dev` group, use:

uv sync --no-dev

Build System

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

In the previous post, we covered the difference between a build system and a build backend. Separating these two concepts allows you to change the backend while keeping the interface consistent.

By default, Dagster projects use Hatchling (`hatchling.build`) as the backend. Hatchling is a modern alternative to older tools like `setuptools`, offering a faster and more streamlined build process. That said, `pyproject.toml` supports many possible backends, so you can choose one that best fits your workflow.

Build Backend

Build Backend   'requires'                  'build-backend'
setuptools      ["setuptools", "wheel"]     setuptools.build_meta
Flit            ["flit_core"]               flit_core.buildapi
Poetry          ["poetry-core"]             poetry.core.masonry.api
PDM             ["pdm-backend"]             pdm.backend
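Because the build interface stays the same, swapping backends is a one-block change. For example, moving the scaffolded project from Hatchling to Flit might look like this (a sketch; each backend has its own expectations about project layout, so check its docs before switching):

```toml
[build-system]
requires = ["flit_core"]
build-backend = "flit_core.buildapi"
```

Frontends such as `uv build` or `python -m build` will pick up whichever backend this table declares.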

Given our fondness for `uv`, we may adopt the uv build backend in the future.

Tools

[tool.dg]
directory_type = "project"

[tool.dg.project]
root_module = "my_project"
registry_modules = [
    "my_project.components.*",
]

The final section of the generated `pyproject.toml` configures `dg` in the `[tool.dg]` and `[tool.dg.project]` blocks. These settings define how your Dagster project is structured so that `dg` commands work correctly. Specifically, `dg` needs to know:

  • `root_module` – The top-level Python package for your project.
  • `registry_modules` – The locations where components (e.g., assets, jobs, sensors) should be registered.

If you’re creating a new Dagster project from scratch, you usually don’t need to change this section. However, it can be useful when migrating an existing project into the dg structure, as you may need to update the module paths to match your project’s layout.
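For instance, when migrating a project whose code already lives under a different top-level package, the same block might be adapted like this (the module names below are hypothetical placeholders for your own layout):

```toml
[tool.dg]
directory_type = "project"

[tool.dg.project]
# Point dg at the existing top-level package instead of the
# scaffolded default; names here are illustrative.
root_module = "acme_pipelines"
registry_modules = [
    "acme_pipelines.components.*",
]
```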

While this is the only [tool.*] section Dagster sets up during scaffolding, you can add more for other tools. For example, if you use ruff (another Dagster favorite) for linting and formatting, you can configure project-specific rules directly in a [tool.ruff] block.

[tool.ruff]
line-length = 100

Ruff also supports its own configuration file (`ruff.toml`), but defining these rules in `pyproject.toml`, alongside the configuration for your other tools, keeps your project setup in one place and easier to read.

Overview

Hopefully this walkthrough clarifies how the various pieces of `pyproject.toml` and related tools come together to form a Dagster project. Revisiting the layers of packaging we discussed in the previous post, each layer now has a clear home in the project’s configuration.

While this isn’t the only way to configure a Dagster project, the pattern we’ve covered is a solid starting point. You can swap out individual sections, change tooling, or add new configuration blocks as your needs evolve.

Understanding how these parts work together not only helps you work more effectively with Dagster, it also deepens your grasp of modern Python packaging as a whole.

Have feedback or questions? Start a discussion in Slack or GitHub.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.
