What CoPilot Won’t Teach You About Python (Part 1)

July 23, 2025
Advanced Python features that AI agents may miss

Python is a joy to write. Part of the appeal is just how flexible it is. Thanks to its rich standard library, dynamic nature, and multiple paradigms, there are many ways to solve problems.

But flexibility can also hide some of the more interesting and powerful features of the language, especially if you’re primarily writing code with the help of AI agents. While AI agents can be incredibly helpful, they often default to safe, conventional solutions rather than code that really pushes the bounds of Python.

This doesn’t mean you should reach for the most advanced feature every time you write code. As the Zen of Python reminds us:

Simple is better than complex. Complex is better than complicated.

Still, it’s worth being aware of the deeper parts of the language. Many of Python’s advanced features exist for good reason: they power real-world, production-grade systems and can make your code more expressive, maintainable, or performant when used correctly.

We’ll highlight some of the advanced Python features we use ourselves across the Dagster codebases and explore how they might help you to elevate your code.

Slots

One Python feature your agent is unlikely to reach for is `__slots__`. By default, all Python class instances store their attributes in a per-instance dictionary (`__dict__`). This is part of what makes Python so flexible: each instance can be modified at runtime by adding new attributes dynamically.

class weekday(object):
   def __init__(self, weekday, n=None):
       self.weekday = weekday
       self.n = n


my_weekday = weekday("Monday")
my_weekday.new_attribute = 1

This dynamic behavior is convenient, but it comes at a cost: every instance carries the overhead of maintaining a `__dict__`. If we know in advance that the set of attributes on an object will not change, we can use `__slots__`.

class weekday(object):
   __slots__ = ["weekday", "n"]


   def __init__(self, weekday, n=None):
       self.weekday = weekday
       self.n = n

This version of `weekday` behaves almost the same, but prevents dynamic attribute assignment.

my_weekday = weekday("Monday")
my_weekday.new_attribute = 1
# AttributeError: 'weekday' object has no attribute 'new_attribute'

The tradeoff is worthwhile when you’re creating a large number of instances, as it significantly reduces memory usage by eliminating each instance's `__dict__`.

At Dagster, we use `__slots__` in performance-critical paths such as creating many immutable objects during background daemon processes. This can dramatically reduce memory usage and improve execution efficiency.
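You can see the savings for yourself with `tracemalloc` from the standard library. The class names below are illustrative, and exact byte counts vary by Python version, but the slotted class should consistently allocate less:

```python
import tracemalloc


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def peak_memory(cls, n=100_000):
    # Measure peak allocation while building n instances.
    tracemalloc.start()
    objs = [cls(i, i) for i in range(n)]
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


plain = peak_memory(PlainPoint)
slotted = peak_memory(SlottedPoint)
print(f"dict-based: {plain:,} bytes, slotted: {slotted:,} bytes")
```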

And even though `__slots__` is a fairly old feature of Python, having been around since Python 2.2, it has continued to evolve with the language. Since Python 3.10, you can pass `slots=True` to the `@dataclass` decorator to get the same memory savings.

from dataclasses import dataclass


@dataclass(slots=True)
class weekday:
   weekday: str
   n: int

lru_cache

Another effective way to improve performance is by using the `@lru_cache` decorator from the `functools` module. This decorator caches the return values of function calls, so repeated invocations with the same arguments can return instantly, without recomputing the result.

This is especially useful for expensive or computationally intensive functions where the output is deterministic and stable across calls.

For example, consider a method that performs introspection to determine whether a function accepts a context argument, a potentially expensive operation.

from typing import NamedTuple
from functools import lru_cache


class DecoratedOpFunction(NamedTuple):
    @lru_cache(maxsize=1)
    def has_context_arg(self) -> bool:
        ...

In this case, `@lru_cache(maxsize=1)` is ideal because:

  • The result of `has_context_arg` is invariant for the lifetime of the object.
  • Caching the result avoids repeating a costly computation.
  • `maxsize=1` is sufficient since the method is usually called on a single instance and takes no arguments other than `self`.

This simple addition can significantly reduce latency in performance-sensitive paths, especially when combined with immutable or repeatedly accessed data.
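The caching behavior is easy to observe with `cache_info()`, which every `lru_cache`-decorated function exposes. The function below is a stand-in for an expensive computation:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    # Stand-in for an expensive, deterministic computation.
    return n * n


slow_square(4)       # computed: a cache miss
slow_square(4)       # served from the cache: a hit
info = slow_square.cache_info()
print(info)  # hits=1, misses=1
```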

Protocols

You’re probably familiar with inheritance in Python, but you might not know about protocols, which offer an alternative way to define interfaces based on structure rather than ancestry.

Introduced in Python 3.8, `Protocol` is a feature that enables static structural subtyping, also known as "duck typing with types." Rather than relying on inheritance, it checks whether a class has the required methods and properties to satisfy an interface.

At Dagster, we aim to make our code fully compliant with static type checkers like pyright (though we’re also very excited about ty and other emerging tools). Using `Protocol` allows us to define type-safe interfaces without enforcing inheritance, which helps us keep our code flexible and clean.

We often use protocols to define minimal contracts for objects like GraphQL results. Here's an example.

from typing import Protocol, Mapping, Any, Optional, Sequence


class GqlResult(Protocol):
    @property
    def data(self) -> Mapping[str, Any]: ...

    @property
    def errors(self) -> Optional[Sequence[str]]: ...

This protocol doesn't define any implementation. Instead, it specifies that any object with these two properties (`data` and `errors`) is considered a `GqlResult`. No subclassing is required and any class that implements the structure is valid.

You can now write functions that accept any object conforming to `GqlResult`, regardless of its actual class.

def _get_backfill_data(
   launch_backfill_result: GqlResult,
):
   ...
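To see structural typing in action, a protocol can also be marked `@runtime_checkable`, making `isinstance` check for the required attributes. `FakeResult` below is a hypothetical class for demonstration; note it never inherits from the protocol:

```python
from typing import Any, Mapping, Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class GqlResult(Protocol):
    @property
    def data(self) -> Mapping[str, Any]: ...

    @property
    def errors(self) -> Optional[Sequence[str]]: ...


class FakeResult:
    # Hypothetical stand-in: no inheritance from GqlResult.
    @property
    def data(self) -> Mapping[str, Any]:
        return {"ok": True}

    @property
    def errors(self) -> Optional[Sequence[str]]:
        return None


print(isinstance(FakeResult(), GqlResult))  # True: it matches structurally
```

Static checkers like pyright apply the same structural check at type-check time, no `runtime_checkable` required.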

This approach:

  • Enforces type correctness at compile time.
  • Avoids rigid class hierarchies.
  • Makes your code more testable and reusable.

Generics

Python's type system is highly expressive, and one of its most powerful features for writing reusable, type-safe code is generics. Generics allow you to write classes and functions that can operate on a wide range of types while preserving type information for tools like pyright and IDEs.

Take this simplified example.

from typing import Generic, TypeVar, Optional, Mapping, Any


T = TypeVar("T")


class Output(Generic[T]):
   def __init__(
       self,
       value: T,
       *,
       tags: Optional[Mapping[str, str]] = None,
   ): ...

This defines `Output` as a generic class, where the type variable `T` represents the type of the `value` being wrapped. When using the class, callers specify what `T` should be:

Output[int]
Output[str]
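The same type variable works for functions too. Here is a minimal, self-contained sketch (the simplified `Output` and the `first` helper are illustrative, not Dagster APIs): the checker knows `Output[int](42).value` is an `int` and `first(["a", "b"])` is an `Optional[str]`, while at runtime everything behaves like ordinary Python:

```python
from typing import Generic, Optional, Sequence, TypeVar

T = TypeVar("T")


class Output(Generic[T]):
    # Simplified illustrative version of a generic wrapper.
    def __init__(self, value: T):
        self.value = value


def first(items: Sequence[T]) -> Optional[T]:
    # Returns the first element, preserving its type for the checker.
    return items[0] if items else None


out = Output[int](42)
print(out.value, first(["a", "b"]))  # 42 a
```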

Enhanced Generators

You may already be familiar with generators and their benefits when building memory-efficient iterables. But that’s just scratching the surface. Generators are a powerful feature with many applications beyond producing values in a `for` loop.

In fact, one application of generators has its own name: “enhanced generators.” This pattern allows a generator to yield control back to its caller while retaining state, then resume execution later, which makes it perfect for resource management scenarios.

In Dagster, we use this pattern to define resource lifecycle hooks for `ConfigurableResource` classes.

import dagster as dg
from typing import Generator
import contextlib


class MyResource(dg.ConfigurableResource):
   @contextlib.contextmanager
   def yield_for_execution(
       self, context: dg.InitResourceContext
   ) -> Generator["MyResource", None, None]:
       print("setup_for_execution")
       yield self
       print("teardown_after_execution")

Partial Functions

In large codebases like Dagster, there are many situations where you want to enforce consistent behavior across different parts of the system. A clean and effective way to do this in Python is by using partial functions.

Python's `functools.partial` allows you to create a new version of a function with one or more arguments pre-filled. This is a form of function specialization, and in broader software engineering, the general idea of transforming functions with multiple arguments into specialized ones is related to (though distinct from) currying.

At Dagster, we use this approach to enforce consistent JSON behavior:

from functools import partial
from json import (
   dump as dump_,
   dumps as dumps_,
   load as load_,
   loads as loads_,
)


dump = partial(dump_, sort_keys=True)
dumps = partial(dumps_, sort_keys=True)
load = partial(load_, strict=False)
loads = partial(loads_, strict=False)

Specifically, we ensure that:

  • JSON is serialized with sorted keys (sort_keys=True)
  • JSON is parsed leniently (strict=False)
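The effect of those pre-filled defaults is easy to verify. `sort_keys=True` makes key order deterministic regardless of insertion order, and `strict=False` lets the decoder accept control characters (like a literal tab) inside strings that strict mode would reject:

```python
import json
from functools import partial

dumps = partial(json.dumps, sort_keys=True)
loads = partial(json.loads, strict=False)

# Keys come out sorted regardless of insertion order.
print(dumps({"b": 2, "a": 1}))  # {"a": 1, "b": 2}

# strict=False permits a raw tab character inside the string value.
print(loads('{"msg": "a\tb"}'))
```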

This could also be done using a wrapper function.

import json


def dumps(data):
   return json.dumps(data, sort_keys=True)

But `partial` offers a cleaner way to achieve the same result, with less overhead and no manual argument forwarding.

You should consider this approach when you want to create pre-configured versions of standard functions and enforce consistent defaults.

Improving your Python

These are just a few of the advanced Python features you’ll encounter in Dagster. While not every feature is used extensively (some appear only a handful of times across tens of thousands of lines of code), having a deeper understanding of what Python offers can be incredibly valuable. It not only prepares you to navigate and extend more advanced codebases, but also helps you push beyond the limits of the standard library when necessary.

Have feedback or questions? Start a discussion in Slack or GitHub.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.
