Blog

What CoPilot Won’t Teach You About Python (Part 1)

July 23, 2025

Advanced Python features that AI agents may miss

Python is a joy to write. Part of the appeal is just how flexible it is. Thanks to its rich standard library, dynamic nature, and multiple paradigms, there are many ways to solve problems.

But flexibility can also hide some of the more interesting and powerful features of the language, especially if you’re primarily writing code with the help of AI agents. While AI agents can be incredibly helpful, they often default to safe, conventional solutions rather than code that really pushes the bounds of Python.

This doesn’t mean you should reach for the most advanced feature every time you write code. As the Zen of Python reminds us:

Simple is better than complex. Complex is better than complicated.

Still, it’s worth being aware of the deeper parts of the language. Many of Python’s advanced features exist for good reason: they power real-world, production-grade systems and can make your code more expressive, maintainable, or performant when used correctly.

We’ll highlight some of the advanced Python features we use ourselves across the Dagster codebases and explore how they might help you elevate your own code.

Slots

One Python feature your agent is unlikely to use is `__slots__`. By default, all Python class instances store their attributes in a per-instance dictionary (`__dict__`). This is part of what makes Python so flexible: each instance can be modified at runtime by adding new attributes dynamically.

class weekday(object):
    def __init__(self, weekday, n=None):
        self.weekday = weekday
        self.n = n


my_weekday = weekday("Monday")
my_weekday.new_attribute = 1  # allowed: stored in the instance's __dict__

This dynamic behavior is convenient, but it comes at a cost: every instance carries the overhead of maintaining a `__dict__`. If we know in advance that the set of attributes on an object will not change, we can use `__slots__`.

class weekday(object):
    # Fixing the attribute set up front removes the per-instance __dict__.
    __slots__ = ["weekday", "n"]

    def __init__(self, weekday, n=None):
        self.weekday = weekday
        self.n = n

This version of `weekday` behaves almost the same, but prevents dynamic attribute assignment.

my_weekday = weekday("Monday")
my_weekday.new_attribute = 1
# AttributeError: 'weekday' object has no attribute 'new_attribute'

The tradeoff is worthwhile when you’re creating a large number of instances, as it significantly reduces memory usage by eliminating each instance's `__dict__`.
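If you want to see the difference yourself, here is a rough, illustrative comparison using `sys.getsizeof` (the exact byte counts vary by CPython version and platform, so treat the numbers as ballpark):

import sys


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

# The plain instance pays for the object itself plus its per-instance __dict__.
print(sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))  # e.g. ~150 bytes
print(sys.getsizeof(slotted))                                # e.g. ~50 bytes

Multiply that gap by millions of instances and the savings become significant.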

At Dagster, we use `__slots__` in performance-critical paths such as creating many immutable objects during background daemon processes. This can dramatically reduce memory usage and improve execution efficiency.

And even though `__slots__` is a fairly old feature of Python, having been around since Python 2.2, it has continued to evolve with the language. Since Python 3.10, dataclasses can generate `__slots__` for you via the `slots=True` parameter, giving you the same memory savings.

from dataclasses import dataclass
from typing import Optional


@dataclass(slots=True)
class weekday:
    weekday: str
    n: Optional[int] = None

lru_cache

Another effective way to improve performance is by using the `@lru_cache` decorator from the `functools` module. This decorator caches the return values of function calls, so repeated invocations with the same arguments can return instantly, without recomputing the result.

This is especially useful for expensive or computationally intensive functions where the output is deterministic and stable across calls.

For example, consider a method that performs introspection to determine whether a function accepts a context argument, a potentially expensive operation.

from functools import lru_cache
from typing import NamedTuple


class DecoratedOpFunction(NamedTuple):
    # NamedTuple instances are hashable, so they work as lru_cache keys.
    @lru_cache(maxsize=1)
    def has_context_arg(self) -> bool:
        ...

In this case, `@lru_cache(maxsize=1)` is ideal because:

  • The result of `has_context_arg` is invariant for the lifetime of the object.
  • Caching the result avoids repeating a costly computation.
  • `maxsize=1` is sufficient since the method is usually called on a single instance and doesn’t depend on arguments.

This simple addition can significantly reduce latency in performance-sensitive paths, especially when combined with immutable or repeatedly accessed data.
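For a self-contained illustration (a toy example, not taken from the Dagster codebase), here is a function whose second call with the same argument returns from the cache instead of recomputing:

import time
from functools import lru_cache


@lru_cache(maxsize=128)
def slow_square(n: int) -> int:
    time.sleep(0.5)  # stand-in for an expensive, deterministic computation
    return n * n


start = time.perf_counter()
slow_square(12)  # first call pays the full cost
print(f"first call:  {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
slow_square(12)  # same argument, so the result comes straight from the cache
print(f"second call: {time.perf_counter() - start:.6f}s")

print(slow_square.cache_info())  # hits=1, misses=1, currsize=1

The `cache_info()` method is also handy in tests for verifying that caching actually kicks in.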

Protocols

You’re probably familiar with inheritance in Python, but you may be less familiar with protocols, a special kind of static type that doesn’t rely on inheritance at all.

Introduced in Python 3.8, `Protocol` is a feature that enables static structural subtyping, also known as "duck typing with types." Rather than relying on inheritance, it checks whether a class has the required methods and properties to satisfy an interface.

At Dagster, we aim to make our code fully compliant with static type checkers like pyright (though we’re also very excited about ty and other emerging tools). Using `Protocol` allows us to define type-safe interfaces without enforcing inheritance, which helps us keep our code flexible and clean.

We often use protocols to define minimal contracts for objects like GraphQL results. Here's an example.

from typing import Protocol, Mapping, Any, Optional, Sequence


class GqlResult(Protocol):
    @property
    def data(self) -> Mapping[str, Any]: ...

    @property
    def errors(self) -> Optional[Sequence[str]]: ...

This protocol doesn't define any implementation. Instead, it specifies that any object with these two properties (`data` and `errors`) is considered a `GqlResult`. No subclassing is required, and any class that implements the structure is valid.

You can now write functions that accept any object conforming to `GqlResult`, regardless of its actual class.

def _get_backfill_data(
   launch_backfill_result: GqlResult,
):
   ...

This approach:

  • Enforces type correctness at compile time.
  • Avoids rigid class hierarchies.
  • Makes your code more testable and reusable.
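To make the structural part concrete, here is a hypothetical class (not from the Dagster codebase) that satisfies `GqlResult` without ever naming it, reusing `_get_backfill_data` from above:

from typing import Any, Mapping, Optional, Sequence


class InMemoryResult:  # note: GqlResult does not appear in the bases
    def __init__(self, data: Mapping[str, Any]):
        self._data = data

    @property
    def data(self) -> Mapping[str, Any]:
        return self._data

    @property
    def errors(self) -> Optional[Sequence[str]]:
        return None


# Structurally compatible, so pyright accepts this call without complaint.
_get_backfill_data(InMemoryResult({"launchBackfill": {"id": "abc"}}))

This is especially convenient in tests, where a lightweight stand-in like this can replace a real GraphQL client response.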

Generics

Python's type system is highly expressive, and one of its most powerful features for writing reusable, type-safe code is generics. Generics allow you to write classes and functions that can operate on a wide range of types while preserving type information for tools like pyright and IDEs.

Take this simplified example.

from typing import Generic, TypeVar, Optional, Mapping


T = TypeVar("T")


class Output(Generic[T]):
    def __init__(
        self,
        value: T,
        *,
        tags: Optional[Mapping[str, str]] = None,
    ): ...

This defines `Output` as a generic class, where the type variable `T` represents the type of the `value` being wrapped. When using the class, callers specify what `T` should be:

Output[int]
Output[str]
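The payoff is that type information flows through your code. As a hypothetical illustration (assuming `Output` stores its argument on a `value` attribute, which the elided `__init__` body above would do), a checker like pyright tracks `T` across function boundaries:

def unwrap(output: Output[T]) -> T:
    # Returns the wrapped value with its static type intact.
    # Assumes the elided __init__ above assigns self.value = value.
    return output.value


int_output = Output(5, tags={"source": "sensor"})
value = unwrap(int_output)  # pyright infers `value: int`, not `Any`

Without the generic parameter, `unwrap` could only promise `Any`, and every caller would lose type safety at that boundary.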

Enhanced Generators

You may already be familiar with generators and their benefits when building memory-efficient iterables. But that’s just scratching the surface. Generators are a powerful feature with many applications beyond producing values in a `for` loop.

In fact, one application of generators has its own name: “enhanced generators.” This pattern allows a generator to yield control back to its caller while retaining state, then resume execution later, which makes it perfect for resource management scenarios.

In Dagster, we use this pattern to define resource lifecycle hooks for `ConfigurableResource` classes.

import contextlib
from typing import Generator

import dagster as dg


class MyResource(dg.ConfigurableResource):
    @contextlib.contextmanager
    def yield_for_execution(
        self, context: dg.InitResourceContext
    ) -> Generator["MyResource", None, None]:
        print("setup_for_execution")       # runs before the resource is handed out
        yield self                         # execution proceeds while the generator is paused here
        print("teardown_after_execution")  # resumes once execution completes

Partial Functions

In large codebases like Dagster, there are many situations where you want to enforce consistent behavior across different parts of the system. A clean and effective way to do this in Python is by using partial functions.

Python's `functools.partial` allows you to create a new version of a function with one or more arguments pre-filled. This is a form of function specialization, and in broader software engineering, the general idea of transforming functions with multiple arguments into specialized ones is related to currying.

At Dagster, we use this approach to enforce consistent JSON behavior:

from functools import partial
from json import (
    dump as dump_,
    dumps as dumps_,
    load as load_,
    loads as loads_,
)

# Re-export the json functions with our preferred defaults baked in.
dump = partial(dump_, sort_keys=True)
dumps = partial(dumps_, sort_keys=True)
load = partial(load_, strict=False)
loads = partial(loads_, strict=False)

Specifically, we ensure that:

  • JSON is serialized with sorted keys (sort_keys=True)
  • JSON is parsed leniently (strict=False)
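A quick check of what those defaults buy you (illustrative usage of the partials defined above):

payload = {"b": 2, "a": 1}
print(dumps(payload))  # '{"a": 1, "b": 2}' -- keys always come out sorted

raw = '{"s": "first\tsecond"}'  # a literal tab inside the string value
print(loads(raw))  # strict=False tolerates the embedded control character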

This could also be done using wrapper functions.

import json


def dumps(data):
    return json.dumps(data, sort_keys=True)

But `partial` offers a cleaner and more efficient way to achieve the same result, with less call overhead than a Python-level wrapper and no manual argument forwarding.

You should consider this approach when you want to create pre-configured versions of standard functions and enforce consistent defaults.

Improving your Python

These are just a few of the advanced Python features you’ll encounter in Dagster. While not every feature is used extensively (some appear only a handful of times across tens of thousands of lines of code), having a deeper understanding of what Python offers can be incredibly valuable. It not only prepares you to navigate and extend more advanced codebases, but also helps you push beyond the limits of the standard library when necessary.

Have feedback or questions? Start a discussion in Slack or GitHub.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.
