Advanced Python features that AI agents may miss
Python is a joy to write. Part of the appeal is just how flexible it is. Thanks to its rich standard library, dynamic nature, and multiple paradigms, there are many ways to solve problems.
But flexibility can also hide some of the more interesting and powerful features of the language, especially if you’re primarily writing code with the help of AI agents. While AI agents can be incredibly helpful, they often default to safe, conventional solutions rather than code that really pushes the bounds of Python.
This doesn’t mean you should reach for the most advanced feature every time you write code. As the Zen of Python reminds us:
Simple is better than complex. Complex is better than complicated.
Still, it’s worth being aware of the deeper parts of the language. Many of Python’s advanced features exist for good reason: they power real-world, production-grade systems, and they can make your code more expressive, maintainable, or performant when used correctly.
We’ll highlight some of the advanced Python features we use ourselves across the Dagster codebases and explore how they might help you elevate your code.
Slots
One Python feature your agent is unlikely to use is `__slots__`. By default, all Python class instances store their attributes in a per-instance dictionary (`__dict__`). This is part of what makes Python so flexible: each instance can be modified at runtime by adding new attributes dynamically.
class weekday(object):
    def __init__(self, weekday, n=None):
        self.weekday = weekday
        self.n = n

my_weekday = weekday("Monday")
my_weekday.new_attribute = 1
This dynamic behavior is convenient, but it comes at a cost: every instance carries the overhead of maintaining a `__dict__`. If we know in advance that the set of attributes on an object will not change, we can use `__slots__`.
class weekday(object):
    __slots__ = ["weekday", "n"]

    def __init__(self, weekday, n=None):
        self.weekday = weekday
        self.n = n
This version of `weekday` behaves almost the same, but prevents dynamic attribute assignment.
my_weekday.new_attribute = 1
# AttributeError: 'weekday' object has no attribute 'new_attribute'
The tradeoff is worthwhile when you’re creating a large number of instances, as it significantly reduces memory usage by eliminating each instance's `__dict__`.
At Dagster, we use `__slots__` in performance-critical paths such as creating many immutable objects during background daemon processes. This can dramatically reduce memory usage and improve execution efficiency.
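You can get a rough sense of the saving yourself with `sys.getsizeof`. This is a sketch with two hypothetical classes (not from the Dagster codebase), and the exact byte counts vary by Python version and platform:

import sys

class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

# A plain instance pays for the object plus its per-instance __dict__;
# a slotted instance stores attributes in fixed slots instead.
print(sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))
print(sys.getsizeof(slotted))  # noticeably smaller

Multiply that per-instance difference by millions of objects and the savings become significant.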
And even though `__slots__` is a fairly old feature of Python, having been around since Python 2.2, it has continued to evolve with the language. Since Python 3.10, you can pass `slots=True` to the `dataclass` decorator to achieve the same memory savings.
from dataclasses import dataclass

@dataclass(slots=True)
class weekday:
    weekday: str
    n: int
lru_cache
Another effective way to improve performance is by using the `@lru_cache` decorator from the `functools` module. This decorator caches the return values of function calls, so repeated invocations with the same arguments can return instantly, without recomputing the result.
This is especially useful for expensive or computationally intensive functions where the output is deterministic and stable across calls.
For example, consider a method that performs introspection to determine whether a function accepts a context argument, a potentially expensive operation.
from typing import Any, NamedTuple
from functools import lru_cache

class DecoratedOpFunction(NamedTuple):
    @lru_cache(maxsize=1)
    def has_context_arg(self) -> bool:
        ...
In this case, `@lru_cache(maxsize=1)` is ideal because:
- The result of `has_context_arg` is invariant for the lifetime of the object.
- Caching the result avoids repeating a costly computation.
- `maxsize=1` is sufficient since the method is usually called on a single instance and doesn’t depend on arguments.
This simple addition can significantly reduce latency in performance-sensitive paths, especially when combined with immutable or repeatedly accessed data.
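To make the idea concrete, here is a self-contained sketch of the same technique as a free function. The names and signatures are illustrative, not Dagster’s actual API:

import inspect
from functools import lru_cache

@lru_cache(maxsize=None)
def accepts_context(fn) -> bool:
    # Signature introspection is relatively expensive; cache the answer
    # per function object so repeat lookups are effectively free.
    return "context" in inspect.signature(fn).parameters

def op_with_context(context, value):
    ...

def op_without_context(value):
    ...

print(accepts_context(op_with_context))     # True (computed once)
print(accepts_context(op_with_context))     # True (served from the cache)
print(accepts_context(op_without_context))  # False

One caveat: `lru_cache` keys its cache on the function’s arguments, so those arguments must be hashable. Plain function objects are, which is what makes this pattern work.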
Protocols
You’re probably familiar with inheritance in Python, but you may be less familiar with protocols, a special kind of static type that describes an interface by its structure rather than its ancestry.
Introduced in Python 3.8, `Protocol` is a feature that enables static structural subtyping, also known as "duck typing with types." Rather than relying on inheritance, it checks whether a class has the required methods and properties to satisfy an interface.
At Dagster, we aim to make our code fully compliant with static type checkers like pyright (though we’re also very excited about ty and other emerging tools). Using `Protocol` allows us to define type-safe interfaces without enforcing inheritance, which helps us keep our code flexible and clean.
We often use protocols to define minimal contracts for objects like GraphQL results. Here's an example.
from typing import Protocol, Mapping, Any, Optional, Sequence

class GqlResult(Protocol):
    @property
    def data(self) -> Mapping[str, Any]: ...

    @property
    def errors(self) -> Optional[Sequence[str]]: ...
This protocol doesn't define any implementation. Instead, it specifies that any object with these two properties (`data` and `errors`) is considered a `GqlResult`. No subclassing is required, and any class that implements the structure is valid.
You can now write functions that accept any object conforming to `GqlResult`, regardless of its actual class.
def _get_backfill_data(
    launch_backfill_result: GqlResult,
):
    ...
This approach:
- Enforces type correctness at compile time.
- Avoids rigid class hierarchies.
- Makes your code more testable and reusable.
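For instance, a hypothetical class like the one below satisfies `GqlResult` purely by its shape. It never imports or subclasses the protocol, yet a type checker accepts passing it to `_get_backfill_data`:

from typing import Any, Mapping, Optional, Sequence

class DictBackedResult:
    def __init__(self, payload: Mapping[str, Any]) -> None:
        self._payload = payload

    @property
    def data(self) -> Mapping[str, Any]:
        return self._payload

    @property
    def errors(self) -> Optional[Sequence[str]]:
        return None

# Structurally a GqlResult, so this type-checks with no inheritance at all.
_get_backfill_data(DictBackedResult({"launchPartitionBackfill": {}}))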
Generics
Python's type system is highly expressive, and one of its most powerful features for writing reusable, type-safe code is generics. Generics allow you to write classes and functions that can operate on a wide range of types while preserving type information for tools like pyright and IDEs.
Take this simplified example.
from typing import Generic, TypeVar, Optional, Mapping, Any

T = TypeVar("T")

class Output(Generic[T]):
    def __init__(
        self,
        value: T,
        *,
        tags: Optional[Mapping[str, str]] = None,
    ): ...
This defines `Output` as a generic class, where the type variable `T` represents the type of the `value` being wrapped. When using the class, callers specify what `T` should be:
Output[int]
Output[str]
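Here is a minimal self-contained sketch (assuming `Output` stores its argument on a `value` attribute, which the elided `__init__` above would do) showing how the type variable links inputs to outputs:

from typing import Generic, Mapping, Optional, TypeVar

T = TypeVar("T")

class Output(Generic[T]):
    def __init__(self, value: T, *, tags: Optional[Mapping[str, str]] = None):
        self.value = value
        self.tags = tags or {}

def unwrap(out: Output[T]) -> T:
    # The same T appears in the parameter and the return type, so pyright
    # knows unwrap(Output(42)) is an int -- no casts, no Any.
    return out.value

count: int = unwrap(Output(42))                              # Output[int]
name: str = unwrap(Output("done", tags={"stage": "final"}))  # Output[str]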
Enhanced Generators
You may already be familiar with generators and their benefits when building memory-efficient iterables. But that’s just scratching the surface. Generators are a powerful feature with many applications beyond producing values in a `for` loop.
In fact, one application of generators has its own name: “enhanced generators.” This pattern allows a generator to yield control back to its caller while retaining state, and then resume execution later, which is perfect for resource management scenarios.
In Dagster, we use this pattern to define resource lifecycle hooks for `ConfigurableResource` classes.
import dagster as dg
from typing import Generator
import contextlib

class MyResource(dg.ConfigurableResource):
    @contextlib.contextmanager
    def yield_for_execution(
        self, context: dg.InitResourceContext
    ) -> Generator["MyResource", None, None]:
        print("setup_for_execution")
        yield self
        print("teardown_after_execution")
Partial Functions
In large codebases like Dagster, there are many situations where you want to enforce consistent behavior across different parts of the system. A clean and effective way to do this in Python is by using partial functions.
Python's `functools.partial` allows you to create a new version of a function with one or more arguments pre-filled. This is a form of function specialization, and in broader software engineering, the general idea of transforming functions with multiple arguments into specialized ones is related to currying.
At Dagster, we use this approach to enforce consistent JSON behavior:
from functools import partial
from json import (
    dump as dump_,
    dumps as dumps_,
    load as load_,
    loads as loads_,
)

dump = partial(dump_, sort_keys=True)
dumps = partial(dumps_, sort_keys=True)
load = partial(load_, strict=False)
loads = partial(loads_, strict=False)
Specifically, we ensure that:
- JSON is serialized with sorted keys (`sort_keys=True`)
- JSON is parsed leniently (`strict=False`)
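A quick usage sketch of these partials (expected output shown in the comments):

print(dumps({"b": 2, "a": 1}))  # '{"a": 1, "b": 2}' -- keys always sorted
print(loads('{"tab": "\t"}'))   # strict=False tolerates control characters

Note that keyword arguments baked into a partial act as defaults: a call site can still pass `sort_keys=False` to override them.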
This could also be done using wrapper functions.
import json

def dumps(data):
    return json.dumps(data, sort_keys=True)
But `partial` offers a cleaner way to achieve the same result, with lower call overhead than a hand-written Python wrapper (partial objects avoid an extra Python-level function call) and no manual argument passing.
You should consider this approach when you want to create pre-configured versions of standard functions and enforce consistent defaults.
Improving your Python
These are just a few of the advanced Python features you’ll encounter in Dagster. While not every feature is used extensively (some appear only a handful of times across tens of thousands of lines of code), having a deeper understanding of what Python offers can be incredibly valuable. It not only prepares you to navigate and extend more advanced codebases, but also helps you push beyond the limits of the standard library when necessary.