DuckDB + Polars | Dagster Integrations
Dagster + DuckDB + Polars

Load DuckDB tables as Polars DataFrames and store Polars DataFrames as DuckDB tables

About this integration

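This integration connects Dagster to the DuckDB database and the Polars data processing library. Its I/O manager stores the Polars DataFrames returned by assets as DuckDB tables and loads those tables back as Polars DataFrames for downstream assets.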

Installation

pip install dagster-duckdb-polars

Example

import polars as pl
from dagster import Definitions, asset
from dagster_duckdb_polars import DuckDBPolarsIOManager

@asset(
    key_prefix=["my_schema"]  # will be used as the schema in DuckDB
)
def my_table() -> pl.DataFrame:  # the name of the asset will be the table name
    ...

defs = Definitions(
    assets=[my_table],
    resources={"io_manager": DuckDBPolarsIOManager(database="my_db.duckdb")}
)

About DuckDB and Polars

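DuckDB is a column-oriented, embeddable OLAP database. A typical OLTP relational database such as SQLite is row-oriented: data is organised physically as consecutive tuples (rows). In a column-oriented database, the values of each column are stored together, which suits analytical queries that scan a few columns across many rows.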
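
The difference between the two layouts can be sketched in plain Python. This is a conceptual illustration only, not how DuckDB or SQLite store data internally; the sample records and function names are invented.

```python
# Row-oriented layout: each record is stored as one consecutive tuple.
rows = [
    (1, "alice", 30.0),
    (2, "bob", 45.5),
    (3, "carol", 12.25),
]

# Column-oriented layout: all values of one column are stored together.
columns = {
    "id": [1, 2, 3],
    "name": ["alice", "bob", "carol"],
    "amount": [30.0, 45.5, 12.25],
}


def total_from_rows(records):
    # Every whole tuple is visited even though only one field is needed.
    return sum(record[2] for record in records)


def total_from_columns(cols):
    # Only the single column that the query needs is scanned.
    return sum(cols["amount"])


print(total_from_rows(rows), total_from_columns(columns))
```

Both functions compute the same total; the column layout simply avoids touching the id and name values, which is the access pattern analytical (OLAP) queries tend to favour.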

Polars is a fast DataFrame library and in-memory query engine. Its parallel execution, cache-efficient algorithms, and expressive API make it well suited to data wrangling, data pipelines, and low-latency APIs.