About this integration
This integration lets Dagster assets store Polars DataFrames in, and load them from, a DuckDB database.
Installation
pip install dagster-duckdb-polars
Example
import polars as pl
from dagster import Definitions, asset
from dagster_duckdb_polars import DuckDBPolarsIOManager

@asset(
    key_prefix=["my_schema"]  # will be used as the schema in DuckDB
)
def my_table() -> pl.DataFrame:  # the name of the asset will be the table name
    ...

defs = Definitions(
    assets=[my_table],
    resources={"io_manager": DuckDBPolarsIOManager(database="my_db.duckdb")},
)
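The comments in the example describe a naming convention: the key prefix becomes the DuckDB schema and the asset name becomes the table. The sketch below spells that convention out in plain Python. The function `qualified_table_name` is hypothetical and is not part of `dagster-duckdb-polars`; using the last prefix component as the schema when the prefix has several parts is an assumption made for illustration.

```python
def qualified_table_name(key_prefix: list[str], asset_name: str) -> str:
    """Return 'schema.table' for an asset, following the convention in the example.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    if not key_prefix:
        # The example always supplies a prefix, so this sketch requires one.
        raise ValueError("this sketch expects a key_prefix to use as the schema")
    schema = key_prefix[-1]  # assumption: the last prefix component names the schema
    return f"{schema}.{asset_name}"


print(qualified_table_name(["my_schema"], "my_table"))  # my_schema.my_table
```

Under this convention, the asset in the example would be written to the table `my_schema.my_table` inside `my_db.duckdb`.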
About DuckDB and Polars
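DuckDB is a column-oriented, embeddable OLAP database. A typical OLTP relational database like SQLite is row-oriented. In a row-oriented database, data is organised physically as consecutive tuples. A column-oriented database instead stores the values of each column together, which suits analytical queries that read a few columns across many rows.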
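To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is a toy model only: the sample records and variable names are invented for illustration, and it does not reflect how DuckDB or SQLite actually store data on disk.

```python
# Three records with the same fields: (id, name, amount).
rows = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "carol", 7.5),
]

# Row-oriented layout: the values of each record sit next to each other.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: the values of each column sit next to each other.
ids, names, amounts = (list(column) for column in zip(*rows))
column_layout = ids + names + amounts

# Summing one column reads a contiguous run of values in the column layout,
# but has to skip over the other fields in the row layout.
total_from_columns = sum(amounts)
total_from_rows = sum(row_layout[2::3])

print(row_layout)
print(column_layout)
print(total_from_columns, total_from_rows)
```

Both totals are the same; what changes is how much unrelated data sits between the values being read, which is why analytical workloads tend to favour the column layout.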
Polars is a fast DataFrame library and in-memory query engine. Its parallel execution, cache-efficient algorithms, and expressive API make it well suited to data wrangling, data pipelines, and low-latency APIs.