Get the right tools for running Dagster with ADLS2 and Blob Storage
Get utilities for ADLS2 and Blob Storage.
About this integration
Dagster helps you use Azure Storage Accounts as part of your data pipeline. Azure Data Lake Storage Gen2 (ADLS2) is the primary focus of this integration, but it also provides utilities for Azure Blob Storage.
Installation
pip install dagster-azure
Examples
# Store your software-defined assets in ADLS2
import pandas as pd
from dagster import asset, repository, with_resources
from dagster_azure.adls2 import adls2_pickle_io_manager, adls2_resource


@asset
def asset1():
    return pd.DataFrame()


@asset
def asset2(asset1):
    # asset1 is the upstream DataFrame, loaded from ADLS2 by the IO manager
    return asset1[:5]


@repository
def repo():
    return with_resources(
        [asset1, asset2],
        resource_defs={
            # The IO manager pickles each asset into the given file system and prefix
            "io_manager": adls2_pickle_io_manager.configured(
                {"adls2_file_system": "my-cool-fs", "adls2_prefix": "my-cool-prefix"}
            ),
            "adls2": adls2_resource,
        },
    )
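The example above leaves adls2_resource unconfigured, so its connection details still have to be supplied before a run can reach the storage account. The following is a minimal sketch of assembling that configuration in plain Python. It assumes the resource accepts a storage_account string and a credential selector with a sas option, which should be checked against the installed dagster-azure version. The helper build_adls2_resource_config and the environment variable names are hypothetical and used only for illustration.

```python
import os


def build_adls2_resource_config(storage_account: str, sas_token: str) -> dict:
    """Return a config dict shaped for adls2_resource (hypothetical helper)."""
    if not storage_account:
        raise ValueError("storage_account must be a non-empty string")
    if not sas_token:
        raise ValueError("sas_token must be a non-empty string")
    return {
        "storage_account": storage_account,
        # Assumption: the credential field selects one auth method; "sas" is used here.
        "credential": {"sas": sas_token},
    }


# Hypothetical environment variable names; the defaults are placeholders only.
config = build_adls2_resource_config(
    storage_account=os.environ.get("AZURE_STORAGE_ACCOUNT", "my-storage-account"),
    sas_token=os.environ.get("AZURE_STORAGE_SAS_TOKEN", "my-sas-token"),
)
```

With dagster-azure installed, the resulting dict could be passed in the repository as "adls2": adls2_resource.configured(config), mirroring how the IO manager is configured above. Reading the values from the environment keeps secrets out of the source file.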
About Azure
Microsoft Azure is Microsoft's cloud computing platform for deploying and managing applications in Microsoft-managed data centers.