Dagster + Great Expectations

Yield an expectation and its output with all relevant metadata.

About this integration

With this integration, you can build Great Expectations validations inside Dagster Ops.


pip install dagster-ge


from dagster_ge.factory import ge_data_context, ge_validation_op_factory
from pandas import read_csv
from dagster import job, op

@op
def read_in_datafile(csv_path):
    return read_csv(csv_path)


@op
def process_payroll(df):
    return len(df)


@op
def postprocess_payroll(numrows, expectation):
    # Pass the row count through only if the validation succeeded.
    if expectation["success"]:
        return numrows
    raise ValueError("Great Expectations validation failed")


# Op that validates a DataFrame against the "basic.warning" suite.
payroll_expectations = ge_validation_op_factory(
    name="ge_validation_op", datasource_name="getest", suite_name="basic.warning"
)


@job(
    resource_defs={"ge_data_context": ge_data_context},
    config={
        "resources": {
            "ge_data_context": {
                "config": {"ge_root_dir": "./great_expectations"}
            }
        },
        "ops": {
            "read_in_datafile": {
                "inputs": {
                    "csv_path": {"value": "./data/succeed.csv"}
                }
            }
        },
    },
)
def payroll_data():
    output_df = read_in_datafile()
    postprocess_payroll(process_payroll(output_df), payroll_expectations(output_df))
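
The gating step in postprocess_payroll can be tried on its own, without a Great Expectations project. The following is a minimal plain-Python sketch, not part of the integration: gate_on_validation and the two sample dictionaries are hypothetical names, and the validation result is assumed to behave like a mapping with a boolean "success" key, which is how the example above reads it.

```python
# Plain-Python sketch of the gating logic; needs neither Dagster nor Great Expectations.
# `gate_on_validation` and the sample results are hypothetical, for illustration only.


def gate_on_validation(numrows, expectation):
    """Return the row count if validation succeeded, otherwise raise ValueError."""
    if expectation["success"]:
        return numrows
    raise ValueError("Great Expectations validation failed")


# Stand-ins for a validation result (assumed shape: a mapping with a "success" key).
passing = {"success": True}
failing = {"success": False}

# A passing result lets the row count flow downstream.
row_count = gate_on_validation(3, passing)

# A failing result stops the pipeline by raising.
try:
    gate_on_validation(3, failing)
    failure_raised = False
except ValueError:
    failure_raised = True
```

Raising inside the op is what makes a failed suite fail the run; an op that returned the count regardless would let unvalidated data continue downstream.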

About Great Expectations

Great Expectations is a shared, open standard for data quality. It helps data teams eliminate pipeline debt through data testing, documentation, and profiling.