Dagster vs. AWS Step Functions

AWS offers a visual workflow tool for distributed applications as part of its ecosystem. Why should you opt for Dagster instead?

The big differences between Dagster and Step Functions:

AWS Step Functions is a serverless orchestration service that lets you integrate with AWS Lambda functions and other AWS services to build business-critical applications.

  • General-purpose workflow runner to execute AWS services
  • Used to manage infrastructure, AWS services, or applications
  • Build visually or organize tasks using the JSON-based Amazon States Language
  • Calls out to other services to execute user-defined code
  • Run either manually or via CloudWatch triggers
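
To make the bullets above concrete, here is a minimal sketch of a one-step state machine definition, built as a Python dictionary and printed as JSON. The helper `build_state_machine`, the state name `TransformOrders`, and the Lambda ARN are hypothetical placeholders; the field names (`StartAt`, `States`, `Type`, `Resource`, `Retry`, `End`) come from the Amazon States Language. Note that the definition only points at the Lambda function; the transformation code itself lives elsewhere.

```python
import json


def build_state_machine(lambda_arn: str) -> dict:
    """Return a one-step Amazon States Language definition as a dict."""
    return {
        "Comment": "Illustrative single-task workflow",
        "StartAt": "TransformOrders",
        "States": {
            "TransformOrders": {
                "Type": "Task",
                # The workflow references the Lambda by ARN; the Python
                # code it runs is deployed and versioned separately.
                "Resource": lambda_arn,
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    }


# Placeholder ARN for illustration only.
definition = build_state_machine(
    "arn:aws:lambda:us-east-1:123456789012:function:transform-orders"
)
print(json.dumps(definition, indent=2))
```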

Dagster is specifically designed for data engineers.

  • Define critical data assets in code
  • Use a declarative approach to make your data engineering team far more productive
  • Develop locally, integrate deeply with the modern data stack, and schedule around stakeholder SLAs

Software Development Life Cycle and Developer Experience

With AWS Step Functions, the data transformation logic is written in Python inside Lambda functions.

  • The rest of the data platform (resource integrations, DAGs, and automation) is handled separately outside of code
  • Hard to run and test the data platform locally
  • Data processing code must be kept in sync with all other service configurations
  • State machine definitions are painful to write by hand, so most developers resort to drag-and-drop visual development. This increases the risk of code drifting from configuration and limits which types of DAGs, resources, and integrations can be used.

With Dagster, the data transformation logic, resource integrations, DAGs, and pipeline automation are all defined and versioned in code.

  • Developers can define, review, test, and version every aspect of the data platform locally
  • Code PRs can include both changes to ETL logic and the definition of what warehouse the ETL runs on
  • Use Python for every aspect of the data platform, including unit testing, type checking, shared utility code, mock resources, and dynamic programming.
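
The unit-testing and mock-resource points can be sketched in plain Python. This is not Dagster's resource API; it only illustrates the underlying idea of keeping transformation logic separate from I/O so that a test can swap in an in-memory stand-in for the warehouse. All names here (`Warehouse`, `InMemoryWarehouse`, `dedupe_orders`, `load_orders`) are hypothetical.

```python
from typing import Protocol


class Warehouse(Protocol):
    """Minimal interface the pipeline code depends on."""

    def write(self, table: str, rows: list) -> None: ...


class InMemoryWarehouse:
    """Mock resource: records writes in a dict instead of a real warehouse."""

    def __init__(self) -> None:
        self.tables = {}

    def write(self, table: str, rows: list) -> None:
        self.tables[table] = list(rows)


def dedupe_orders(rows: list) -> list:
    """Pure transformation logic: keep the first row seen for each id."""
    seen = set()
    unique = []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(row)
    return unique


def load_orders(rows: list, warehouse: Warehouse) -> int:
    """I/O step: write the cleaned rows and return how many were written."""
    clean = dedupe_orders(rows)
    warehouse.write("orders", clean)
    return len(clean)


def test_load_orders_dedupes() -> None:
    warehouse = InMemoryWarehouse()
    written = load_orders([{"id": 1}, {"id": 1}, {"id": 2}], warehouse)
    assert written == 2
    assert warehouse.tables["orders"] == [{"id": 1}, {"id": 2}]
```

Because the warehouse is passed in as a parameter, the same pull request can change both the transformation and the resource it runs against, and the test runs locally without any cloud services.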

How ‘data aware’ are these systems?

AWS Step Functions is not aware of the datasets it creates.

  • Do not provide any form of data lineage
  • New data assets must be wedged into existing pipelines
  • Dependencies across pipelines are not explicit.

With Dagster, data assets are first-class citizens.

  • Full dataset lineage
  • Clear real-time status of each dataset
  • Pipeline schedules based on data freshness SLAs (e.g., hourly, daily, or whenever upstream dependencies update)
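
To show why declaring assets in code makes lineage available, here is a small standard-library sketch. It is a conceptual illustration, not Dagster's implementation: each hypothetical asset is a plain function, its upstream dependencies are read from its parameter names, and `graphlib.TopologicalSorter` orders the assets so that each one runs after its inputs.

```python
import inspect
from graphlib import TopologicalSorter


def raw_events():
    return [1, 2, 3]


def cleaned_events(raw_events):
    return [event for event in raw_events if event > 1]


def daily_report(cleaned_events):
    return sum(cleaned_events)


# Registration order is deliberately scrambled; the graph sorts it out.
ASSETS = {fn.__name__: fn for fn in (daily_report, raw_events, cleaned_events)}


def lineage(assets: dict) -> dict:
    """Map each asset name to the set of upstream asset names it reads."""
    return {
        name: set(inspect.signature(fn).parameters)
        for name, fn in assets.items()
    }


def materialize_all(assets: dict) -> dict:
    """Compute every asset in dependency order and return the values."""
    graph = lineage(assets)
    values = {}
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: values[dep] for dep in graph[name]}
        values[name] = assets[name](**inputs)
    return values


if __name__ == "__main__":
    print(lineage(ASSETS))
    print(materialize_all(ASSETS)["daily_report"])
```

Since the dependency graph is derived from the definitions themselves, a tool can answer questions such as "what is upstream of `daily_report`?" without a separately maintained workflow file.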

To summarize the main differences between AWS Step Functions and Dagster:

Goal of the solution

AWS Step Functions: Help coordinate AWS services.

Dagster: Help data engineers define and manage critical data assets.

Run Python code reliably and provide flexibility for complex programming tasks

AWS Step Functions: Step Functions defined in JSON call Python code in Lambdas.

  • Retries
  • Parallelization
  • Conditional execution
  • Logging
  • Dynamic fan out

Dagster: Python function decorators create DAGs of assets.

  • Retries
  • Run queues, parallelization
  • Conditional execution
  • Logging
  • Dynamic fan out

Data assets

AWS Step Functions: No data asset abstraction.

Dagster: Asset-centric framework:

  • Global asset lineage
  • Partitions & backfills
  • Data SLAs

Automation

AWS Step Functions: Schedules in CloudWatch with event triggers.

Dagster: In Python code:

  • Schedules
  • Sensors (event-driven)
  • Data SLAs

Software development lifecycle

AWS Step Functions:

  • Limited local development
  • Visual workflow builder
  • AWS-specific JSON language

Dagster: Best-in-class local development:

  • Branch deployments
  • Data sandboxes
  • Separation of logic/IO
  • Mock resources

Integrations

AWS Step Functions: Blocks built around AWS services

Dagster: Asset-first

Community

Dagster has a growing community of forward-thinking engineers who see the value of our differentiated approach. The Dagster engineering team is directly involved in supporting both open-source and Dagster+ users.

Interested in an objective third-party perspective? Join the Dagster Slack and talk to current users.

Dagster+ for Enterprise
Looking for unlimited deployments, advanced RBAC, and SAML-based SSO, all on a SOC 2-certified platform? Contact the Dagster Labs sales team today to discuss your requirements.