Asset-Based Data Orchestration (from Data + AI Summit) | Dagster Blog

July 6, 2023 · 1 minute read

Asset-Based Data Orchestration (from Data + AI Summit)

Sandy Ryza (@s_ryz)

On June 8, 2023, Sandy Ryza, lead engineer on the Dagster project, gave a presentation at the Data + AI Summit in San Francisco. The talk was entitled "The Future of Data Orchestration: Asset-Based Orchestration."

We are happy to share the key points of the talk in the video below.

Sandy's thesis: data orchestration is a core component of any batch data processing platform, yet we've been using patterns that haven't changed since the 1980s. Sandy introduces a new pattern and way of thinking about data orchestration, known as asset-based orchestration, in which data freshness sensors trigger pipelines.

Sandy Ryza presents "The Future of Data Orchestration: Asset-Based Orchestration"

Agenda:

  • What are data pipelines?
  • Why update a data asset?
  • Automating asset updates
  • Workflow engines: not the best way to schedule data pipelines
  • Asset-based orchestration
    • Building a pipeline with data assets
    • Automaterialization
    • Dealing with the root of the graph
    • Freshness policies
    • Asset observability


We're always happy to hear your feedback, so please reach out to us! If you have any questions, ask them in the Dagster community Slack (join here!) or start a GitHub discussion. If you run into any bugs, let us know with a GitHub issue. And if you're interested in working with us, check out our open roles!
