On June 8th of this year, Sandy Ryza, lead engineer on the Dagster project, gave a presentation at the Data + AI Summit in San Francisco. The talk was titled "The Future of Data Orchestration: Asset-Based Orchestration."
We are happy to share the key points of the talk in the video below.
Sandy's thesis: data orchestration is a core component of any batch data processing platform, and yet we've been using patterns that haven't changed since the 1980s. Sandy introduces a new pattern and way of thinking about data orchestration, known as asset-based orchestration, with data freshness sensors to trigger pipelines.
The talk covers the following topics:
- What are data pipelines?
- Why update a data asset?
- Automating asset updates
- Workflow engines: not the best way to schedule data pipelines
- Asset-based orchestration
- Building a pipeline with data assets
- Dealing with the root of the graph
- Freshness policies
- Asset observability
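To make the idea behind these topics a little more concrete, here is a minimal, library-free sketch of asset-based orchestration with a freshness check. It is not Dagster's API and not code from the talk: the `asset` decorator, the `ASSETS` registry, the `max_lag` parameter, and the `is_stale` and `refresh` helpers are all hypothetical names invented for illustration. The point is only that you declare the data assets and how stale each may get, and the orchestrator works out what to recompute.

```python
from datetime import datetime, timedelta

# Hypothetical registry of declared assets (illustration only, not Dagster's API).
ASSETS = {}


def asset(deps=(), max_lag=None):
    """Register a function as a data asset with upstream deps and an optional freshness limit."""
    def register(fn):
        ASSETS[fn.__name__] = {"fn": fn, "deps": tuple(deps), "max_lag": max_lag}
        return fn
    return register


@asset()
def raw_orders():
    # Root of the graph: stands in for data loaded from an external source.
    return [{"id": 1, "amount": 30}, {"id": 2, "amount": 12}]


@asset(deps=["raw_orders"], max_lag=timedelta(hours=1))
def order_total(raw_orders):
    # Derived asset: may be at most one hour old.
    return sum(order["amount"] for order in raw_orders)


def is_stale(name, store, now):
    """An asset is stale if it was never materialized or is older than its max_lag."""
    if name not in store:
        return True
    max_lag = ASSETS[name]["max_lag"]
    _, updated_at = store[name]
    return max_lag is not None and now - updated_at > max_lag


def refresh(name, store, now):
    """Recompute `name` (and any stale upstream assets) only when needed.

    Returns the names of the assets that were actually recomputed, in order.
    """
    recomputed = []
    for dep in ASSETS[name]["deps"]:
        recomputed += refresh(dep, store, now)
    if is_stale(name, store, now):
        inputs = [store[dep][0] for dep in ASSETS[name]["deps"]]
        store[name] = (ASSETS[name]["fn"](*inputs), now)
        recomputed.append(name)
    return recomputed


if __name__ == "__main__":
    store = {}  # asset name -> (value, time of last materialization)
    start = datetime(2023, 1, 1, 12, 0)
    print(refresh("order_total", store, start))
    print(refresh("order_total", store, start + timedelta(minutes=30)))
    print(refresh("order_total", store, start + timedelta(hours=2)))
```

This sketch simplifies heavily: an asset without a `max_lag` is never recomputed once it exists, and a stale downstream asset is rebuilt from whatever upstream values are already stored. A real orchestrator would also account for upstream changes, partitions, and failures.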
We're always happy to hear your feedback, so please reach out to us! If you have any questions, ask them in the Dagster community Slack (join here!) or start a GitHub discussion. If you run into any bugs, let us know with a GitHub issue. And if you're interested in working with us, check out our open roles!