Machine Learning Pipelines Are Still Data Pipelines | Dagster Blog

January 3, 2024 | 1 minute read

Machine Learning Pipelines Are Still Data Pipelines

Sandy Ryza, Lead Engineer at Dagster Labs, talks data engineering for machine learning efforts.
Sandy Ryza (@s_ryz)

In this episode of The Data Stack Show, hosts Eric and Kostas chat with Sandy Ryza, Lead Engineer at Dagster Labs.

Sandy shares insights on data cleaning, data engineering processes, and the need for improved tools. He introduces Dagster, an orchestrator that focuses on assets such as tables, datasets, and machine learning models, and contrasts it with traditional workflow systems. He also explains Dagster’s integration with dbt and explores the changing dynamics in data roles, the impact of modern tooling, the potential for increased creativity in the field, and more.


We're always happy to hear your feedback, so please reach out to us! If you have any questions, ask them in the Dagster community Slack or start a GitHub discussion. If you run into any bugs, let us know with a GitHub issue. And if you're interested in working with us, check out our open roles!

Filed under: Podcast