December 14, 2022 · 1 minute read

Troubleshooting Productionalized Notebooks using Dagster and Noteable

Jamie DeMaria

In an earlier blog post, "Running data science notebooks with Dagster," I shared the details of our Noteable integration.

Subsequently, we had the opportunity to present at PyData NYC 2022. You can now walk through the workshop with us here as we demonstrate how to build and debug a data pipeline using Noteable and Dagster on Gitpod.

Noteable's CTO & Co-Founder Matthew Seal and Elementl's Jamie DeMaria conduct a live walkthrough of debugging data pipelines using Noteable and Dagster - December 2022.

Why run notebooks on Noteable & Dagster:

Data engineers waste a lot of time troubleshooting long-running pipelines and know only too well the frustration of minor errors consuming hours of work. In this practical tutorial we demonstrate an approach for dramatically shortening testing cycles and reducing the number of reruns required, boosting developer/practitioner productivity, and reducing frustration on the team.

A productionalized notebook integrated with an orchestration platform offers an excellent balance of reproducibility, flexibility, and intent, in a form that others can quickly consume. This tutorial is valuable to both data scientists and data engineers. The setup makes it easy to take Jupyter notebooks from exploratory to production, and even easier to debug them and ensure quality over time.

This tutorial will show how you can achieve:

  • Time savings in initiating jobs: users can seamlessly transition an exploratory workflow created in a Noteable notebook into a productionalized, scheduled workflow in Dagster.
  • Time and cost savings when debugging failed runs: users can immediately dive into a live, running notebook at the point of failure, with all of the in-memory state preserved. This saves users' time and reduces companies' compute costs, because debugging does not require re-executing earlier steps of the workflow.

If you want to support the Dagster open source project, be sure to star our GitHub repo.


We're always happy to hear your feedback, so please reach out to us! If you have any questions, ask them in the Dagster community Slack (join here!) or start a GitHub discussion. If you run into any bugs, let us know with a GitHub issue. And if you're interested in working with us, check out our open roles!
