March 21, 2023
Best Practices in Structuring Python Projects
We cover 9 best practices and examples on structuring your Python projects for collaboration and productivity.
Elliot Gunn
Engineering
March 20, 2023
Partitions in Data Pipelines
Partitioning is a technique that helps data engineers and ML engineers organize data and the computations that produce that data.
Sandy Ryza
Engineering
March 16, 2023
Tracking the Fake GitHub Star Black Market with Dagster, dbt and BigQuery
It's easy for an open-source project to buy fake GitHub stars. We share two approaches for detecting them.
Fraser Marlow
Engineering
March 7, 2023
How Dagster Deploys 5X Faster with Warm Docker Containers
Using pex, Serverless Dagster Cloud now deploys 4 to 5 times faster by avoiding the overhead of building and launching Docker images.
Shalabh Chaturvedi
Engineering
March 6, 2023
Python Packages: a Primer for Data People (part 1 of 2)
The foundation of a solid Python project is mastering modules, packages and imports.
Elliot Gunn
Engineering
March 6, 2023
Python Packages: a Primer for Data People (part 2 of 2)
An introduction to managing Python dependencies and some virtual environment best practices.
Elliot Gunn
Engineering
January 9, 2023
Build a GitHub Support Bot with GPT3, LangChain, and Python
In this tutorial, we tap into the power of OpenAI's ChatGPT to build a GitHub support bot using GPT3, LangChain, and Python.
Pete Hunt
Engineering
November 30, 2022
Getting Stuff Done: a Guide to Productive Software Engineering
To be a more productive software engineer, you need to master changes and understand how they affect the program and others on the team.
Alex Langenfeld
Engineering
November 11, 2022
Pushing REST-API data to Google Sheets with Dagster
A total beginner's tutorial in which we store REST API data in Google Sheets and learn some key abstractions.
Fraser Marlow
Engineering
November 7, 2022
Adding Types to a Large Python Codebase
What we learned when we added type annotations to a large Python codebase, bringing Dagster's public API to 100% type coverage.
Sean Mackesey
Engineering
October 31, 2022
Orchestrating Machine Learning Pipelines with Dagster
How to use Dagster’s open source data orchestrator to build machine learning pipelines and train ML models.
Sandy Ryza
Engineering
October 25, 2022
Build a poor man’s data lake from scratch with DuckDB
DuckDB is so hot right now. Learn how to build a data lake from dbt using DuckDB for SQL transformations, along with Python, Dagster, and Parquet files.