Data Engineering Python Guides
Whether you are new to data pipelines and Python projects or brushing up on core Python concepts, the following guides and examples should help you ramp up quickly.
Also, check out the Data Engineering Glossary, complete with Python code examples.
Python Packages: a Primer for Data People (part 1 of 2)
Explores the basics of Python modules, Python packages and how to import modules into your own projects.
Python Packages: a Primer for Data People (part 2 of 2)
Best Practices in Structuring Python Projects
Covers nine best practices for structuring your projects, with examples.
From Python Projects to Dagster Pipelines
Explores setting up a Dagster project, and the key concept of Data Assets.
Environment Variables in Python
Covers the importance of environment variables and how to use them.
Factory Patterns
Learn design patterns, reusable solutions to common problems in software design.
Write-Audit-Publish in Data Pipelines
We look at a design pattern frequently used in ETL to ensure data quality and reliability.
CI/CD and Data Pipeline Automation (with Git)
Learn how to automate data pipelines and deployments with Git.
High-performance Python for Data Engineering
Learn how to code data pipelines for performance.
Breaking Packages in Python
Exploring the sharp edges of Python’s system of imports, modules, and packages.
These are all the guides we have for now, but we have several more planned. Join our newsletter and stay in the loop.