88 Posts
May 24, 2023
Dagster and the Decade of Data Engineering
We are pleased to announce Elementl's $33M Series B and share our vision for what's next for Dagster and the practice of data engineering.
- Name
- Nick Schrock
- Handle
- @schrockn
May 24, 2023
Elementl Raises $33 Million in Series B Funding to Accelerate Data Orchestration and Unleash Advanced Data Use Cases
The new capital will accelerate the development and adoption of Dagster, the open-source, cloud-native data ...
May 23, 2023
Building Better Analytics Pipelines
A recap of our live event on the benefits and techniques for orchestrating analytics pipelines.
- Name
- Pete Hunt
- Handle
- @floydophone
- Name
- Yuhan Luo
- Handle
- @yuhan
May 19, 2023
Introducing Dynamic Definitions for Flexible Asset Partitioning
Dagster’s dynamic partition definitions allow engineers to use the power of partitions in a broader range of scenarios.
- Name
- Claire Lin
- Handle
- Name
- Sandy Ryza
- Handle
- @s_ryz
May 17, 2023
Deciphering Arcane Kubernetes and ECS Errors with Dagster
Recent enhancements allow Dagster to surface clearer and more actionable errors to accelerate your development cycles.
- Name
- Daniel Gibson
- Handle
May 16, 2023
Config Systems: Airflow and Dagster
Contrasting the Airflow and Dagster configuration systems by rewriting the Airflow Slack Integration.
- Name
- Joe Van Drunen
- Handle
May 9, 2023
How to Maintain High Product & Code Quality As Your Startup Scales
Raising the quality bar requires process adjustments and a cultural shift.
- Name
- Bosmat Eldar
- Handle
- @bosmat
Apr 26, 2023
Dagster 1.3: Smooth Operator
Enhanced partitioned asset support, the introduction of Pythonic config and resources, and integration updates.
- Name
- Yuhan Luo
- Handle
- @yuhan
Apr 21, 2023
Catalyst Cooperative: Liberating Public Utility Data with Dagster
The PUDL Project cleans and distributes analysis-ready energy system data to climate advocates, researchers, ...
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Apr 14, 2023
From Python Projects to Dagster Pipelines
In part IV of our series, we explore setting up a Dagster project, and the key concept of Data Assets.
- Name
- Elliot Gunn
- Handle
- @elliot
Apr 10, 2023
Enabling Large-scale, Multi-cloud Computing with Dagster
Abstracting away infrastructure concerns in large-scale computing with conditional multi-cloud processing.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Apr 4, 2023
Orchestrate Meltano Jobs with Dagster
Meltano provides 550 connectors and tools, all of which can be configured and orchestrated straight from Dagster.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Apr 3, 2023
Community Memo: Pythonic Config and Resources
Major ergonomic improvements are coming to Dagster's config and resources systems, including a Pydantic frontend.
- Name
- Nick Schrock
- Handle
- @schrockn
- Name
- Ben Pankow
- Handle
Mar 21, 2023
Best Practices in Structuring Python Projects
An introduction to managing Python dependencies and some virtual environment best practices.
- Name
- Elliot Gunn
- Handle
- @elliot
Mar 20, 2023
Partitions in Data Pipelines
Partitioning is a technique that helps data engineers and ML engineers organize data and the computations that produce ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
Mar 16, 2023
Tracking the Fake GitHub Star Black Market with Dagster, dbt and BigQuery
It's easy for an open-source project to buy fake GitHub stars. We share two approaches for detecting them.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
- Name
- Yuhan Luo
- Handle
- @yuhan
Mar 9, 2023
Dagster 1.2: Formation
Enhanced partitioned asset support, the introduction of Pythonic config and resources, and integration updates.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Mar 7, 2023
How We Deploy 5X Faster with Warm Docker Containers
Using pex, Serverless Dagster Cloud now deploys 4 to 5 times faster by avoiding the overhead of building and launching ...
- Name
- Shalabh Chaturvedi
- Handle
Mar 6, 2023
Python Packages: a Primer for Data People (part 2 of 2)
An introduction to managing Python dependencies and some virtual environment best practices.
- Name
- Elliot Gunn
- Handle
- @elliot
Mar 6, 2023
Python Packages: a Primer for Data People (part 1 of 2)
The foundation of a solid Python project is mastering modules, packages and imports.
- Name
- Elliot Gunn
- Handle
- @elliot
Feb 28, 2023
Dagster Integrations Update
Dagster offers 47 integrations to accelerate your development, and we are working hard to expand and enhance them.
- Name
- Rex Ledesma
- Handle
- @_rexledesma
Feb 8, 2023
Migrating from Airflow to Dagster is now a Breeze
The newly released `dagster-airflow` library has made migrating off legacy Airflow and onto Dagster much easier.
- Name
- Joe Van Drunen
- Handle
Jan 9, 2023
Build a GitHub Support Bot with GPT3, LangChain, and Python
Tap into the power of OpenAI to answer your users' technical questions.
- Name
- Pete Hunt
- Handle
- @floydophone
Dec 22, 2022
Converting an ETL Script to Software-Defined Assets
Let's talk about moving from an ETL script to a robust Dagster pipeline using Software-Defined Assets.
- Name
- Pete Hunt
- Handle
- @floydophone
Dec 16, 2022
Bringing Declarative Scheduling to dbt with Dagster
Declarative Scheduling takes the orchestration of dbt models as part of a larger pipeline to an entirely new level.
- Name
- Sean Lopp
- Handle
- @lopp
Dec 14, 2022
Troubleshooting Productionalized Notebooks using Dagster and Noteable
In this recorded webinar, the Noteable + Dagster team walks you through how to run and debug a simple pipeline using ...
- Name
- Jamie DeMaria
- Handle
Dec 14, 2022
Dagster 1.1: Thank U, Next
A major release with Declarative Scheduling, multi-asset scheduling, and SDA partitioning. Plus Secrets management, ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
Dec 8, 2022
Declarative Scheduling for Data Assets
Declarative Scheduling allows you to escape writing workflows entirely. Instead, you specify how up-to-date you expect ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
Dec 7, 2022
Evaluating Dagster for Better Skiing - and a New Job
How quickstart projects snowball into new careers. A common data PoC walkthrough with Dagster.
- Name
- Sean Lopp
- Handle
- @lopp
Nov 30, 2022
Getting Stuff Done: a Guide to Productive Software Engineering
To be a more productive software engineer you need to master changes, how these affect the program and others on the ...
- Name
- Alex Langenfeld
- Handle
- @alex_langenfeld
Nov 21, 2022
Safe and Easy: Managing Secrets in Dagster Cloud
Dagster Cloud’s new Environment Variables UI makes it easy to set up scoped environment variables.
- Name
- Erin Cochran
- Handle
- Name
- Daniel Gibson
- Handle
Nov 18, 2022
My Path to Elementl - Part 2
Pete Hunt takes over as CEO as Nick Schrock takes on the CTO role.
- Name
- Pete Hunt
- Handle
- @floydophone
Nov 11, 2022
Pushing REST-API data to Google Sheets with Dagster
A total beginner's tutorial in which we store REST API data in Google Sheets and learn some key abstractions.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Nov 7, 2022
Adding Types to a Large Python Codebase
We decided to drive Dagster to a 100%-typed public interface. This turned out to be a significant undertaking. Lessons ...
- Name
- Sean Mackesey
- Handle
Nov 2, 2022
Running Data Science Notebooks with Dagster: a Noteable integration
The Noteable team adds major powerups for data scientists looking to orchestrate Notebooks with Dagster.
- Name
- Jamie DeMaria
- Handle
Oct 31, 2022
Orchestrating Machine Learning Pipelines with Dagster
To boost your ML efforts, improve your pipeline as well as your model.
- Name
- Sandy Ryza
- Handle
- @s_ryz
Oct 27, 2022
Orchestrating Data Science at Zephyr AI
Zephyr AI applies data science to massive datasets of DNA and healthcare records to deliver novel AI-driven insights.
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Oct 25, 2022
Build a poor man’s data lake from scratch with DuckDB
DuckDB is so hot right now. Could it replace our cloud data warehouses or data lakes?
- Name
- Pete Hunt
- Handle
- @floydophone
- Name
- Sandy Ryza
- Handle
- @s_ryz
Oct 19, 2022
The Unreasonable Effectiveness of Data Pipeline Smoke Tests
Data practitioners waste time writing unit tests to catch bugs they could have caught with smoke tests.
- Name
- Sandy Ryza
- Handle
- @s_ryz
Oct 17, 2022
Web Workers are not the Answer
A tale of overstretched logs, counterintuitive web worker behavior, and ultimately a troublesome cursor issue.
- Name
- Marco Salazar
- Handle
- @BkOptimism
- Name
- Alex Langenfeld
- Handle
- @alex_langenfeld
Oct 16, 2022
Dagster at all 5 Steps of the Development Lifecycle
Dagster facilitates a data engineer's work across all five steps in the development lifecycle.
Oct 6, 2022
A Dagster Crash Course
If you are looking to get up and running with Dagster in 10 minutes or less, this is a good place to start. Buckle up.
- Name
- Pete Hunt
- Handle
- @floydophone
Oct 4, 2022
Postgres: a Better Message Queue than Kafka?
When lots of event logs must be stored and indexed, Kafka is the obvious choice. Naturally, our queue runs on Postgres.
- Name
- Pete Hunt
- Handle
- @floydophone
Sep 20, 2022
Dagster vs. Airflow
We often get asked why a data team should choose Dagster over Apache Airflow. We compare Dagster and Airflow for data ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
- Name
- Nick Schrock
- Handle
- @schrockn
Aug 24, 2022
How EvolutionIQ Rebuilt its ML Platform for Enormous Productivity.
A guide for CIOs/CTOs and engineering leaders looking to master the Modern Data Stack and develop a high performance ...
- Name
- Fraser Marlow
- Handle
- @frasermarlow
Aug 17, 2022
Spend Less Time Debugging with Dagster
It’s not uncommon for a data engineer to devote 80% of their day to debugging. Dagster radically improves on this.
- Name
- Sandy Ryza
- Handle
- @s_ryz
- Name
- Owen Kephart
- Handle
Aug 9, 2022
Launching Dagster Cloud to GA
The enterprise orchestration platform that puts developer experience first: hybrid or serverless deployments, native ...
- Name
- Nick Schrock
- Handle
- @schrockn
Aug 5, 2022
Introducing Dagster 1.0: Hello
Announcing Dagster 1.0: a stable foundation for building the orchestration layer for modern data platforms.
- Name
- Sandy Ryza
- Handle
- @s_ryz
Aug 3, 2022
The Open Core Business Model
The relationship between Dagster, the open-source project, and Dagster Cloud, our hosted SaaS platform.
- Name
- Nick Schrock
- Handle
- @schrockn
Jul 26, 2022
Dagster Cloud goes SOC 2
Elementl, the company behind the Dagster data orchestration tool, achieves SOC 2 compliance.
- Name
- Selina Li
- Handle
Jul 25, 2022
Dagster Day: Announcing Dagster 1.0 and Dagster Cloud
The release of Dagster 1.0 and the GA launch of Dagster Cloud represent major milestones in the evolution of our ...
- Name
- Nick Schrock
- Handle
- @schrockn
Jul 12, 2022
Roman Roads in Data Engineering: Don't Write Data Pipelines from Scratch
Work in a way that lays the foundation for your next data product while you're building your current one.
- Name
- Claire Lin
- Handle
- Name
- Sandy Ryza
- Handle
- @s_ryz
Jun 23, 2022
The Data Exchange: Software-defined Assets
Nick Schrock on software-defined assets, a new approach to managing, maintaining, and orchestrating data declaratively.
- Name
- Nick Schrock
- Handle
- @schrockn
Jun 22, 2022
My Path to Elementl: Pete Hunt
Pete Hunt discusses what caused him to make the leap from Twitter to Elementl.
- Name
- Pete Hunt
- Handle
- @floydophone
Jun 20, 2022
Orchestrating Python and dbt with Dagster
How asset-focused orchestration bridges the gap between some of data's most popular tools.
- Name
- Owen Kephart
- Handle
Jun 15, 2022
Dagster 0.15.0: Cool for the Summer
In 0.15.0, software-defined assets are now marked fully stable and are ready for primetime.
- Name
- Mollie Pettit
- Handle
- @MollzMP
Mar 9, 2022
New in 0.14.0: Dagster-Airbyte Integration
0.14.0 introduces a deep integration with Airbyte: view Airbyte logs directly in Dagit, and every updated table will be ...
- Name
- Owen Kephart
- Handle
Mar 1, 2022
Introducing Software-Defined Assets
Software-Defined Assets are a transformative new abstraction that allows data teams to focus on the end-product not the ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
Mar 1, 2022
Dagster 0.14.0: Table Schema API + Pandera Integration
Introducing two asset observability-enhancing features: Table Schema API, and an integration with the dataframe ...
- Name
- Sean Mackesey
- Handle
Mar 1, 2022
Dagster 0.14.0: Never Felt Like This Before
We’re thrilled to release version 0.14.0 of Dagster. This version introduces a much more mature version of ...
- Name
- Mollie Pettit
- Handle
- @MollzMP
Feb 17, 2022
Rebundling the Data Platform
'The Unbundling of Airflow' argued that modern data stack solutions (data ingestion, data transformation, reverse ETL) ...
- Name
- Nick Schrock
- Handle
- @schrockn
Dec 2, 2021
Introducing Dagster Cloud
Dagster Cloud, the enterprise orchestration platform that puts developer experience first, with fully serverless or ...
- Name
- Nick Schrock
- Handle
- @schrockn
Nov 20, 2021
Laying the Foundation of your Data Platform for the Era of Big Complexity
Listen to founder and CEO Nick Schrock talk about how Dagster helps tame the complexity and scale when working with ...
- Name
- Nick Schrock
- Handle
- @schrockn
Nov 17, 2021
Hello Big Complexity: Is Your Modern Data Stack Ready?
Listen to Nick Schrock discuss the evolution of data from Big Data to Big Complexity in this episode of the Mad Data ...
- Name
- Nick Schrock
- Handle
- @schrockn
Nov 16, 2021
Why Elementl and Dagster: The Decade of Data
Announcing our $14M Series A led by Index Ventures, alongside Sequoia Capital, Slow Ventures, Coatue, Amplify Partners, ...
- Name
- Nick Schrock
- Handle
- @schrockn
Nov 8, 2021
New in Dagster 0.13.0: Logging Improvements!
Logging without context, instance-wide handlers, capturing python logs, and more! Learn about the improvements we've ...
- Name
- Owen Kephart
- Handle
Oct 28, 2021
Dagster 0.13.0: A New Foundation
We’re proud to announce 0.13.0 of Dagster with dramatic improvements to our core APIs, completely revamped UI, and ...
- Name
- Nick Schrock
- Handle
- @schrockn
Aug 10, 2021
Community Memo: Moving Dagster's Core APIs Towards 1.0
Dagster commits to a stable set of production-ready APIs for building solid data platforms.
- Name
- Sandy Ryza
- Handle
- @s_ryz
Jul 19, 2021
Dagster 0.12.0: Into the Groove
In 0.12.0, we introduce pipeline failure sensors, solid-level retries, and more convenient testing APIs.
- Name
- Owen Kephart
- Handle
May 25, 2021
Community Memo: Approachability Improvements
In the last two months, we've made a set of changes aimed at making Dagster more approachable: to smooth out its ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
May 18, 2021
Incrementally Adopting Dagster at Mapbox
At Mapbox, we've adopted Dagster without breaking compatibility with our legacy Airflow systems -- and with huge gains ...
- Name
- Ben Pleasanton
- Handle
May 13, 2021
Moving past Airflow: Why Dagster is the Next-generation Data Orchestrator
A comparison between Dagster and Airflow. Here we detail the differences between the two systems, and make the case for ...
- Name
- Nick Schrock
- Handle
- @schrockn
Apr 1, 2021
Dagster 0.11.0: Lucky Star
In 0.11.0, we introduce dynamic orchestration, a new backfill UI, and support for tracking asset lineage.
Mar 15, 2021
Building Shared Spaces for Data Teams at Drizly
Our small data infrastructure team built a data platform that supports users with different skillsets, letting anyone ...
- Name
- Dennis Hume
- Handle
Jan 19, 2021
Dagster 0.10.0: The Edge of Glory
In 0.10.0, we introduce unique event-based scheduling capabilities, hardened deployments on Kubernetes, and new ...
- Name
- Nick Schrock
- Handle
- @schrockn
- Name
- Max Gasner
- Handle
- @gasnerpants
Dec 9, 2020
Good Data at Good Eggs: Using Dagster to Manage the Data Platform
Running pipelines is only part of running a data platform. We need to manage the platform and control technical debt. ...
- Name
- David Wallace
- Handle
- @davidjwallace
Nov 5, 2020
Good Data at Good Eggs: Data Observability with the Asset Catalog
Dagster gives us a single "pane of glass" for data assets. Analysts can look up when a Stitch raw data ingest occurred, ...
- Name
- David Wallace
- Handle
- @davidjwallace
Oct 29, 2020
Dagster and dbt: Better Together
People sometimes ask us — should I use Dagster, or should I use dbt? We view Dagster and dbt as complementary ...
- Name
- AJ Nadel
- Handle
- @AJ_Nadel
- Name
- Bob Chen
- Handle
- @bobchen168
Oct 1, 2020
Good Data at Good Eggs: Data Infrastructure Correctness and Reliability
Dagster’s custom data types helped achieve correctness and reliability in our data ingest process, less downstream ...
- Name
- David Wallace
- Handle
- @davidjwallace
Oct 1, 2020
Good Data at Good Eggs: Part 1 of 4
Adopting Dagster transformed our data platform team. We hope our experience is encouraging to other teams facing ...
- Name
- David Wallace
- Handle
- @davidjwallace
Sep 16, 2020
Testing and Deploying PySpark Jobs with Dagster
Spark has a beautiful API but developing with it is a pain because different stages of development and deployment ...
- Name
- Sandy Ryza
- Handle
- @s_ryz
Sep 15, 2020
Community Memo: September 2020 Update
A retrospective of our 0.9.0 release, a preview of our 0.10.0 roadmap, and Prezi's journey from a homegrown ...
Sep 10, 2020
Great Expectations for Dagster
We’re thrilled to announce a new integration between Dagster and a fellow open-source project, Great Expectations (GE).
- Name
- Leor Fishman
- Handle
- @fishmanl
Aug 25, 2020
Forward Thinking Leaders
Nick Schrock shares insights on how to sell new tech concepts to developers.
- Name
- Nick Schrock
- Handle
- @schrockn
Aug 11, 2020
Dagster: The Data Orchestrator
As a workflow engine, Dagster moves beyond ordering and executing data computations. It introduces a new primitive: a ...
- Name
- Nick Schrock
- Handle
- @schrockn
- Name
- Max Gasner
- Handle
- @gasnerpants
Feb 26, 2020
Dagster 0.7.0: Waiting To Exhale
With 0.7.0 we set out to improve the Dagster experience with large, production-scale pipelines, deployable to Kubernetes.
Oct 10, 2019
Dagster 0.6.0: Impossible Princess
Dagster 0.6.0 comes “batteries-included” with pluggable options to execute, monitor, schedule, deploy, and debug your ...
Jul 8, 2019
Introducing Dagster
Elementl announces an early release of Dagster, an open-source library for building ETL processes, ML pipelines and ...