August 9, 2022 • 4 minute read
Launching Dagster Cloud to GA
- Name
- Nick Schrock
- Handle
- @schrockn
Dagster Cloud was launched to early access in December 2021. Since then we have onboarded dozens of customers ranging from startups to Fortune 500 companies, and the feedback has been extraordinary. Today we’re proud to announce the release of Dagster Cloud to general availability. Anyone in the world can sign up, and you can find the product details here.
Dagster Cloud is the enterprise orchestrator built uncompromisingly for developer experience. Developer experience is at the center of our work. It means not just raw productivity, but the management of complexity. A good developer experience not only lets you do your job faster, it also unburdens your mind, empowering you to think bigger and make entirely new things possible.
Developer experience has been the focus of Dagster since its inception. We focused on functional data engineering and first-class testing, with the goal of providing fast feedback loops and CI/CD for all data practitioners.
The open-source Dagster framework is making enormous progress. Innovations such as software-defined assets—our declarative approach to orchestration—start with the developer. They provide a higher level of abstraction that frees the mind to focus on business logic and the core job of practitioners: to keep high-quality data assets up to date for their stakeholders. Last week, we released Dagster 1.0, representing a new level of maturity and stability.
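To make the declarative idea concrete, here is a toy sketch in plain Python. It is not Dagster's API: the `toy_asset` decorator, the `REGISTRY` dictionary, and the `materialize_all` helper are hypothetical names invented for illustration. The sketch only shows the general shape of the approach, where you declare what each asset is and what it depends on, and the system works out the order in which to compute things.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry: asset name -> function that computes the asset.
REGISTRY = {}


def toy_asset(fn):
    """Register fn as an asset; its parameter names are its upstream assets."""
    REGISTRY[fn.__name__] = fn
    return fn


@toy_asset
def raw_orders():
    # Stand-in for loading data from a source system.
    return [{"id": 1, "amount": 30}, {"id": 2, "amount": 12}]


@toy_asset
def order_total(raw_orders):
    # Depends on raw_orders simply by naming it as a parameter.
    return sum(row["amount"] for row in raw_orders)


def materialize_all():
    """Compute every registered asset, upstream assets first."""
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in REGISTRY.items()
    }
    values = {}
    # static_order() yields each asset after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        upstream = {dep: values[dep] for dep in deps[name]}
        values[name] = REGISTRY[name](**upstream)
    return values
```

Calling `materialize_all()` computes `raw_orders` before `order_total` without the author ever writing that ordering down, which is the point of a declarative approach.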
But a framework is not enough. To complete this vision we had to build a managed cloud product. Enter Dagster Cloud, which includes native branching built directly into the orchestrator, a managed service to offload ops burden from data teams, a best-in-class UI that serves as a single pane of glass for all stakeholders, and enterprise features.
Native branching in the orchestrator
Developer workflow in data platforms is notoriously difficult. With sufficient effort—and with framework support in systems like Dagster—structuring code for local development is possible.
However, with only local development you are still limited in what you can test, as data pipelines depend on secrets, services, and data that are not accessible or runnable on your laptop. Teams are left to either test in production or to set up heavyweight, inflexible staging environments that are clumsy to use and costly to manage.
Dagster Cloud provides a novel type of environment: a Branch Deployment. It is a lightweight staging environment created with every pull request that becomes a focal point of development, testing, and collaboration.
With Branch Deployments, you follow a familiar development workflow. You create a new branch, edit your code, and create a pull request. Then our out-of-the-box infrastructure takes over. On every push, it creates a lightweight staging environment from that PR where you can run and test your code. Not only that, but you can launch jobs on every push, either to run tests or to configure the test environment itself, for example by cloning databases.
The Dagster framework is designed with this in mind. It allows you to parameterize your pipelines and assets to point to test data in the branch deployment. In effect, you can branch your entire environment, providing a safe, collaborative engineering workflow on top of your data platform.
When you have completed your work in your PR and validated your changes in the branch deployment, you merge your changes back to main. The new code is then deployed to production, and your branch deployment goes dormant.
This is a revolutionary capability in orchestration that not only makes existing engineers more productive, but opens up development in the orchestrator to an entirely new set of stakeholders.
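As a rough illustration of that kind of parameterization, the sketch below picks a target schema depending on whether the code is running in a branch deployment. Everything here is an assumption rather than documented behavior: the schema names are hypothetical, and the sketch assumes the deployment exposes environment variables named `DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT` and `DAGSTER_CLOUD_PULL_REQUEST_ID`. Check the Dagster Cloud documentation for the variables your deployment actually provides.

```python
import os
from typing import Mapping

# Hypothetical schema names, used only for illustration.
PROD_SCHEMA = "analytics"
BRANCH_SCHEMA_PREFIX = "analytics_branch_"


def target_schema(env: Mapping[str, str] = os.environ) -> str:
    """Pick the schema that pipelines and assets should read from and write to."""
    # Assumption: a branch deployment sets this variable to "1".
    if env.get("DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT") == "1":
        # Assumption: the pull request ID is exposed, so each branch
        # can get its own isolated schema.
        pr_id = env.get("DAGSTER_CLOUD_PULL_REQUEST_ID", "unknown")
        return f"{BRANCH_SCHEMA_PREFIX}{pr_id}"
    # Anywhere else, fall back to the production schema.
    return PROD_SCHEMA
```

Passing the environment in as a mapping keeps the function easy to test locally: `target_schema({})` returns the production schema, while a mapping that mimics a branch deployment returns a per-PR schema.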
Offloading your ops with flexible, managed orchestration
Another pillar of developer experience is empowering the practitioner to do the job they were hired to do: to write business logic that creates data assets that deliver impact to their business or organization. That means offloading as much operational burden from that practitioner as possible.
Dagster Cloud provides unparalleled flexibility and range, meeting the engineer where they are. To that end, today we are launching two hosted options: Serverless and Hybrid.
Some data teams are free to offload all of their computation to a hosted service. For them, we offer Serverless Dagster Cloud.
Serverless Dagster Cloud frees the user from all operational burden, hosting all computation. Spinup in Serverless is totally effortless, and provides an unmatched, fully integrated experience. You just write a Python file and we do the rest: No Kubernetes, no Dockerfiles, no provisioning.
But not all teams can offload computation to a hosted service. For this we offer Hybrid Dagster Cloud. With Hybrid, we host all the stateful system infrastructure necessary to run Dagster’s control plane, and you bring your compute platform of choice. Dagster Cloud has no ingress into your infrastructure, does not see your code, and does not touch your data. You run a simple agent in your existing infrastructure, which in turn manages a stateless, elastic cluster that is easy to maintain. Hybrid Dagster Cloud maximizes flexibility and security while offloading the vast majority of operational burden to a managed service.
Empowering all stakeholders with a single pane of glass
Once code is written and deployed, the job is not done. Ongoing operations and observability are core to the developer experience of a data team.
The orchestrator is the beating heart of a data platform. It is the system that controls when computations run and what order they run in, and ensures they run successfully. When things go wrong, the orchestrator is where you go to debug, diagnose, and resolve issues.
We focused on giving Dagster Cloud a best-in-class UI, accessible to all stakeholders in the data platform. This empowers both practitioners and stakeholders, keeping engineers responsible for infrastructure out of the critical path.
Dagster Cloud’s Run Timeline provides a full overview with real-time status. On a single screen, you can monitor all your data runs. This is the command center for your data platform operations, and you can quickly diagnose problems with surgical precision.
With Dagster Cloud’s Asset Graph you not only have full asset lineage and observability, but you can execute operations directly in the orchestrator, so you can tame the sprawling complexity of your data platform.
With Dagster’s Launchpad you can launch runs with a powerful but easy-to-use autocompleting editor usable by all stakeholders. You can kick off runs as defined or easily experiment with their configurations and observe the result.
This barely scratches the surface of Dagster’s UI and operational capabilities. Debugging runs, managing backfills, monitoring schedules and sensors, and inspecting your assets are all done in easy-to-use, powerful UIs that are a joy to use.
Enterprise capabilities
In order for developers at companies of all sizes to get the benefits of Dagster Cloud, we also need support for the enterprise. Dagster Cloud comes with an Enterprise account tier that provides the following: unlimited production deployments; built-in security along with single sign-on capabilities (including Okta and AD support); more granular user roles for RBAC; and robust service level agreements for mission-critical implementations.
As Dagster comes into its own, so does its parent company Elementl, and we were proud to recently announce our SOC 2 Type 1 certification, another step in our commitment to support larger and more demanding customers.
Conclusion
We’re thrilled to open up Dagster Cloud to the world. It is not just a best-in-class orchestrator, but a revolutionary development platform for data teams.
One of the most gratifying pieces of feedback we have received from our early customers is that developing in Dagster Cloud is fun. We’ve never heard that about an orchestrator before. When engineers are having fun, that means they are productive and building useful things. This isn’t just a feel-good story: Happy developers translate into outsize business outcomes. That is the promise of Dagster Cloud.
You can sign up today here. To join our thriving, growing community, please join our Slack.