Building shared spaces for data teams at Drizly

Published on 2021-03-15

Dennis Hume
Name
Dennis Hume
Handle

A shared space for non-SQL workflows

Drizly Logo

When we started rearchitecting our data infrastructure at Drizly a year and a half ago, we knew we wanted our data platform to be an application, supporting multiple stakeholders contributing and collaborating together in shared spaces. At the time we were a small team of data scientists and analysts who had to support the analytic needs of the entire company (from marketing to operations, partnerships and more).

Like many teams, we found that without a common data platform, there was a lot of duplicated data work being done, and getting that work to be consistent and reproducible was always a struggle.

For SQL-based workflows, the answer was relatively straightforward: we moved our data warehouse to Snowflake (from Redshift), and standardized our SQL transformations using dbt.

We want to give people across our organization the ability to iterate quickly on data projects in a safe environment.

But non-SQL workflows posed a tougher challenge. Here the different needs of the team became more apparent. Data scientists want the ability to develop machine learning models, with unique dependencies, that adhere to strict SLAs; while an analyst may want to add Python transformations to a table in the warehouse without having to worry about the underlying infrastructure.

We saw Dagster as a framework that could provide us with a new shared space to support all our data roles.

Fast spin-up

One of the things that we’ve found works really well with Snowflake and dbt is that everything is built around the idea of a SQL model that fits into an overarching DAG. This is helpful for members of the team, who can always think about how the model they’re working on fits into the DAG.

Dagster is built on the same principles, and we liked the way our team members could think of the work they were doing as a solid that fits into a pipeline.

But to take full advantage of Dagster, you also want to leverage additional abstractions beyond the DAG, like modes, presets, and resources. We wanted to make the power of these abstractions available to our users, but we didn’t want learning about Dagster concepts to be a barrier to getting work integrated.

So we spent a lot of time thinking about how to standardize our processes. Our goal was to make it possible for someone to spin up a Dagster pipeline in a single PR during their first week on the job.

Standardized deployments for the entire development lifecycle

Before Dagster, deploying new data pipelines involved a lot of effort and risk. Usually, someone would build something on their local machine, we would configure the bare minimum AWS resources to support it, and we would throw it into production to test. This led to drawn out deployments with numerous back and forth conversations to resolve issues. As a result, getting new data workflows into production was a burden and making changes to existing pipelines was generally avoided.

Instead we wanted to frontload testing and invest a lot in our workflow to ensure a smooth transition through the development lifecycle and give users the tools they needed for their pipelines from the beginning.

One reason our Snowflake and dbt environment are effective is that every user can experiment with their models safely within a cloned database, run unit tests, and know that their work will be reflected in production when it’s merged. Our goal was to get to that same level of confidence with Dagster.

We decided on 4 key deployments within our Dagster project, each of which is tied to a stage in the development lifecycle and a deployment target for running Dagster, whether on a local machine or on AWS:

DeploymentLocationErrors CaughtFeedback
localLocal, Python virtual environmentEnsure pipelines compile and check logic of solids against locally-stored, pre-configured data< 30 seconds
docker composeLocal, DockerEnsure that the pipeline fits into the overall Dagster application and that all images are created correctly. Also test any schedules or sensors.1 min
devAWS dev accountEnsure integration testing and that all AWS permissions are configured correctly5-10 mins
prodAWS prod account 5-10 mins

These deployments allow our team to take advantage of Dagster features without necessarily knowing all the details of the abstraction layer.

We use environment variables to filter the Dagster objects (modes, presets, resources, and schedules) that are available in each deployment. This makes sure that users have the facilities they need at each stage of their dev work, while preventing confusion (trying to launch a pipeline that might not have the supporting infrastructure present) and adding safety (not having someone accidentally launch a prod pipeline from their machine).

For example, in our local deployment, users only have access to local mode. This means they can only run their pipeline with mocks and no external dependencies. This is the perfect environment to ensure the pipeline is compiling, to test the logic of solids on small, pre-configured data sets, or to execute unit tests.

This process lets users take a pipeline from their local machine to production without worrying about infrastructure details.

When a user moves past this stage to deploy using docker-compose, they will have access to both local and dev mode. Now they can run their pipelines with resources configured to use Docker services within the compose environment or connections to sandbox environments.

Finally, when a user is ready to put out a PR, our CI runs final integration tests on our Dagster project in our AWS dev and prod accounts.

This process lets users take a pipeline from their local machine to production without worrying about infrastructure details. Whether they are setting up scheduled jobs or creating an event driven workflow with sensors, all configuration details can be maintained by the user, in Python, alongside their pipeline code.

A pull request passing through our dev mode in CI

Light-weight infrastructure for local development

Designing our workflow around different deployments also allows us to use different Dagster instance configurations for different stages of development. This means users don’t need to set up a lot of local infrastructure in order to run tests; the same code will run in production with production-appropriate scheduling and execution.

DeploymentModesLauncherWorkspacesDescription
locallocalDefaultSingleRunning a single Dagit locally on a single workspace. Running unit tests.
docker composelocal, devDockerAllRunning our entire Dagster project (all workspaces) as well as the daemon and any supporting infrastructure
devlocal, devAWS ECSAllOur Dagster AWS stack on our dev AWS account
prodprodAWS ECSAllProduction Dagster

The local deployment uses the default run launcher, while the docker compose deployment uses the DockerRunLauncher to isolate runs in their own containers.

Dagster running in docker-compose locally

Our two AWS deployments (on our dev and prod) accounts both take advantage of a Dagster ECS Launcher (currently a custom launcher).

Our deployments in compose, dev and prod also include the Dagster daemon to run our schedules. This means users can define the logic of their own schedules in code and test them without having to think of the relationship between Dagit and the daemon.

This flexibility was a big selling point for Dagster, since our team can benefit from all of the Dagster features without ever touching the dagster.yaml or workspace.yaml files (or even knowing that these files exist!).

Enabling self-service on the data platform

This separation of concerns means that a small core data infrastructure team can efficiently ship a data platform for users with different skillsets. We use a set of cookie-cutter templates to ensure that any new pipelines are set up with the correct modes, presets, and schedules for our deployment environments.

Dagster and the structure of our deployments are getting us to the point where everyone on the team can feel comfortable trying things out.

Now, anyone can contribute an additional pipeline, without any real coordination overhead to ensure that it will behave like everything else we’ve already stood up.

Before we started this work, people on the team would say “I want to try and do X but I don’t want to break anything.” As a data engineer, it always warmed my heart to hear people worry about the data stack. But as we continue to grow as an organization, we wanted there to be as little unnecessary coordination as possible preventing people from doing their jobs.

We want to give people across our organization the ability to iterate quickly on data projects in a safe environment. Dagster and the structure of our deployments are getting us to the point where everyone on the team can feel comfortable trying things out.

Next steps

We are still getting up to speed with everything Dagster can do, and we’re excited about streamlining more of our processes. For example, we have a number of isolated workflows running in one-off environments (like AWS Step Functions) that would be better consolidated on Dagster. We also have microservices that might be better expressed as sensors within the Dagster system. We’d like to use the Dagit UI as a unified interface for these functions, some of which lie outside of the sphere of analytics proper.

If you’re interested in helping us build out the data team at Drizly, we’re hiring. And if you want to hear more about our data stack, check out our talk from Coalesce last year.