When PostHog's enterprise customers couldn't load their Web Analytics dashboards due to billions of monthly events, the team turned to Dagster and transformed weekend manual backfills into automated, reliable pipelines that power customer-facing features at scale.
KEY RESULTS:
- No more weekend firefighting: Eliminated the manual backfill monitoring and pod restarts that kept engineers babysitting customer-facing pipelines over weekends.
- Faster incident response: Cut troubleshooting time from days of debugging to hours with built-in observability and Slack-to-run linking.
- Enterprise-scale performance: Enabled Web Analytics dashboards to serve the largest customers, ingesting billions of events per month, without timeouts or failures.
- Organic team adoption: Grew from 6 to roughly 20 engineers using Dagster across multiple product teams in just 3 months.
- Rapid feature delivery: Reduced typical feature development cycles from 1-2 days down to same-day or next-day shipping.
The Challenge
PostHog builds developer tools that help product engineers ship successful products. Their Web Analytics feature offers a streamlined dashboard with essential metrics like visitors, views, sessions, bounce rate, and conversions. It's designed for early-stage founders, engineers, and technical marketers who need something more focused than full product analytics.
But there was a critical problem. PostHog’s customers were ingesting billions of events per month, and their Web Analytics dashboards were failing under the load. Queries were timing out, and dashboards were returning errors instead of insights. PostHog's customers were feeling the pain.
"Very large customers, which operate at billion-scale monthly events ingestion, they just couldn't use their web analytics dashboard," explained Lucas Ricoy, an engineer on the Web Analytics team at PostHog. "Slow or failed queries were a major pain point and directly affected users' experience on Web Analytics."
This wasn't some backend quirk the engineering team could ignore. Paying customers couldn't access a product they were paying for.
The team had built their data processing workflows using Django management commands and lightweight orchestration. This worked fine at a smaller scale, but as PostHog's customers grew, the infrastructure couldn't keep up.
Backfills were particularly brutal. Lucas's manager, Robbie, would kick off a backfill job over the weekend and then spend his time monitoring a Kubernetes pod to make sure it didn't crash. If the pod died, the entire backfill died with it. There was no retry logic, no state management, just manual restarts and crossed fingers.
"We basically had to keep the state in our heads while running them," Lucas said. "We could have improved them, but for the level of complexity, they were just not enough anymore."
The team had other options for orchestration. They could use Celery for task scheduling or Temporal for workflow management. But neither was built for the kind of data-intensive processing that powers a product feature at scale. They needed something designed specifically for data pipelines, with partitioning, backfills, and asset-based orchestration built in.
The team knew they had to fix it. PostHog's products needed to keep working at the scale the company was promising customers.
The Solution
Dagster had already been recommended internally at PostHog for data processing workflows. James Greenhill, who leads the ClickHouse team, had been advocating for it. When Lucas evaluated options for powering Web Analytics at scale, Dagster stood out immediately.
"We could've gone with Celery or Temporal for Web Analytics, but Dagster had built-in features for data processing that made it extra helpful for our use case: Partition and backfill definitions, the Kubernetes operator, retry capabilities, the UI for launching ad-hoc materializations," Lucas explained.
The team built pre-aggregation pipelines that power the customer-facing Web Analytics dashboards. Instead of querying billions of raw events on every page load, Dagster orchestrates pipelines that process the data into specialized materialized views. When customers load their dashboards, they're querying pre-computed aggregates that return in milliseconds instead of timing out.
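The idea behind pre-aggregation can be sketched in a few lines of plain Python. This is an illustration, not PostHog's actual code: the event schema (`timestamp`, `team_id`, `session_id`, `distinct_id`) and the function name are assumptions, and in production the rollup lives in ClickHouse materialized views rather than in-process dictionaries.

```python
from collections import defaultdict
from datetime import datetime

def preaggregate_daily(events):
    """Roll raw web-analytics events up into one row per (day, team).

    Each event is a dict with 'timestamp' (ISO string), 'team_id',
    'session_id', and 'distinct_id'. The rollup keeps only the counts the
    dashboard needs, so dashboard reads never scan the raw event table.
    """
    rollup = defaultdict(lambda: {"views": 0, "visitors": set(), "sessions": set()})
    for e in events:
        day = datetime.fromisoformat(e["timestamp"]).date().isoformat()
        bucket = rollup[(day, e["team_id"])]
        bucket["views"] += 1
        bucket["visitors"].add(e["distinct_id"])
        bucket["sessions"].add(e["session_id"])
    # Collapse the uniqueness sets into plain counts, the final form a
    # pre-computed aggregate table would store.
    return {
        key: {
            "views": b["views"],
            "visitors": len(b["visitors"]),
            "sessions": len(b["sessions"]),
        }
        for key, b in rollup.items()
    }
```

A dashboard query then reads one small rollup row per day instead of re-counting billions of raw events on every page load, which is what turns timeouts into millisecond responses.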
The heavy computation happens in ClickHouse, but Dagster orchestrates the entire workflow. The backfill policies were transformative. Lucas could define partition strategies based on customer data volumes (hourly for trillion-row customers, daily or monthly for smaller workloads) and let Dagster handle execution. If something failed, Dagster would retry automatically. If a pod crashed, a new one would spin up and continue from where it left off.
"Being able to define a backfill partition and just leave Dagster running is great for us," Lucas said. "Right now it's just a matter of configuring the dialogue and leaving it running."
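The volume-based partition strategy described above can be sketched as a simple sizing rule. The thresholds and function name here are illustrative assumptions, not PostHog's configuration; the case study only states that trillion-row customers get hourly partitions while smaller workloads get daily or monthly ones.

```python
def partition_granularity(monthly_events: int) -> str:
    """Pick a backfill partition size from a customer's monthly event volume.

    Smaller partitions mean each unit of work is cheap to retry when a pod
    dies, which is what lets the orchestrator resume instead of restarting
    a whole backfill from scratch. Thresholds below are hypothetical.
    """
    if monthly_events >= 1_000_000_000_000:  # trillion-row customers: tiny slices
        return "hourly"
    if monthly_events >= 10_000_000:         # mid-size workloads
        return "daily"
    return "monthly"                          # small workloads: coarse is fine
```

The design trade-off is the one the article describes: finer partitions cost more scheduling overhead but cap the blast radius of any single failure, so a crashed pod only forfeits one hour of work rather than an entire weekend run.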
The team set up Slack alerts for pipeline failures. When something breaks in a customer-facing feature, every second counts. Now they could click directly from Slack into the specific Dagster run, see the full stack trace, and understand what went wrong. The observability was miles ahead of their previous setup.
Testing was straightforward, too. Engineers could run pipelines locally with different configurations before deploying customer-facing changes. The Python-based approach meant they could use familiar tools and workflows to build product features, not just internal data pipelines.
The Results
The impact on the customer experience was immediate and dramatic.
Those enterprise customers with billions of monthly events went from seeing timeout errors to loading their dashboards almost instantly. The pre-aggregation pipelines handled production scale without breaking a sweat. Even though the feature is still being evaluated and isn't yet enabled for every customer, the impact was substantial.
"It's almost instant because of the work we have done there," Lucas said. "It's really using top-notch technologies to build this orchestration and keep it working without having to play whack-a-mole every day."
The weekend backfill monitoring disappeared completely. There were no more manual pod restarts for customer-facing features, no more keeping the state in people's heads, and the orchestration powering the product just worked.
Product development velocity improved significantly. The team could ship new Web Analytics features and fix customer issues faster because they weren't constantly firefighting infrastructure problems. Tasks that previously took 1-2 days could now be done in hours.
"There is no frustration anymore on shipping that stuff," Lucas noted. "It's good enough. It's great for us that we're not having pain points on building those things anymore."
When teams need to ship new features quickly, they can turn them around in days or even hours. "Right now, I think we are like a day or two. We can stitch stuff together to test it internally in maybe some hours," Lucas explained. For an organization shipping customer-facing features with that level of data complexity and scale, that's remarkable velocity.
The business impact extended beyond just the Web Analytics team. When PostHog moved from Dagster open source to Dagster+ about two months ago, they discovered features that improved their ability to ship reliable products. The insights dashboard gave visibility into pipeline performance and costs. The deployment management made it easier to work across environments when shipping features.
Most telling was the organic growth in adoption. When Lucas joined PostHog eight months ago, only about six people were using Dagster. Three months later, that number had tripled to 16-20 people across multiple teams. Other product teams started using Dagster to power their features: revenue analytics, experiments, error tracking, and the Max AI assistant.
"It saves us so much time to not have to build orchestrations ourselves, and it's way simpler to write a Python file than build all of the Airflow internal details to make it run," Lucas explained. "So it's like no pain point. It really just works."
The team built additional product capabilities on top of the core orchestration. They created asset checks that automatically validate data accuracy for customer-facing metrics by comparing raw and pre-aggregated results, with a goal of less than 1% variance. They integrated with the Open Exchange Rates API to provide currency conversion data to customers. They set up sensors to track execution metrics.
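The variance comparison at the heart of those asset checks can be sketched as follows. This is a minimal illustration under assumed names (`check_accuracy`, plain dicts of metric values), not PostHog's implementation, which runs as Dagster asset checks against ClickHouse.

```python
def check_accuracy(raw: dict, pre: dict, tolerance: float = 0.01) -> list:
    """Compare each dashboard metric computed from raw events against its
    pre-aggregated counterpart; return the names of metrics that drift
    beyond `tolerance` (default 1%, matching the stated goal).
    """
    failures = []
    for metric, raw_value in raw.items():
        pre_value = pre.get(metric, 0)
        if raw_value == 0:
            # Avoid division by zero: a zero raw count must match exactly.
            ok = pre_value == 0
        else:
            ok = abs(pre_value - raw_value) / abs(raw_value) <= tolerance
        if not ok:
            failures.append(metric)
    return failures
```

Wired into a pipeline, a non-empty failure list would fail the check and fire the Slack alert, so drift in customer-facing metrics surfaces before customers notice it.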
Perhaps most importantly, the team shifted from reactive firefighting to proactive product development. Instead of spending 30-40% of their time debugging infrastructure issues affecting customers, they could focus on building features and improving the product experience.
"Our product dashboards are now both faster and more reliable, and the data platform requires less manual oversight," Lucas said.
Looking Ahead
PostHog continues to expand its use of Dagster to power more product features across the organization. Teams are migrating existing Django management commands to Dagster-based workflows. The ClickHouse team uses Dagster heavily for operations that support multiple product features.
The experiments team is exploring moving more product workflows to Dagster. Lucas wants to expand the usage of asset checks for better data quality monitoring across customer-facing features.
"I would like to expand our usage of assets and asset checks for figuring out data quality," Lucas said. "For some things, I can see a lot of use cases we could use that for a better data quality product."
PostHog is also exploring AI-powered analytics features. They've started building PostHog AI, including an assistant that leverages their Dagster-orchestrated data layer to automate AI model evaluation comparisons.
As PostHog continues to scale and add more data-intensive product features, having reliable orchestration becomes even more critical. The foundation is solid, and the team is confident it can handle whatever comes next.
"Engineers appreciate the observability, local development ease, and clean abstractions," Lucas said. "It has replaced manual operations with reproducible, debuggable workflows."
Key Takeaways
- Dagster powers customer-facing product features, not just internal analytics, enabling PostHog to serve enterprise customers at scale
- Backfill policies eliminated manual monitoring and made large-scale data processing manageable for production workloads
- Product velocity improved dramatically: Feature development time dropped from 1-2 days to hours
- Reliability directly impacts customer experience: Fast, accurate dashboards powered by Dagster improve user satisfaction
- Developer experience enables product innovation: Local testing and Python-based development accelerate shipping new features
- Organic adoption signals product-market fit: Teams chose Dagster because it solved real product engineering problems