Clippd is a unique golf performance tracking platform that uses artificial intelligence, machine learning, and advanced data science to create a golfer's "unique golfing DNA." By analyzing patterns across all activities, Clippd contextualizes performance based on factors like weather conditions, course difficulty, and historical performance for individual golfers and 200+ college golf programs. Unfortunately, Chief Data Officer Chris Robertson and his team had inherited an expensive, inflexible, and (worst of all) highly manual data system that was impeding the company’s ability to keep up with growing customer demand for Clippd’s data-intensive services.
Key Results
- Data availability delays: From hours -> zero
- Manual pipeline labor: 8+ hours (a full work day) each week -> zero
- Rapid scale: From frozen growth -> comprehensive ELT pipeline supporting analytics for 200+ college golf programs and multiple downstream use cases
- Democratized data access: Black-box system -> accessible data with organization-wide visibility
Strangled by manual bottlenecks
“Someone had to go in manually, start up the instance, do all the processing in the morning, which meant that you don't have data available for the first few hours of the day.”
Beyond a forced wait for fresh data, this daily routine was also costing Clippd’s technical team approximately 8 hours per week in manual data operations, eventually driving them to a critical inflection point. Users across the company demanded data for insights that could grow the business, but legacy infrastructure blocked Clippd from becoming the data-centric organization it needed to be.
The company had inherited a setup from early consulting work, along with significant operational friction that threatened Clippd’s ability to scale. The problems were multifaceted and interconnected:
- Manual daily operations: Every morning, someone had to manually start up instances and process data, creating hours of delay before fresh data was available.
- Limited accessibility: Only one or two people could handle data operations, creating bottlenecks that prevented broader organizational engagement with data.
- Lack of native integrations: Chris says it could be hard to get ingestion working correctly because the inherited data stack lacked proper integrations for their other tooling: “We had to fudge some things a little bit to get it to work.”
- Scalability blockers: Processing larger datasets required creative workarounds on single machines, hitting walls that threatened their ability to support their growing customer base.
- Expensive licensing model: Due to the cost structure of the inherited toolset, access was restricted to a small number of users. “The full team didn't really have access. They were using the data but didn't really know what was going on under the hood in the data pipeline,” explains Chris. “We were living with quite low observability over the pipelines and processing.”
- Black-box operations: The lack of visibility meant most team members couldn't understand or troubleshoot data processes, slowing things down even further.
“We needed to move to a more robust system – something a bit more grown-up,” Chris says. But the team realized they needed more than just a technical upgrade. They needed to build a data platform that could transform Clippd into a truly data-driven organization.
Why Clippd chose Dagster
Chris quickly settled on dbt as Clippd’s transformation layer (but ruled out dbt Cloud because there was no integration with ClickHouse, Clippd’s analytics database). After that, he says, “It was basically, how do we decide on the orchestration layer and the ingestion parts of this ELT pipeline?”
After evaluating multiple orchestration options, including Airflow and Prefect, Clippd chose Dagster for its data-centric approach that aligned perfectly with their analytics-focused use case. “I wasn’t unhappy with the other options, but it was clear that Dagster just fit our use case much better,” Chris says. The decision came down to several key factors:
- Data-centric orchestration: Unlike task-focused tools, Dagster's asset-based approach matched how Clippd thought about its analytics data pipelines.
- Native dbt integration: Dagster's ability to work at the model level rather than packaging everything into large dbt jobs gave them the granular control they needed.
- Modern developer experience: The platform offered clean development workflows and local testing capabilities that the team valued.
- Unified control plane: Comprehensive visibility across their entire data ecosystem was essential for breaking open the black-box pipelines the team had struggled with under their inherited toolset.
- Scalability without complexity: Dagster could grow with their needs while still maintaining simplicity for their lean team.
Dagster’s native dbt support was particularly compelling. “The native dbt support, where you can go down to the model level, was really big for us,” Chris explains, “because with Airflow you have to package things up into big dbt jobs to run as Airflow DAGs. That wasn't really something we wanted to do.”
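To make the model-level idea concrete, here is a minimal sketch in plain Python of what selecting a single model and its upstream dependencies looks like, as opposed to rerunning one large packaged job. It uses only the standard library and is not Dagster's or dbt's API. The model names and the `MODEL_DEPS` mapping are hypothetical and are not taken from Clippd's project.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt-style models mapped to the models they depend on.
MODEL_DEPS = {
    "stg_rounds": set(),
    "stg_players": set(),
    "int_round_scores": {"stg_rounds", "stg_players"},
    "fct_player_performance": {"int_round_scores"},
    "fct_revenue": set(),
}


def upstream_closure(target: str, deps: dict[str, set[str]]) -> set[str]:
    """Return the target plus every model it transitively depends on."""
    selected: set[str] = set()
    stack = [target]
    while stack:
        model = stack.pop()
        if model not in selected:
            selected.add(model)
            stack.extend(deps[model])
    return selected


def run_order(target: str, deps: dict[str, set[str]]) -> list[str]:
    """Order only the models needed to rebuild the target, dependencies first."""
    needed = upstream_closure(target, deps)
    subgraph = {model: deps[model] & needed for model in needed}
    return list(TopologicalSorter(subgraph).static_order())
```

Calling `run_order("fct_player_performance", MODEL_DEPS)` returns only the four models that feed that table and leaves the unrelated `fct_revenue` model untouched, which is the kind of granularity the quote describes.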
The team appreciated that Dagster could serve as both an immediate solution to their analytics pipeline needs and a foundation for more sophisticated use cases in the future.
Building a modern ELT foundation with Dagster
Clippd implemented a comprehensive modern data stack with Dagster as the orchestration layer at its core. The architecture brought together best-in-class tools while maintaining the simplicity and reliability the team needed:
- Ingestion layer: Airbyte connects to multiple sources, including processed golf performance data, product analytics platforms, accounting systems, and other enterprise data sources, and consolidates the information into Clippd’s ClickHouse analytics cluster.
- Orchestration: Dagster manages the entire pipeline lifecycle with comprehensive observability.
- Transformation: dbt models integrated natively with Dagster for granular control and monitoring.
- Storage: ClickHouse analytics database optimized for their analytical workloads.
- Visualization: Tableau and Metabase for downstream analytics and reporting.
The migration strategy was both swift and thorough. Rather than attempting a big-bang replacement, the team ran both systems in parallel for a few weeks while focusing on getting their essential reporting requirements operational in Dagster as quickly as possible. This approach allowed them to maintain business continuity while building confidence in the new platform.
Once the core functionality was stable, they continued expanding their capabilities far beyond what they had previously achieved. The team took advantage of Dagster's flexibility to implement more sophisticated monitoring, alerting, and data quality checks that simply were not possible in their previous setup.
Ultimately, Clippd’s new data stack leveraged Dagster's modern architecture principles. It features automated scheduling, comprehensive logging, and Slack-integrated alerting that keep the team informed of pipeline status without overwhelming them with noise.
“Even though I'm not hands-on every day, the fact that I can dip into the Dagster UI to see what's going on, see how things look, and then if I want to look into something, I can just get into the code base as well. That’s really massive. No more data-goes-in-and-comes-back-out black box.”
Transforming your data and your organization at the same time
The transformation has fundamentally changed how Clippd operates as a data-driven organization. Beyond the immediate operational benefits, Dagster has enabled a cultural shift where data accessibility and reliability are no longer barriers to innovation.
Operational excellence
- Eliminated manual operations: Pipelines now run automatically at 4 AM, ensuring fresh data is always available when stakeholders log in first thing in the morning
- Freed up significant time: The team saves the 8+ hours per week previously spent on manual operations, equivalent to a full day of work
- Improved reliability: Automated monitoring and Slack alerts provide proactive issue detection instead of waiting for dashboards to break
- Better resource allocation: Time previously spent on manual operations is now invested in extracting insights and building data products
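The daily 4 AM run and the proactive alerting described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Clippd's configuration or Dagster's API: the function names `next_run` and `is_stale` are hypothetical, times are naive local datetimes, and the rule for what counts as stale is an assumption. The same schedule written as a cron expression would be `0 4 * * *`.

```python
from datetime import datetime, time, timedelta

# 4 AM daily, per the article; the equivalent cron expression is "0 4 * * *".
DAILY_RUN_TIME = time(hour=4)


def next_run(now: datetime) -> datetime:
    """Return the next 4 AM strictly after `now`."""
    candidate = datetime.combine(now.date(), DAILY_RUN_TIME)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def is_stale(last_success: datetime, now: datetime) -> bool:
    """Assumed rule: data is stale if the last successful run finished
    before the most recent scheduled 4 AM run."""
    most_recent_scheduled = next_run(now) - timedelta(days=1)
    return last_success < most_recent_scheduled
```

A check like `is_stale` is the sort of condition that could trigger an alert before anyone notices a broken dashboard.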
Platform democratization
- Organization-wide visibility: The entire team can now engage with their data platform, not just one or two specialists
- End-to-end transparency: Team members can examine the UI, understand pipeline status, and dive into the codebase when needed
- Faster iteration cycles: New models can be developed in dbt, integrated with Dagster, and deployed to production quickly
- Increased stakeholder engagement: The platform's reliability and the team's increased agility have driven adoption across the business
Cultural transformation
- Data-driven decision making: The organization has evolved from a company that happened to have data to one where data is central to operations and competitive strategy.
- Positive feedback loops: As stakeholders see how quickly the team can deliver new analytics, they become more invested in data-driven approaches.
- Improved trust: Consistent, reliable data has built confidence across the organization that insights are accurate and available when needed.
The platform has become a catalyst for broader organizational change, Chris says. “Now other areas of the business are actively seeking to lean on the data more for doing more analytics-driven approaches to their work.”
“We're able to build new models in dbt, have them show up in Dagster, and move very quickly to actually deploying that in production.”
Advanced analytics and machine learning
With their foundational data platform solidly established, Clippd is positioned to tackle more sophisticated use cases that were previously out of reach. The team plans to expand their medallion architecture with bronze, silver, and gold data layers to support advanced analytics and machine learning initiatives around customer lifetime value and churn modeling.
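For readers unfamiliar with the medallion pattern, the sketch below shows the three layers in plain Python: bronze holds raw records as ingested, silver holds cleaned and typed records, and gold holds business-level aggregates. All data, field names, and function names here are hypothetical and purely illustrative. In practice each layer would more likely be a set of dbt models than Python functions.

```python
from collections import defaultdict

# Bronze: hypothetical raw records as they might land from ingestion.
BRONZE_ROUNDS = [
    {"player": "alice", "score": "72"},
    {"player": "alice", "score": "75"},
    {"player": "bob", "score": "81"},
    {"player": "bob", "score": ""},    # missing score
    {"player": None, "score": "70"},   # missing player
]


def to_silver(bronze):
    """Silver: drop incomplete rows and cast scores to integers."""
    return [
        {"player": row["player"], "score": int(row["score"])}
        for row in bronze
        if row["player"] and row["score"]
    ]


def to_gold(silver):
    """Gold: aggregate to one average score per player."""
    scores = defaultdict(list)
    for row in silver:
        scores[row["player"]].append(row["score"])
    return {player: sum(vals) / len(vals) for player, vals in scores.items()}
```

Downstream work such as churn or lifetime-value modeling would typically read from the gold layer, so that models are trained on consistent, validated inputs.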
The reliable foundation Dagster provides gives them confidence to pursue these advanced capabilities. “I'm pretty comfortable with how we're able to report on the things that exist within the business,” Chris says. “Now we can lean into that maturity curve to think about how we explain what’s happening now, so we can start to predict what's going to happen in the future.”
The company is also exploring additional data quality implementations, potentially integrating tools like Great Expectations through Dagster's native support, and expanding their automated pipeline coverage to include more intermittent data science workloads that will support their ever-growing customer base of college golf programs and, of course, individual duffers everywhere.