Discover how the Tampa Bay Rays transformed their competitive edge by processing MLB game data 50-70% faster with Dagster, delivering critical analytics within 15 minutes of game completion instead of waiting until morning.
The challenge
Just like the Oakland Athletics in Moneyball transformed baseball by finding hidden value in overlooked statistics, the Tampa Bay Rays have built their competitive advantage on superior data analytics. "I always tell everybody, they're like what do you do? And I ask, have you seen the movie Moneyball? Oh yeah. I do that. That's usually the easiest way to explain it," says Brian Plexico, Vice President of Baseball Systems.
The Rays, known for their strategic approach to resource allocation, have built their reputation on analytics-driven excellence. "Our edge is trying to get more from less," Plexico explains. "And we feel that analytics are one opportunity to do that."
But the scale of modern baseball data has exploded beyond what their legacy system could handle. "Back when I first started, we'd get 75 CSVs a day from Major League Baseball. That was the extent of it," Plexico recalls. "Now with the player tracking, ball tracking, and other sources, the data has exploded over the last 10 years or so." Today, every professional baseball game generates millions of data points: Statcast tracking systems capture player positions as coordinates, bat speed, ball trajectory, and even data from both ends of the bat to determine swing path and velocity.
Their previous system couldn't keep pace with these demands, creating mounting challenges:
- Schedule dependency hell: Rigid timing meant one delayed API response could cascade into hours of failures
- Data source chaos: 20+ providers with different formats, from MLB's Statcast to proprietary tracking systems
- Debugging nightmares: "With our current system, there's a lot more digging... you're bouncing around a lot more"
- Real-time expectations: Stakeholders need insights as soon as possible after games end, not after morning batch processes
The team needed a complete rethinking of how they processed, validated, and delivered data to decision-makers.
KEY RESULTS:
- 50-70% faster pipeline execution: Processing reduced from hours to minutes
- 15-minute data availability: Game data ready within 10 - 15 minutes vs 9 a.m. batch jobs
- One-week onboarding: New data sources integrated 2-3x faster than the previous system
- Zero-touch reliability: Critical pipelines run unattended nightly without intervention
Why the Tampa Bay Rays chose Dagster
After six months of evaluating "many different platforms" and another six months testing Dagster's open source version, Plexico and his team made a decisive choice. While "many other teams in baseball use orchestration tools that are not Dagster," the Rays deliberately went a different direction. Several key factors convinced them that Dagster was the right fit for their unique challenges.
Asset-centric thinking aligned with baseball data
"Most orchestration tools are just A, B, C, D, and there was no concept of an asset in the same way that there is in Dagster," Plexico explains. "It was a huge selling point to me that once a game ends, I can pull the data for that one game because I've got it partitioned by a game ID." This asset-based approach naturally maps to how the Rays think about their data: games, players, innings, pitches, not abstract tasks.
Developer-first philosophy
Coming from a Ruby on Rails background, Plexico appreciated Dagster's opinionated yet flexible approach. "I'm a developer by trade first, more than a data engineer. I brought a lot of my software experience into this project, and Dagster fit in that realm a lot better for me... it takes a lot of choices away from you. You just do it the Rails way, and Dagster has a lot of similarities."
Integrated debugging and observability
The team needed to quickly identify and resolve failures without extensive log investigation. "With Dagster, it's all right there in the logs because of the way the assets are laid out. It's much more obvious given the asset graph and run logs where things might have gone wrong. It's just a lot faster to get to the debugging part."
Cost-effective cloud solution
The Rays evaluated Dagster+ as equivalent to hiring another data engineer. "For us, it came down to what benefits Dagster+ provided us and it was a no-brainer for the cost," especially crucial for a team that prides itself on doing more with less.
Community and support ecosystem
Beyond the technology, the active community sealed the deal. Plexico became one of the first to complete Dagster University and is a frequent user of the Slack community's AI bot. "The customer support is a huge benefit for us."
The implementation
The Rays took a pragmatic approach to migration, implementing what they called a "one-to-one replacement" strategy. Rather than attempting a risky wholesale migration, they rebuilt critical pipelines in Dagster one at a time while maintaining operations on their legacy system. "The things we've changed, they haven't even noticed, which is awesome. It just works," Plexico notes. This strategy allowed them to prove value incrementally while ensuring zero disruption to the organization’s daily analytics needs.
Their new architecture leverages Dagster's sensor capabilities to create truly event-driven pipelines. When a game ends, sensors detect the completion and immediately trigger data ingestion rather than waiting for scheduled batch jobs. "An East Coast game goes final at 10:30 p.m., the game's marked completed by 10:55. We use a sensor and determine that the game is completed, and at that point, we can start kicking everything off."
The team integrated several key tools within their Dagster ecosystem:
- Sling integration for data movement, which Plexico calls his "favorite" feature for wrapping their existing BCP processes
- Slack integration for alerts and notifications, providing one-click access to pipeline debugging
- SQL Server integration maintaining their existing data warehouse while modernizing orchestration
A typical game pipeline processes multiple data layers: box scores and lineups arrive first, followed by play-by-play data, then granular pitch-level analytics, including Statcast tracking data that provides coordinate-level player positions and ball physics. "We get all the data and split it out because it's used for different things. We've got people that are interested in pitching, we've got people that are interested in hitting, plenty of other things we're interested in."
The results
The Rays transformed how a small-market team can compete through data. "Even 50% would probably be low-balling it," Plexico notes when describing their performance improvements.
Their game-changing approach centers on Dagster's asset-based orchestration, transforming how they think about baseball data:
- Event-driven intelligence replaces rigid schedules: "Instead of waiting till 9 a.m. to kick off this job to get every game that got played the night before, I can say, oh, this one game ended, gimme that game." Using Dagster sensors, the Rays now process individual games within 10-15 minutes of completion.
- From hours to minutes: West Coast games that end at 1 a.m. Eastern are fully processed and ready for analysis by 1:15 a.m., not 9 a.m. Stakeholders wake up to fresh insights, not stale data.
- Reliability that just works: "Something as simple as the current standings in the morning after the previous night’s games: We've written a pipeline. We drop it in. And we basically have to do nothing more than parse the JSON and it goes into the system." The result? No more frantic morning calls about missing wins in the standings.
The business impact has been substantial. With faster, more reliable data pipelines, the analytics team can focus on developing new insights rather than maintaining infrastructure. "We can move on to other things. It just runs. And that's what we were hoping for when we signed up," Plexico explains. This operational efficiency directly supports the team's ability to make data-driven decisions from the front office to the dugout, critical advantages for a team competing against opponents with significantly larger budgets.
Looking ahead
The Rays have ambitious plans for expanding their Dagster implementation over the next year. "I'm excited about asset checks," Plexico explains. "The idea of catching some of these data issues as they come in is gonna be huge for us."
Their immediate priorities include:
- Intelligent orchestration over rigid scheduling:
Moving from time-based triggers to pure event-driven workflows where "one asset triggers the next," enabling machine learning models to run the moment fresh data arrives rather than waiting for scheduled windows. - Self-service analytics democratization:
Opening the Dagster UI to the broader organization to answer the two questions that plague every data team: "Has this run yet?" and "Where does this data come from?" As Plexico notes, "The ability for everybody to just go in and see the live view... I think the power of that is unmatched." - Branch deployments for safer iteration:
As more team members contribute code, branch deployments will enable testing without risking production pipelines that process data for crucial morning meetings. - Ongoing Asset checks for data quality:
Implementing automated validation to catch issues from unreliable third-party sources before bad data reaches production systems, particularly important given the variable quality of external data feeds.
Key takeaways
- Cost-effective scaling achieved. "What Dagster+ provided us was a no-brainer for the cost", equivalent to hiring another data engineer but delivering far more value
- Developer velocity unleashed. New data sources onboarded in one week versus "two or three times as long" with the legacy system
- Team adoption without friction. Four data engineers are actively developing, with non-pure developers successfully contributing thanks to Dagster's intuitive Python approach
- Moneyball 2.0 enabled. Processing everything from basic box scores to Statcast tracking data, enabling data-driven decisions that help a small-market team compete with larger organizations
- Sleep-friendly operations. "We don't have to think about it every night... that's probably been the highlight for me so far."
- Community-powered success. Heavy use of Dagster's Slack community and AI assistant, with Plexico being one of the first to complete Dagster University




