By Pete Hunt (@floydophone)
I have a big announcement: I’ve joined Elementl to lead engineering.
At first glance it might seem odd to some for me to make the leap from Twitter, a large consumer-facing tech company, to Elementl, an up-and-coming data tools startup. But in many ways this company was the obvious next step for me.
The problem we’re solving
In 2014, I co-founded Smyte, which was later sold to Twitter. Smyte is a large-scale stream processing system that identifies spam and abuse in real time using a combination of manually curated heuristics, supervised and unsupervised machine learning, and large-scale manual review.
Our data scientists at Smyte would often prototype new signals using a combination of SQL, Python scripts, and cron jobs. They’d either put that code directly into production — often with minimal or no tests and monitoring — or ask our infrastructure team to productionize it, which disrupted the focus of that team. Once in production, it was often hard to understand how these different jobs interacted with each other, which made the system harder to maintain and evolve over time.
Post-acquisition, this was an even bigger challenge at Twitter. Some teams were adopting Airflow or trying out dbt, but there was still an ever-increasing library of Python scripts and cron jobs to create reports, ML models, and aggregate tables; balancing production readiness with iteration speed was a constant struggle. I was hearing from engineers that Airflow was cumbersome to use and that dbt wasn’t powerful enough to tackle most of our problems.
Upon asking around, I realized that this wasn’t a problem specific to Smyte or Twitter. Data engineering is an increasingly important discipline whether you’re a trendy Silicon Valley hypergrowth company, or a century-old food processing company in the middle of the country. Everyone I talked to seemed to have a similar story of a rat's nest of cron jobs and Python scripts. I knew there had to be a better way to solve this.
The product
“Isn’t Dagster just a better Airflow? How much better does it have to be in order to be successful?”
This is a question that I needed to answer before making the leap to Elementl.
A long time ago, I was a founding member of the React.js team at Facebook. As I started to learn more about Dagster, I saw many parallels between Dagster in 2022 and React in 2013.
Back in 2013, React was competing with technologies such as AngularJS 1.0 and jQuery. While there were many independent factors that contributed to React’s success over these technologies, three key characteristics really drove its adoption. When I realized that these three characteristics were shared by Dagster, I became really excited to join the team.
- Fundamentals. React and Dagster get the fundamentals right. Unlike incumbents at the time, React was designed from the ground up to support fast, reliable testing and static type checking. Dagster, too, was designed from the ground up to support these important capabilities that expert engineers expect. This could also be called “second-mover advantage”; both projects were able to learn from the projects that came before them and bake in improvements at an early stage.
- The programming model. React and Dagster fundamentally change the programming model. React’s programming model was revolutionary when it came out: declaratively model your UI as a pure function with the full power of a real programming language (JavaScript), rather than a crippled templating language. Dagster’s Software-Defined Assets allow you to model your data asset as a pure function with the full power of a real programming language (Python).
- Integrations. React and Dagster both prioritize integration with other systems. In the early days of React it was extremely important to support brownfield projects, where most of the existing code was written in another JS framework or server-side endpoint. It was easy to embed React components in existing apps, and easy to embed existing apps inside React components. Dagster, too, prioritizes integrations with existing tools in the ecosystem. For example, just like how React can model jQuery components, Dagster can natively model and orchestrate assets produced by other tools in the modern data stack, like Airbyte, Fivetran, dbt, and others.
For these reasons, I became convinced that Dagster is moving data engineering in the right direction, and I saw a path to its widespread adoption that could follow React’s.
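To make the programming-model point above concrete, here is a minimal, library-free sketch of what "model your data asset as a pure function" can look like in plain Python. It is not Dagster's API: the `asset` decorator, the `ASSETS` registry, the `materialize` helper, and the sample assets (`raw_events`, `spam_counts`) are all hypothetical names made up for illustration. The only idea it borrows is that an asset is declared as a function of its upstream assets.

```python
import inspect

# Hypothetical registry mapping an asset name to the function that computes it.
ASSETS = {}


def asset(fn):
    """Register fn as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_events():
    # Stand-in for loading rows from a source system.
    return [
        {"user": "a", "spam": True},
        {"user": "b", "spam": False},
        {"user": "a", "spam": True},
    ]


@asset
def spam_counts(raw_events):
    # Pure function: the output depends only on the upstream asset passed in.
    counts = {}
    for event in raw_events:
        if event["spam"]:
            counts[event["user"]] = counts.get(event["user"], 0) + 1
    return counts


def materialize(name, cache=None):
    """Compute an asset after recursively computing its upstream assets."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        upstream = inspect.signature(fn).parameters
        cache[name] = fn(**{dep: materialize(dep, cache) for dep in upstream})
    return cache[name]
```

Because each asset is an ordinary function, it can be unit tested by calling it directly with hand-built inputs, and the dependencies between assets are visible in the function signatures rather than hidden in a cron schedule.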
The team
Every great company has a great team behind it.
My career started in 2011 when I joined Facebook straight out of grad school. During that time I had the privilege of working with some of the best engineers in my career. Not only was Elementl founded by one of those engineers, Nick Schrock, but the company had also managed to hire many of my talented colleagues from that era. This immediately got my attention when I was considering what to do next.
Since arriving at the company, I’ve found that the team has also managed to hire a number of incredibly talented people outside of the Facebook universe. I’m proud to say that I’ve joined one of the best teams I’ve worked with in my career.
The timing
Finally, it’s a really exciting time to be at Elementl. We have a rapidly growing community, a production-ready, paradigm-shifting OSS project, and we’re getting ready to launch our first commercial offering. I couldn’t be more excited about getting to work on this problem with these fine people.
If you’re interested in joining us, shoot me a note at pete@elementl.com!