Blog
How Compass Turns Questions Into Queries

How Compass Turns Questions Into Queries

October 23, 2025
How Compass Turns Questions Into Queries
How Compass Turns Questions Into Queries

Go behind the scenes of Compass, Dagster’s analyst copilot, to see how it transforms plain-language questions into precise, optimized SQL queries. Learn how each step of the query-generation process helps analysts move faster and stay focused on insights.

Data professionals rely heavily on query planners, the unsung heroes of every database system. A query planner takes a high-level declarative query (like SQL) and figures out the most efficient way to execute it against the physical database. That process involves multiple layers of parsing, optimization, and execution.

Compass follows a similar principle. But instead of starting with a structured declarative query, Compass begins with something much more flexible—and much more ambiguous: natural language. The end goal is still the same, a well-formed query (or series of queries) against your database, but there are several additional steps needed to bridge the gap between human language and SQL execution.

This post breaks down the lifecycle of a query inside Compass. We’ll use an example question: “Show me Q3 sales by region?” and follow how Compass interprets, structures, and executes it.

1. Context Search

Large Language Models (LLMs) are powerful, but they can’t know everything. Their accuracy is bounded by the context they’re given. Your data platform isn’t just tables and schemas, it also embodies years of institutional knowledge: naming conventions, business logic, edge cases, and custom terminology. An uninformed LLM wouldn’t know that your “net sales” metric excludes refunds or that you always use billing_address_state rather than shipping_address.

That’s why Compass emphasizes context from the very beginning. Each instance of Compass is unique to your organization. Over time, you can shape its understanding by providing definitions, feedback, and guidance.

When you ask a question, Compass extracts key concepts and searches its context store for relevant definitions, documentation, or prior clarifications. This grounding ensures that the SQL it generates reflects your business reality, not just the literal words in your query.

2. Dataset Discovery

In SQL, you must be explicit: “SELECT * FROM table_name”. With natural language, Compass has to infer what the right dataset is before it can even begin constructing SQL.

To do this, Compass starts broad. It scans all available datasets and looks at their names, descriptions, metadata, and business keywords. Then, it narrows down to the most likely candidates.

Again, context matters. If your company has multiple tables tracking “sales,” Compass will rely on your guidance and prior feedback to prioritize the right one. The more you refine Compass’s understanding of where your trusted data lives, the better its results will be.

3. Schema Retrieval

Once Compass identifies candidate datasets, it dives deeper. This step involves retrieving schema information, columns, data types, constraints, as well as any additional metadata tied to the tables. With this richer picture, Compass can determine which fields are most relevant to your request.

For example, if you asked for “sales by region,” Compass might confirm that region_name is the appropriate column and that sale_amount should be aggregated because those associations are captured in the schema and context store.

4. SQL Generation

Now the pieces come together. Compass uses an agent to synthesize multiple inputs:

  • The natural language of your original request
  • Business context retrieved from the context store
  • Schema details and metadata
  • The SQL dialect of your underlying system (Snowflake, BigQuery, Redshift, etc.)
  • Prompt guidance for query style and format

This synthesis produces a SQL statement that is both syntactically valid and semantically aligned with your organization’s business rules. In other words, it’s not just “any query that runs”, it’s the right query.

5. Query Execution

With SQL in hand, Compass connects to your data warehouse using appropriate drivers. At this stage, we enter familiar territory. The database itself is responsible for optimizing and executing the query.

Compass manages execution by handling things like result size limits, timeouts, and retries. Once the warehouse finishes, Compass retrieves the results safely, truncating or sampling if necessary to avoid overwhelming the user with excessive data.

6. Result Processing

Finally, Compass turns raw query output into insight. The agent analyzes the results, summarizes key findings, and, when appropriate, generates visualizations. Crucially, Compass doesn’t just give you an answer, it provides the methodology behind it. This transparency allows you to validate the logic and build trust in the output.

Compass also suggests follow-up questions, enabling iterative exploration. If your initial query was “sales by region,” Compass might prompt you to compare regions over time, or to drill into the top-performing states. This guided discovery is part of what makes Compass more than just a query generator.

7. Feedback Loop

These are the high level steps that Compass navigates to generate queries. You can get the details of each step by clicking on the “See all steps” button within the thread. Within there you may also see Compass do certain steps multiple times as it refines its search. It may also execute multiple queries to ensure better results.

And asking Compass a question is the beginning of a conversation. You can request followups, ask for more clarification, or store additional context as you explore.

How to Think About Your Compass Queries

Much like you don’t think about every internal step of a database query planner, you don’t need to micromanage Compass. But it helps to understand the sophistication at work behind each answer.

Your Compass instance is tailored to your organization. The richer the context you provide, the sharper and more accurate your results will be. By grounding every query in your institutional knowledge, Compass ensures that you’re not just querying data, you’re asking questions in the language of your business and getting the best answers possible.

Have feedback or questions? Start a discussion in Slack or Github.

Interested in working with us? View our open roles.

Want more content like this? Follow us on LinkedIn.

Dagster Newsletter

Get updates delivered to your inbox

Latest writings

The latest news, technologies, and resources from our team.

Multi-Tenancy for Modern Data Platforms
Webinar

April 7, 2026

Multi-Tenancy for Modern Data Platforms

Learn the patterns, trade-offs, and production-tested strategies for building multi-tenant data platforms with Dagster.

Deep Dive: Building a Cross-Workspace Control Plane for Databricks
Webinar

March 24, 2026

Deep Dive: Building a Cross-Workspace Control Plane for Databricks

Learn how to build a cross-workspace control plane for Databricks using Dagster — connecting multiple workspaces, dbt, and Fivetran into a single observable asset graph with zero code changes to get started.

Dagster Running Dagster: How We Use Compass for AI Analytics
Webinar

February 17, 2026

Dagster Running Dagster: How We Use Compass for AI Analytics

In this Deep Dive, we're joined by Dagster Analytics Lead Anil Maharjan, who demonstrates how our internal team utilizes Compass to drive AI-driven analysis throughout the company.

Making Dagster Easier to Contribute to in an AI-Driven World
Making Dagster Easier to Contribute to in an AI-Driven World
Blog

April 1, 2026

Making Dagster Easier to Contribute to in an AI-Driven World

AI has made contributing to open source easier but reviewing contributions is still hard. At Dagster, we’re improving the contributor experience with smarter review tooling, clearer guidelines, and a focus on contributions that are easier to evaluate, merge, and maintain.

DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform
DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform
Blog

March 17, 2026

DataOps with Dagster: A Practical Guide to Building a Reliable Data Platform

DataOps is about building a system that provides visibility into what's happening and control over how it behaves

Unlocking the Full Value of Your Databricks
Unlocking the Full Value of Your Databricks
Blog

March 12, 2026

Unlocking the Full Value of Your Databricks

Standardizing on Databricks is a smart strategic move, but consolidation alone does not create a working operating model across teams, tools, and downstream systems. By pairing Databricks and Unity Catalog with Dagster, enterprises can add the coordination layer needed for dependency visibility, end-to-end lineage, and faster, more confident delivery at scale.

How Magenta Telekom Built the Unsinkable Data Platform
Case study

February 25, 2026

How Magenta Telekom Built the Unsinkable Data Platform

Magenta Telekom rebuilt its data infrastructure from the ground up with Dagster, cutting developer onboarding from months to a single day and eliminating the shadow IT and manual workflows that had long slowed the business down.

Scaling FinTech: How smava achieved zero downtime with Dagster
Case study

November 25, 2025

Scaling FinTech: How smava achieved zero downtime with Dagster

smava achieved zero downtime and automated the generation of over 1,000 dbt models by migrating to Dagster's, eliminating maintenance overhead and reducing developer onboarding from weeks to 15 minutes.

Zero Incidents, Maximum Velocity: How HIVED achieved 99.9% pipeline reliability with Dagster
Case study

November 18, 2025

Zero Incidents, Maximum Velocity: How HIVED achieved 99.9% pipeline reliability with Dagster

UK logistics company HIVED achieved 99.9% pipeline reliability with zero data incidents over three years by replacing cron-based workflows with Dagster's unified orchestration platform.

Modernize Your Data Platform for the Age of AI
Guide

January 15, 2026

Modernize Your Data Platform for the Age of AI

While 75% of enterprises experiment with AI, traditional data platforms are becoming the biggest bottleneck. Learn how to build a unified control plane that enables AI-driven development, reduces pipeline failures, and cuts complexity.

Download the eBook on how to scale data teams
Guide

November 5, 2025

Download the eBook on how to scale data teams

From a solo data practitioner to an enterprise-wide platform, learn how to build systems that scale with clarity, reliability, and confidence.

Download the e-book primer on how to build data platforms
Guide

February 21, 2025

Download the e-book primer on how to build data platforms

Learn the fundamental concepts to build a data platform in your organization; covering common design patterns for data ingestion and transformation, data modeling strategies, and data quality tips.