The central nervous system — intent classification, source routing, prompt building, and validation in one governed service.
Instead of letting the VS Code extension directly talk to Snowflake and Power BI, insert a small API service between them. This gives you one enforcement point instead of pushing security logic into the extension.
VS Code Chat Participant
│
▼
Orchestration API ← auth, validation, routing, caching, audit
│ │
▼ ▼
Power BI Snowflake
The orchestration API handles:
Exchange short-lived user tokens for service credentials. No raw secrets ever reach the extension.
Validate every generated query against a policy engine. Block dangerous operations before they execute.
Log every request — who asked, what was generated, what ran, what was returned. Full traceability.
Cache metadata and frequent query results. Reduce latency and API call costs.
Every user question goes through four sequential stages:
1. CLASSIFY → Determine intent type and extract entities 2. ROUTE → Pick data source (Power BI or Snowflake) 3. BUILD → Construct prompt with metadata + validate output 4. EXECUTE → Run query, format response, return to user
Stages are sequential and gated — if validation fails at stage 3, the query never reaches stage 4. The pipeline short-circuits with an explanation of what went wrong.
The orchestration service classifies user questions into one of several intent types:
| Intent | Description | Preferred Source |
|---|---|---|
kpi_lookup | Single KPI value (e.g., "What is current PMPM?") | Power BI |
kpi_trend | KPI over time (e.g., "Show PMPM trend by month") | Power BI |
comparison | Compare segments (e.g., "PMPM by region") | Power BI |
exploration | Ad-hoc detail query (e.g., "List denied claims in Q3") | Snowflake |
report | Generate/embed a full report | Power BI |
export | Export data to CSV/file | Either |
Once the source is selected, the orchestration service constructs a targeted prompt:
SYSTEM: You are a DAX expert. Generate a DAX EVALUATE query for the semantic model below. Return ONLY the DAX, no explanation. MODEL CONTEXT: Tables: {relevant_tables} Measures: {relevant_measures} Business glossary: {term_definitions} FEW-SHOT EXAMPLES: Q: "current PMPM by LOB" A: EVALUATE SUMMARIZECOLUMNS(...) USER QUESTION: {user_question}
Key principles:
Execute with a row limit (e.g., 100 rows). Return a preview table and chart suggestion. No side effects.
Execute the full query and export results to CSV or a Power BI embedded report URL. Requires explicit user confirmation.
The orchestration service maintains conversation context so follow-up questions resolve correctly:
Turn 1: "Show PMPM by region" → Executes: SUMMARIZECOLUMNS('Dim Region'[Region], "PMPM", [PMPM]) Turn 2: "Break that down by LOB" → Resolves "that" to previous PMPM-by-region query → Adds LOB dimension to the existing query
Memory is scoped to the session and automatically cleared when the chat window closes.