DecisionGraph is a deterministic analytics system built on an explicit ontology: metrics, dimensions, joins, and grain are first-class. Natural language (or API intents) compile to candidate query plans that must pass schema validation, cardinality checks, and policy gates before execution.
The product thesis is simple: dashboards fail when semantics are implicit. DecisionGraph makes semantics executable—so "revenue by region" always means the same grain, the same filters, and the same time spine.
Highlights:
- Semantic layer as code (versioned, reviewable) instead of tribal spreadsheet logic.
- No hallucinated SQL: LLMs may propose intent; the executor only runs whitelisted templates and parameter bindings.
- Latency-aware planning chooses pre-aggregates when available and falls back transparently.
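To make "semantic layer as code" concrete, a governed metric definition could be a small, reviewable Python object. This is a hypothetical sketch, not the project's actual schema; only `revenue_net` and the table names come from the demo below:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A governed metric: one name, one grain, one filter set (illustrative shape)."""
    name: str
    fact_table: str
    expression: str
    grain: tuple[str, ...]                 # dimensions this metric may be sliced by
    default_filters: tuple[str, ...] = ()  # applied on every compile


# Versioned in the repo and code-reviewed like any other change.
REVENUE_NET = Metric(
    name="revenue_net",
    fact_table="finance.orders_fact",
    expression="SUM(revenue_net)",
    grain=("region", "month"),
)
```

Because the definition is data, "revenue by region" can only ever resolve to this one grain and expression.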
From resume to something you can read
Matches the resume claim: schema introspection and NL-to-SQL constrained by the semantic catalog. A question goes in; compiled SQL and a result grid come out.
Natural language input
"What was revenue last month by region?"
Resolved to intent (metric, grain, time spine)—not a free-form SQL string.
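Concretely, the resolved intent could be a structured object like the following. This is a hypothetical shape for illustration; the field names are assumptions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    metric: str                     # catalog metric ID, never a raw SQL expression
    dimensions: tuple[str, ...]     # slice dimensions from the catalog
    time_grain: str                 # resolved against the governed time spine
    time_range: str                 # symbolic; bound to dates at compile time


# "What was revenue last month by region?" resolves to:
intent = Intent(
    metric="revenue_net",
    dimensions=("region",),
    time_grain="month",
    time_range="last_month",
)
```

Everything downstream (validation, compilation, caching) operates on this object, never on free-form text.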
Compiled SQL (catalog templates only)
```sql
SELECT d.region,
       SUM(f.revenue_net) AS revenue_net
FROM finance.orders_fact f
JOIN org.dim_region d ON f.region_id = d.region_id
WHERE f.order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
  AND f.order_date < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY d.region
ORDER BY revenue_net DESC;
```

Identifiers come from the catalog; the executor rejects raw tables and columns outside approved templates.
Example query result
| region | revenue_net |
|---|---|
| North | ₹ 2.4M |
| South | ₹ 1.9M |
| West | ₹ 1.1M |
Deterministic KPIs: same question → same grain and filters as the governed metric definition.
Teams want "ChatGPT for data," but production needs stable definitions, correct joins, and governed access. Text-to-SQL demos look magical until the first wrong join or ambiguous grain.
The challenge is to preserve the speed of natural language while keeping deterministic execution guarantees aligned with enterprise semantics.
- Canonical model: Encode facts, dimensions, and safe join paths in a graph; forbid ambiguous many-to-many traversals unless explicitly declared.
- Intent → plan: Parse questions into structured intent objects (metric, slice, time range), not raw SQL strings.
- Validation: Run cardinality estimates, row-level security predicates, and "explain" dry-runs before execution.
- Caching & reuse: Store signed query plans per intent hash so repeat questions hit compiled SQL, not the planner.
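The caching step can be sketched as hashing a canonical form of the intent and memoizing the compiled plan. A minimal sketch, assuming a dict-shaped intent; `compile_plan` stands in for the real template-driven compiler:

```python
import hashlib
import json

_plan_cache: dict[str, str] = {}


def intent_hash(intent: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equivalent
    # intents always hash identically regardless of key order.
    canonical = json.dumps(intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def compile_plan(intent: dict) -> str:
    # Placeholder for the real catalog-constrained compiler.
    return f"-- plan for {intent['metric']}"


def plan_for(intent: dict) -> str:
    key = intent_hash(intent)
    if key not in _plan_cache:
        _plan_cache[key] = compile_plan(intent)  # planner runs once per intent
    return _plan_cache[key]                      # repeat questions hit compiled SQL
```

Signing the cached plan (omitted here) would let the executor verify it came from the planner and not from a tampered cache entry.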
1. Under-specified business terms map to the wrong metric ID; mitigated with explicit confirmation for ambiguous matches.
2. Warehouse optimizer quirks can skew P95 latency; surfaced via plan fingerprints and regression tests per template.
3. Role-based entitlements drift from warehouse reality; the sync job must be observable.
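Plan fingerprinting against optimizer drift can be approximated by hashing a normalized EXPLAIN output per template and comparing it to a checked-in baseline. A sketch under stated assumptions: in the real pipeline the EXPLAIN text would come from the warehouse, and the baseline strings here are invented:

```python
import hashlib


def plan_fingerprint(explain_text: str) -> str:
    """Stable fingerprint of a query plan, compared across releases."""
    # Collapse whitespace so cosmetic EXPLAIN changes don't flag a regression.
    normalized = " ".join(explain_text.split())
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]


# Checked-in baseline per template (illustrative plan text).
BASELINES = {
    "revenue_by_region": plan_fingerprint("HashAggregate -> HashJoin -> SeqScan"),
}


def check_no_plan_drift(template_id: str, explain_text: str) -> bool:
    """Regression test hook: fail the build if the warehouse plan changed."""
    return BASELINES[template_id] == plan_fingerprint(explain_text)
```

A fingerprint mismatch does not mean the query is wrong, only that the optimizer chose a different plan, which is exactly the event worth investigating before P95 moves.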
1. More upfront modeling work than a vanilla text-to-SQL toy; pays off in correctness and trust.
2. Curated templates limit exotic ad-hoc queries; power users export to governed notebooks instead.
3. A stricter planner means slower feature velocity on day one, faster on day 100.
compile.py
Intent must resolve to a whitelisted template
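A minimal sketch of what this gate could look like. The template registry, binding whitelist, and `compile_intent` name are all illustrative, not the project's actual code:

```python
TEMPLATES = {
    # template_id -> parameterized SQL; structure is fixed at review time
    "metric_by_dimension": (
        "SELECT d.{dim}, SUM(f.{metric}) AS {metric} "
        "FROM {fact} f JOIN {dim_table} d ON f.{key} = d.{key} "
        "GROUP BY d.{dim}"
    ),
}

# Only catalog-approved identifiers may bind into each parameter slot.
ALLOWED_BINDINGS = {
    "metric_by_dimension": {
        "dim": {"region"},
        "metric": {"revenue_net"},
        "fact": {"finance.orders_fact"},
        "dim_table": {"org.dim_region"},
        "key": {"region_id"},
    },
}


def compile_intent(template_id: str, bindings: dict[str, str]) -> str:
    if template_id not in TEMPLATES:
        raise ValueError(f"intent does not resolve to a whitelisted template: {template_id}")
    allowed = ALLOWED_BINDINGS[template_id]
    for param, value in bindings.items():
        if value not in allowed.get(param, set()):
            raise ValueError(f"binding {param}={value!r} not in catalog whitelist")
    return TEMPLATES[template_id].format(**bindings)
```

An LLM can propose any `template_id` and bindings it likes; anything outside the whitelist raises before a single byte reaches the warehouse.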
Designed
Owned the semantic catalog schema, join safety rules, and intent/plan separation.
Implemented
Built the compiler pipeline, validation layer, and warehouse execution adapters.
Scrapped
End-to-end neural SQL—replaced with constrained generation over approved templates for reliability.