AI · the agentic data trust layer

Agents that propose.
Operators who decide.

An LLM acting on unverified data is a regulatory liability, not a productivity tool. We put autonomous copilots on top of a signed trust layer — every proposal grounded, every action audit-defensible.

Talk to us · See the trust layer

LLM proposes, engine enforces

Every proposal lands as a reviewable artifact — a rule, alias, draft runbook, narrative — never a direct mutation.

Privacy is first-class

PII is redacted before anything is sent to an LLM. BYO key. On-device option for high-security customers.
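A minimal sketch of what redaction before an LLM call could look like. The two patterns and the `redact` helper are illustrative assumptions, not the product's actual redaction rules; a real deployment would need a much broader, tested pattern set.

```python
import re

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Only the redacted string would ever be handed to the LLM client.
prompt = redact("Account owner jane.doe@example.com, SSN 123-45-6789, qty=1500")
```

Non-PII fields such as `qty=1500` pass through untouched, so the model still sees the values it needs to reason about.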

Trust-layer-first

Every claim points at lineage rows. Proposals without citations are bugs.
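One way to treat an uncited proposal as a bug is to reject it before it reaches an operator. The sketch below assumes hypothetical `Claim` and `Proposal` shapes and lineage row IDs as plain strings; none of these names come from the product.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    # Hypothetical lineage row IDs backing this claim.
    lineage_refs: list[str] = field(default_factory=list)


@dataclass
class Proposal:
    kind: str
    claims: list[Claim]


def uncited_claims(proposal: Proposal) -> list[Claim]:
    """Return every claim that carries no lineage reference."""
    return [c for c in proposal.claims if not c.lineage_refs]


def assert_grounded(proposal: Proposal) -> None:
    """Raise if any claim lacks a citation, so the proposal never ships."""
    missing = uncited_claims(proposal)
    if missing:
        raise ValueError(f"{len(missing)} claim(s) without lineage refs")
```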

The four copilots.

Each one is a thin layer over an existing surface in the product. None of them is a new domain — they put an LLM-driven assistant on top of what's already shipping.

Anomaly explainer

Phase 4 · weeks 7–9

Where it lives

Inbox finding detail panel

Pain it solves

An operator gets a finding (btc.csv, row 42, column qty outside the expected range) and has to walk lineage manually to work out what changed and why.

Trust contract

Routes through the existing override mechanism with the LLM's draft pre-populated. The deterministic accept is what enters the audit chain.

What it does

Reads the finding's subject (dataset, row, column, severity).
Pulls the field's last N observations across sources, via lineage trace.
Pulls the column's profile (typical range, distribution shape).
Surfaces recent similar findings via the similar-incident matcher.
Drafts a 2–3 sentence operator-language explanation with lineage refs on every numeric claim.

Signed proposal

Yahoo reported qty=12,500 for AAPL on 2026-05-21 — 8.3× the column's typical value. Alpaca and IBKR both reported qty=1,500 on the same day. Most likely a Yahoo decimal-place error (similar issue resolved 2026-04-12, incident #218). Recommended: ignore Yahoo for this row, golden value stays at 1,500.
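The arithmetic behind a proposal like this can be sketched in a few lines. The typical value of 1,500, the `max_ratio` threshold, and the use of a median as the golden value are assumptions made for illustration; the source only states the reported quantities and the 8.3× ratio.

```python
from statistics import median

# Per-source observations from the example proposal.
observations = {"yahoo": 12_500, "alpaca": 1_500, "ibkr": 1_500}
typical = 1_500  # assumed value taken from the column profile


def outliers(obs: dict[str, int], typical: float, max_ratio: float = 3.0) -> dict[str, float]:
    """Map each source whose value exceeds max_ratio x typical to its ratio."""
    return {
        src: round(value / typical, 1)
        for src, value in obs.items()
        if value / typical > max_ratio
    }


flagged = outliers(observations, typical)  # sources to call out in the draft
golden = median(observations.values())     # assumed tie-breaker across sources
```

With these inputs only Yahoo is flagged, and the two agreeing sources keep the golden value at 1,500.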

Break investigator

Phase 5 · weeks 9–11

Where it lives

Reconciliation mismatch detail

Pain it solves

A position break is reported. Today: pull both source datasets, eyeball deltas, walk back through recon runs, hypothesise about timing or source drift, write up findings. Hours of manual archaeology.

Trust contract

The packet's signature covers the lineage hash chain at the moment of investigation.

What it does

Starts from a reconciliation mismatch row.
For each diverging field, identifies which source contributed which value.
Cross-references the source-reliability tier and recent uptime history.
Checks whether either source had a failed_refresh or schema-drift finding around the same window.
Outputs a signed evidence packet (PDF + JSON).

Signed proposal

Alpaca file arrived 2h late on 2026-05-20; the recon ran against stale cached bytes. Packet signed with the chain hash at investigation time.
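A signature that covers the lineage hash chain could be sketched as follows. This assumes SHA-256 chaining over JSON-serialised lineage rows and an HMAC as the signature; the real packet format and signing scheme (quite possibly asymmetric) are not specified in the source, and all function names are hypothetical.

```python
import hashlib
import hmac
import json


def chain_hash(lineage_rows: list[dict]) -> str:
    """Fold each row into a running SHA-256, so any change alters the result."""
    digest = b"\x00" * 32  # assumed genesis value
    for row in lineage_rows:
        payload = json.dumps(row, sort_keys=True).encode()
        digest = hashlib.sha256(digest + payload).digest()
    return digest.hex()


def _message(body: dict) -> bytes:
    return json.dumps(body, sort_keys=True).encode()


def sign_packet(findings: dict, lineage_rows: list[dict], key: bytes) -> dict:
    """Bind the findings to the chain hash at investigation time."""
    body = {"findings": findings, "chain_hash": chain_hash(lineage_rows)}
    signature = hmac.new(key, _message(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_packet(packet: dict, key: bytes) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in packet.items() if k != "signature"}
    expected = hmac.new(key, _message(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, packet["signature"])
```

Because the chain hash is inside the signed body, a later change to either the findings or the recorded hash makes verification fail.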

Corporate-actions copilot

Phase 6 · weeks 11–14

Where it lives

Control Room · Corporate Actions

Pain it solves

Corp action notifications (splits, dividends, mergers, spinoffs) require position adjustments across multiple broker accounts. Errors are silent until the next recon.

Trust contract

The notification ID is part of the approval's evidence chain. Auditors can trace every adjustment back to its source.

What it does

Ingests notifications (IBKR CA feed, DTCC, or uploaded PDFs via OCR).
For each affected security, pulls current positions per broker via lineage trace.
Drafts the adjustment per account.
Surfaces conflicts as blocking: divergent record dates, missing positions, tax-lot differences.
Queues the adjustment as an approval; once approved, the deterministic engine applies it.

Signed proposal

AAPL 4-for-1 split: IBKR 1,200 → 4,800, Alpaca 800 → 3,200. Conflict: divergent record dates between sources. Operator review required.
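The per-account draft and the blocking conflict in this example can be sketched as below. The `Position` shape, the record dates used in the usage note, and the single conflict check are assumptions for illustration; the real copilot also checks missing positions and tax-lot differences.

```python
from dataclasses import dataclass
from datetime import date
from fractions import Fraction


@dataclass
class Position:
    broker: str
    qty: int
    record_date: date  # record date reported by this broker's notification


def draft_split(positions: list[Position], ratio: Fraction) -> tuple[dict[str, int], list[str]]:
    """Draft post-split quantities per broker and collect blocking conflicts."""
    drafts = {p.broker: int(p.qty * ratio) for p in positions}
    conflicts = []
    if len({p.record_date for p in positions}) > 1:
        conflicts.append("divergent record dates")
    return drafts, conflicts
```

Called with IBKR at 1,200 and Alpaca at 800 shares, a ratio of `Fraction(4, 1)`, and two different (hypothetical) record dates, this yields drafts of 4,800 and 3,200 plus one blocking conflict, so nothing is applied until an operator reviews it.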

Northpoint Q

Phase 7 · weeks 14–16

Where it lives

Top of Control Room · persistent

Pain it solves

Operators have questions they can't easily answer through point-and-click: which datasets had the most finding volume this week? Are any sources trending down in reliability?

Trust contract

Every numeric answer has a 'show evidence' expand that reveals the lineage refs and the RAG engine's query.

What it does

Natural-language question bar at the top of the Control Room.
Powered by RAG over catalog + lineage + incidents + findings + runbook history.
Answer cards are short, specific, and cite signed provenance for every claim.
Inspired by Amazon QuickSight Q — oriented around data-ops, not BI.
Also exposed as MCP tool ops.triage_query — external LLM clients can call into it.

Signed proposal

Q: Which datasets had the most finding-volume this week? A: prices.alpaca (38), prices.yahoo (31), recon.daily (24). Show evidence → expands to the underlying lineage refs.
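An answer card like this reduces to a count per dataset with the supporting finding IDs kept alongside. The sketch below uses a small made-up list of findings and a hypothetical `top_datasets` helper; it does not reproduce the counts in the example above or the product's RAG pipeline.

```python
from collections import Counter

# Toy findings; IDs and volumes are invented for the sketch.
findings = [
    {"id": "f-1", "dataset": "prices.alpaca"},
    {"id": "f-2", "dataset": "prices.yahoo"},
    {"id": "f-3", "dataset": "prices.alpaca"},
    {"id": "f-4", "dataset": "recon.daily"},
    {"id": "f-5", "dataset": "prices.alpaca"},
    {"id": "f-6", "dataset": "prices.yahoo"},
]


def top_datasets(findings: list[dict], n: int = 3) -> list[dict]:
    """Rank datasets by finding count and attach the finding IDs as evidence."""
    counts = Counter(f["dataset"] for f in findings)
    evidence: dict[str, list[str]] = {}
    for f in findings:
        evidence.setdefault(f["dataset"], []).append(f["id"])
    return [
        {"dataset": name, "count": count, "evidence": evidence[name]}
        for name, count in counts.most_common(n)
    ]
```

The `evidence` list is what a 'show evidence' expand would reveal: every number in the answer can be traced to the rows that produced it.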

AI-proposed anomaly criteria · Phase 3

The LLM doesn’t invent rules.
It picks from the catalogue.

For every (dataset, column) pair, the system asks: given this column's name, type, distribution, sample values, and the dataset's known purpose — what anomaly criteria would catch real problems without false positives?

The LLM returns a structured proposal: zero or more rules from the catalogue (foreign_key, sums_to, value_in_set, non_null_streak, monotone, range). The operator confirms with one click, and the rule becomes a real data_quality_rules row.
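Keeping the LLM inside the catalogue can be enforced at parse time. The sketch below assumes the proposal arrives as JSON with a `rules` array of objects carrying a `kind` field; that wire format and the `parse_proposal` name are assumptions, while the six rule kinds are the ones listed above.

```python
import json

# Rule kinds named in the catalogue.
CATALOGUE = {"foreign_key", "sums_to", "value_in_set", "non_null_streak", "monotone", "range"}


def parse_proposal(raw: str) -> list[dict]:
    """Parse the LLM's JSON output and reject any rule kind outside the catalogue."""
    rules = json.loads(raw)["rules"]
    unknown = [r["kind"] for r in rules if r["kind"] not in CATALOGUE]
    if unknown:
        raise ValueError(f"rule kinds outside the catalogue: {unknown}")
    return rules
```

An empty `rules` array is valid, which matches "zero or more rules": the model is allowed to propose nothing.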

LLM proposes

Operator-explainable rule from the catalogue.

Approvals queue

Lands as a new approval kind, with reasoning and sample distribution.

Operator confirms

One click → real data_quality_rules row.

Engine enforces

Deterministic integrity engine runs it on the next tick.

Picks feed back

Operator's accept/reject history shapes the next round.
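The propose, confirm, enforce loop above can be sketched end to end for a single `range` rule. The `Approval` shape, the in-memory list standing in for the data_quality_rules table, and the `min`/`max` parameter names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Approval:
    dataset: str
    column: str
    kind: str
    params: dict
    status: str = "pending"


# In-memory stand-in for the data_quality_rules table.
data_quality_rules: list[dict] = []


def confirm(approval: Approval) -> dict:
    """Operator accept: only now does the proposal become an enforceable rule."""
    approval.status = "accepted"
    row = {"dataset": approval.dataset, "column": approval.column,
           "kind": approval.kind, "params": approval.params}
    data_quality_rules.append(row)
    return row


def run_tick(rows: list[dict]) -> list[tuple]:
    """Deterministic pass: evaluate confirmed range rules against the rows."""
    findings = []
    for rule in data_quality_rules:
        if rule["kind"] != "range":
            continue  # other catalogue kinds are omitted from this sketch
        low, high = rule["params"]["min"], rule["params"]["max"]
        for index, row in enumerate(rows):
            value = row[rule["column"]]
            if not low <= value <= high:
                findings.append((index, rule["column"], value))
    return findings
```

A pending approval has no effect on `run_tick`; the engine only ever sees rows that an operator has confirmed.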

Phased roadmap.

Each phase is 2–3 weeks of focused work. Phases overlap where they can.

Trust layer

Lineage · DQ engines · MCP v1 · Source reliability tiers

shipped

Knowledge layer base

Per-org LLM credentials · Per-column profiles · Embeddings + sqlite-vec · BYO key

weeks 1–3

MCP v2 + RAG answers

datasets.profile · semantic_search · lineage.dataset_overview · findings.search

weeks 3–5

AI-proposed anomaly criteria

LLM proposes deterministic rules from catalogue · approvals queue

weeks 5–7

Anomaly explainer

Operator-language explanation grounded in lineage + similar incidents

weeks 7–9

Break investigator

Lineage walk + reliability cross-ref · signed evidence packet

weeks 9–11

Corp actions copilot

Notification ingest · per-account drafts · blocking conflicts

weeks 11–14

Northpoint Q

Natural-language ops triage · RAG over catalog + lineage · cited answers

weeks 14–16

Agent runtime layer

Generalised agent abstraction · Agent Inbox · provenance chain hash per proposal

weeks 16+

What does NOT belong here

Things we deliberately don’t do.

We stay honest about scope. The roadmap is a long-running commitment, not a feature list.

Auto-generated workflow graphs from prose: ML here would obscure rather than enable.
Bring-your-own-fine-tuned-model: BYO key over public APIs is the right primitive.
End-to-end 'AI runs your ops' pitch: Never. Agents propose, humans accept.
LLM-authored rules outside the catalogue: We extend the catalogue deliberately. The LLM never invents.

Want this layered onto your stack?

Tell us about your sources and what you need to audit-defend. We’ll respond with the shortest path to the trust layer plus the first copilot.

Talk to us
