An LLM acting on unverified data is a regulatory liability, not a productivity tool. Northpoint puts autonomous copilots on top of a signed trust layer — every proposal grounded, every action audit-defensible.
Every proposal lands as a reviewable artifact — a rule, alias, draft runbook, narrative — never a direct mutation.
Privacy is first-class
PII redaction before any LLM send. BYO key (Anthropic / OpenAI / Bedrock / Vertex / Azure). On-device option for high-security customers.
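As a rough illustration of what redaction before an LLM send can look like, here is a minimal sketch. The pattern set, the placeholder format, and the `redact` helper are all assumptions made for this example, not Northpoint's actual pipeline; a production redactor would cover far more PII classes.

```python
import re

# Hypothetical patterns for illustration only; real coverage would be much broader.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each PII match with a typed placeholder before the text is sent anywhere."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


prompt = redact("Contact jane.doe@example.com, SSN 123-45-6789")
```

Typed placeholders (rather than blanking) keep the prompt readable to the model while the raw values never leave the process.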
Trust-layer-first
Every claim points at lineage rows. Proposals without citations are bugs. The same contract the MCP server already enforces.
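One way to treat "proposals without citations are bugs" as an enforceable check is sketched below. The `Claim` shape, the `lineage_refs` field name, and the ref format are hypothetical; the real contract enforced by the MCP server may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    # Hypothetical field: identifiers of the lineage rows backing this claim.
    lineage_refs: list[str] = field(default_factory=list)


def uncited_claims(claims: list[Claim]) -> list[Claim]:
    """Return every claim that carries no lineage reference."""
    return [c for c in claims if not c.lineage_refs]


def validate_proposal(claims: list[Claim]) -> None:
    """Reject the whole proposal if any single claim is uncited."""
    missing = uncited_claims(claims)
    if missing:
        raise ValueError(f"{len(missing)} claim(s) lack lineage refs: {[c.text for c in missing]}")
```

Failing the whole proposal, rather than dropping the uncited claim, keeps the missing citation visible as a defect instead of hiding it.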
The four copilots.
Each one is a thin layer over an existing surface in the product. None of them is a new domain — they put an LLM-driven assistant on top of what's already shipping.
Anomaly explainer
Phase 4 · weeks 7–9
Where it lives
Inbox finding detail panel
Pain it solves
Operator gets a finding (`btc.csv` row 42 column `qty` outside expected range) and has to manually walk lineage to figure out what changed and why.
Trust contract
The recommendation routes through the existing override mechanism with the LLM's draft pre-populated. Only the operator's deterministic accept enters the audit chain.
What it does
Reads the finding's subject (dataset, row, column, severity).
Pulls the field's last N observations across sources, via the existing lineage trace.
Pulls the column's profile (typical range, distribution shape).
Surfaces recent similar findings via the similar-incident matcher.
Drafts a 2–3 sentence operator-language explanation. Every numeric claim has a lineage ref next to it.
Signed proposal
Yahoo reported qty=12,500 for AAPL on 2026-05-21 — 8.3× the column's typical value. Alpaca and IBKR both reported qty=1,500 on the same day. Most likely a Yahoo decimal-place error (similar issue resolved 2026-04-12, incident #218). Recommended: ignore Yahoo for this row, golden value stays at 1,500.
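The 8.3× figure in the example can be reproduced with a short calculation. This sketch assumes the column's typical value equals the 1,500 reported by the other two sources, and it uses the median of the non-suspect sources as the baseline; the function name and the baseline choice are illustrative, not the product's actual profile logic.

```python
from statistics import median


def deviation_ratio(reports: dict[str, float], suspect: str) -> float:
    """Ratio of the suspect source's value to the median of the other sources."""
    others = [value for source, value in reports.items() if source != suspect]
    return reports[suspect] / median(others)


# Values taken from the example proposal above.
reports = {"yahoo": 12_500, "alpaca": 1_500, "ibkr": 1_500}
ratio = deviation_ratio(reports, "yahoo")
print(f"{ratio:.1f}x")
```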
Break investigator
Phase 5 · weeks 9–11
Where it lives
Reconciliation mismatch detail
Pain it solves
A position break is reported. Today: pull both source datasets, eyeball deltas, walk back through recon runs, hypothesise about timing or source drift, write up findings. Hours of manual archaeology.
Trust contract
The packet's signature covers the lineage hash chain at the moment of investigation.
What it does
Starts from a reconciliation mismatch row.
For each diverging field, identifies which source contributed which value.
Cross-references the source-reliability tier and recent uptime history.
Checks whether either source had a failed_refresh or schema-drift finding around the same window.
Outputs a signed evidence packet (PDF + JSON) — lineage refs, reliability trends, diagnostic narrative.
Signed proposal
Alpaca file arrived 2h late on 2026-05-20; the recon ran against stale cached bytes. Packet signed with the chain hash at investigation time — auditors replaying the packet later can verify nothing was tampered with.
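To make the replay-and-verify idea concrete, here is a minimal sketch of a packet whose signature covers a lineage hash chain. The signature scheme is not specified above, so this example substitutes HMAC-SHA256 over canonical JSON purely for illustration; the packet fields, the chain construction, and the key handling are all assumptions.

```python
import hashlib
import hmac
import json


def chain_hash(rows: list[str]) -> str:
    """Fold lineage rows into one digest; each step hashes the previous digest plus the row."""
    digest = b""
    for row in rows:
        digest = hashlib.sha256(digest + row.encode()).digest()
    return digest.hex()


def sign_packet(packet: dict, key: bytes) -> str:
    """Sign a canonical (sorted-key, compact) JSON encoding of the packet."""
    payload = json.dumps(packet, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_packet(packet: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_packet(packet, key), signature)


# Hypothetical packet; the chain hash is captured at investigation time.
key = b"demo-key-not-for-production"
packet = {"mismatch_id": "m-1", "chain_hash": chain_hash(["row-a", "row-b"])}
signature = sign_packet(packet, key)
```

Because the chain hash sits inside the signed payload, any later change to the packet or to the recorded chain state makes `verify_packet` return `False`. An auditor-facing design would more likely use an asymmetric signature so verifiers never hold the signing key.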
Corporate-actions copilot
Phase 6 · weeks 11–14
Where it lives
Control Room · Corporate Actions
Pain it solves
Corporate action notifications (splits, dividends, mergers, spinoffs) require position adjustments across multiple broker accounts. Today the ops desk reads the notification, looks up holdings per account, applies the adjustment, and hopes nothing conflicts. Errors stay silent until the next recon.
Trust contract
The notification ID is part of the approval's evidence chain. Auditors can trace every adjustment back to its source notification and the approving user.
What it does
Ingests notifications (IBKR CA feed, DTCC, or uploaded PDFs via the existing OCR pipeline).
For each affected security, pulls current positions per broker via lineage trace — signed values.
Drafts the adjustment per account.
Surfaces conflicts as blocking: divergent record dates, missing positions, tax-lot differences.
Queues the adjustment as an approval; once approved, the deterministic engine applies it.
Signed proposal
AAPL 4-for-1 split: IBKR 1,200 → 4,800, Alpaca 800 → 3,200. Conflict: divergent record dates between sources. Operator review required before approval can complete.
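The split example above can be sketched as a draft step that computes per-broker adjustments and flags blocking conflicts. The `Position` type, the `draft_split` function, and the specific record dates are hypothetical; the sketch also uses integer division, so it ignores fractional-share handling that a real engine would need.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Position:
    broker: str
    qty: int
    record_date: date  # record date as reported by this source


def draft_split(positions: list[Position], new: int, old: int) -> tuple[dict[str, int], list[str]]:
    """Draft post-split quantities for a new-for-old split and list blocking conflicts."""
    adjustments = {p.broker: p.qty * new // old for p in positions}
    conflicts = []
    if len({p.record_date for p in positions}) > 1:
        conflicts.append("divergent record dates between sources")
    return adjustments, conflicts


# Quantities from the example; the dates are made up to trigger the conflict.
positions = [
    Position("IBKR", 1_200, date(2026, 5, 20)),
    Position("Alpaca", 800, date(2026, 5, 21)),
]
adjustments, conflicts = draft_split(positions, new=4, old=1)
approvable = not conflicts  # a non-empty conflict list blocks approval
```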
Northpoint Q
Phase 7 · weeks 14–16
Where it lives
Top of Control Room · persistent across tabs
Pain it solves
Operators have questions they can't easily answer through point-and-click: Which datasets had the most finding-volume this week? Are any sources trending down in reliability? What's the longest-open incident in the inbox right now?
Trust contract
Every numeric answer has a 'show evidence' expand that reveals the lineage refs and the RAG engine's underlying query.
What it does
Natural-language question bar at the top of the Control Room.
Powered by RAG over catalog + lineage + incidents + findings + runbook history.
Answer cards are short, specific, and cite signed provenance for every claim.
Inspired by Amazon QuickSight Q — but oriented around data-ops, not BI.
Also exposed as an MCP tool: ops.triage_query(question) — external LLM clients can call into it.
Signed proposal
Q: Which datasets had the most finding-volume this week? A: prices.alpaca (38), prices.yahoo (31), recon.daily (24). Show evidence → expands to the underlying lineage refs and the RAG query that ran.
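An answer like the one above ultimately reduces to an aggregation whose inputs stay attached as evidence. The sketch below assumes findings arrive as `(finding_id, dataset)` pairs and that the finding IDs double as the evidence shown on expand; both are illustrative assumptions, and the real system would go through the RAG engine and lineage refs.

```python
from collections import defaultdict


def top_datasets(findings: list[tuple[str, str]], n: int = 3) -> list[dict]:
    """Rank datasets by finding count, keeping the contributing finding IDs as evidence."""
    by_dataset: dict[str, list[str]] = defaultdict(list)
    for finding_id, dataset in findings:
        by_dataset[dataset].append(finding_id)
    ranked = sorted(by_dataset.items(), key=lambda item: len(item[1]), reverse=True)
    return [{"dataset": d, "count": len(ids), "evidence": ids} for d, ids in ranked[:n]]


# Tiny hypothetical sample, not real finding data.
sample = [("f1", "prices.alpaca"), ("f2", "prices.alpaca"), ("f3", "prices.yahoo")]
cards = top_datasets(sample, n=2)
```

Carrying the evidence list in the same structure as the count means the number on the card can always be re-derived from what the expand shows.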
AI-proposed anomaly criteria · Phase 3
The LLM doesn’t invent rules. It picks from the catalogue.
For every (dataset, column) pair, the system asks: given this column's name, type, distribution, sample values, and the dataset's known purpose — what anomaly criteria would catch real problems without false positives?
The LLM returns a structured proposal: zero or more deterministic rules from the catalogue (foreign_key, sums_to, value_in_set, non_null_streak, monotone, range) with concrete thresholds. The operator confirms with one click, and the rule becomes a real data_quality_rules row.
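A simple guard can enforce that the model only picks from the catalogue. In this sketch the rule kinds come from the list above, but the proposal shape (a list of dicts with `kind` and `params` keys) and the `validate_rules` name are assumptions for illustration.

```python
# Rule kinds as listed in the catalogue above.
CATALOGUE = {"foreign_key", "sums_to", "value_in_set", "non_null_streak", "monotone", "range"}


def validate_rules(proposal: list[dict]) -> list[dict]:
    """Accept a proposal only if every rule kind exists in the catalogue."""
    unknown = [rule["kind"] for rule in proposal if rule["kind"] not in CATALOGUE]
    if unknown:
        raise ValueError(f"rules outside the catalogue: {unknown}")
    return proposal


# Hypothetical proposal for a quantity column; an empty list is also valid.
proposal = validate_rules([{"kind": "range", "params": {"min": 0, "max": 5_000}}])
```

Rejecting unknown kinds outright, instead of trying to interpret them, is what keeps the downstream engine fully deterministic.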
1. LLM proposes: operator-explainable rule from the catalogue.
2. Approvals queue: lands as a new approval kind, with reasoning and sample distribution.
3. Operator confirms: one click → real data_quality_rules row.
4. Engine enforces: deterministic integrity engine runs it on the next tick.
5. Picks feed back: operator's accept/reject history shapes the next round.
Phased roadmap.
Each phase is 2–3 weeks of focused work. Phases overlap where they can.