ENGRAM

Superhuman Data
for Finance

The reasoning layer for long-horizon finance agents.

Schedule a Conversation · Become an Expert
HOW IT WORKS

We capture tacit human reasoning for open-ended tasks, starting with finance.

1
Expert records a session
A senior analyst works through a real financial problem, reasoning aloud while their screen is captured.
Expert reviewing CIM cover page
00:23:41 CONCERN conf 0.94
"Wait — the EBITDA margin assumption doesn't hold if you back out the one-time items. Let me go back to the source filing..."
pause: 4.2s · revision_triggered: true
2
Engram decomposes the reasoning
Cognitive events are detected, timestamped, and cross-referenced against screen behavior.
Expert reviewing financial tables
00:24:03 INVESTIGATION conf 0.89
"OK so looking at the 10-K, note 14... yeah, there's a $12M restructuring charge they're adding back. That's aggressive."
pause: 0.8s · evidence_type: "regulatory_filing"
3
Your models learn how domain experts reason
Training-ready JSONL with temporal signals, screen context, and multi-scale reasoning chains.
Expert reviewing financial projections
00:25:18 CONCLUSION conf 0.97
"So the real margin is closer to 18%, not 24%. That changes the entry multiple meaningfully — I'd want to see 6x, not 8x."
judgment_revised: true · delta: -25% margin
1 reasoning arc · 3 events · 97s elapsed · confidence 0.94 → 0.97
THE PROBLEM

Building AI for open-ended domains requires higher-quality training data

01
Errors compound exponentially in multi-step reasoning
Across long tasks, small differences in data accuracy compound into large divergences in overall task success. Over a 100-step task, the gap between 99% and 99.9% per-step accuracy is the gap between a model that gets complex tasks right occasionally (36% success) and one that reliably does (90% success).
P(Success) = Accuracy ^ Steps

Figure: task performance vs. training compute (FLOPs). Scaling up compute improves retrieval, but reasoning hits a performance ceiling without better data.
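The arithmetic behind those success rates is easy to check; a quick sketch in Python, assuming a 100-step task:

```python
# Per-step accuracy compounds multiplicatively across a long task:
# P(success) = accuracy ** steps
steps = 100

for accuracy in (0.99, 0.999):
    p_success = accuracy ** steps
    print(f"{accuracy:.1%} per step over {steps} steps -> {p_success:.1%} task success")
```

At 100 steps, 0.99 per step yields roughly 0.37 overall, while 0.999 yields roughly 0.90.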
02
Scaling compute and data volume shows diminishing returns on reasoning tasks
Neural scaling laws (Kaplan et al., 2020) predict smooth performance gains from increasing parameters, data, and compute. Recent empirical results show these gains plateau on tasks requiring multi-step reasoning and domain-specific judgement. Performance on benchmarks like MATH and GPQA saturates well before power-law extrapolation would predict, suggesting the bottleneck has shifted from scale to the nature of the training signal itself.
03
Reinforcement learning cannot solve open-ended reasoning
In formally verifiable domains like code execution and mathematical proof, reinforcement learning from automated feedback can systematically improve model performance. Financial reasoning offers no such verifier: there is no ground truth for whether a valuation assumption is reasonable, a risk assessment is appropriately weighted, or a deal thesis will hold over a five-year horizon. These judgments depend on tacit expertise that cannot be reduced to a reward signal.
Gain from automated RL feedback, by domain: code (high), math (moderate), legal (limited), finance (minimal). Automated feedback requires verifiable outcomes; financial judgment has no programmatic reward signal. Human-in-the-loop RL helps, but plateaus.
Figure: the documentation waterline. Above the waterline, documented (~15%): final memos, completed models. Below it, never recorded (~85%): hypothesis formation, evidence weighing, revision under uncertainty, cross-referencing, backtracking, confidence calibration, pause-before-judgment. Curation operates above the waterline only.
04
Curation improves signal but cannot generate missing data
Data curation through filtering, deduplication, and quality scoring extracts more value from existing corpora. It operates exclusively on what has already been documented. Every available financial training corpus captures outcomes: the finished memo, the completed model, the final recommendation. The reasoning process that produced those outputs, including hypothesis formation, evidence weighing, and revision under uncertainty, was never recorded.
WHAT ENGRAM PROVIDES

Reasoning traces captured in real time, structured for model training.

Existing data providers ask experts to catalogue their reasoning retrospectively. The result is a reconstruction, shaped by hindsight, narrative instinct, and the impulse to sound coherent. Engram captures reasoning as it happens.

unit_event.json — single cognitive event
{
  "event_type": "CONCERN",
  "start_ts": 1421.3,
  "end_ts": 1438.7,
  "transcript": "Wait — the EBITDA margin assumption doesn't hold if you back out the one-time items...",
  "confidence": 0.94,
  "pause_delay": 4.2,
  "screen_state": "event_1421.300.jpg",
  "revision_triggered": true,
  "label": "EBITDA margin challenge"
}
event_type
Cognitive event — "CONCERN" — classified in real time based on the expert's behavioral cues.
pause_delay: 4.2s
A 4.2-second pause before speaking. In retrospective annotation, this signal (uncertainty forming before articulation) is invisible.
screen_state
The exact screen the expert was looking at when they spoke, cross-referenced against their voice narration to produce self-verifying data.
revision_triggered
This event caused the expert to revise a prior conclusion. Existing annotation platforms don't capture when and why judgement shifts.
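For illustration, a consumer of these records might deserialize each JSONL line into a typed event before training. The `UnitEvent` class below is hypothetical; the field names follow the sample above:

```python
import json
from dataclasses import dataclass

@dataclass
class UnitEvent:
    event_type: str           # e.g. "CONCERN", "INVESTIGATION", "CONCLUSION"
    start_ts: float           # seconds from session start
    end_ts: float
    transcript: str
    confidence: float         # classifier confidence, 0..1
    pause_delay: float        # silence before speech, in seconds
    screen_state: str         # captured frame the expert was viewing
    revision_triggered: bool  # did this event revise a prior conclusion?
    label: str

    @property
    def duration(self) -> float:
        return self.end_ts - self.start_ts

line = ('{"event_type": "CONCERN", "start_ts": 1421.3, "end_ts": 1438.7, '
        '"transcript": "Wait...", "confidence": 0.94, "pause_delay": 4.2, '
        '"screen_state": "event_1421.300.jpg", "revision_triggered": true, '
        '"label": "EBITDA margin challenge"}')
event = UnitEvent(**json.loads(line))
print(event.event_type, round(event.duration, 1))  # CONCERN 17.4
```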
THE TECHNOLOGY

Live expert reasoning and on-screen behavior are captured to generate fine-tuning training bundles

A recorded expert session is decomposed into structured cognitive events — timestamped, cross-referenced against screen behavior, and validated against behavioral signals. Two outputs: explainability dashboards for human review, and SFT bundles, including preference signals, for post-training.

Deterministic · Multimodal · Reproducible

1. Expert Session (local): voice and screen captured concurrently.
2. Event detection, frame extraction, temporal enrichment (local).
3. Cognitive Event Mapping (1 API call): structured event data with temporal signals.
4. Outputs (local): Human Dashboards (explainability and session analytics) and the SFT Training Bundle (multi-scale, multimodal JSONL).

One LLM call per session. Everything else is local and deterministic.
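The shape of that pipeline can be sketched as follows. Everything here is illustrative: the function names, record fields, and the toy `classify_events` stub are assumptions, not Engram's actual interfaces; only the structure (local deterministic stages around a single model call, fanning out to two outputs) follows the description above:

```python
import json

def nearest_frame(frames, ts):
    # Local, deterministic: pair each utterance with the closest screen capture.
    return min(frames, key=lambda f: abs(f["ts"] - ts))["path"]

def process_session(session, classify_events):
    """One recorded session in, two outputs out. `classify_events` stands in
    for the single LLM call per session; every other stage runs locally."""
    enriched = [
        {**seg, "screen_state": nearest_frame(session["frames"], seg["ts"])}
        for seg in session["segments"]
    ]
    events = classify_events(enriched)                 # the one API call
    dashboard = {"session_id": session["id"], "n_events": len(events)}
    bundle = "\n".join(json.dumps(e) for e in events)  # JSONL for SFT
    return dashboard, bundle

# Toy stand-in data and model call, for demonstration only.
demo_session = {
    "id": "s-001",
    "segments": [{"ts": 1421.3, "text": "Wait - the EBITDA margin..."}],
    "frames": [{"ts": 1420.0, "path": "event_1421.300.jpg"},
               {"ts": 1500.0, "path": "event_1500.000.jpg"}],
}
stub = lambda segs: [{"event_type": "CONCERN", **s} for s in segs]
dashboard, bundle = process_session(demo_session, stub)
```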

Codifying wisdom into reasoning trace packages provides three compounding advantages.

Higher training signal

Each session produces three interlocking training signals: event classification (per-event cognitive typing), reasoning chains (multi-step logic reconstruction), and session synthesis (full-session analysis). The deliberative structure — hesitation, revision, hypothesis rejection — is encoded directly in the trace.

Multimodal grounding

Every training example embeds the screenshot the expert was looking at when they spoke — base64-encoded inline via VISION format. The model learns situated reasoning: cognition grounded against the specific financial data on screen. Voice cross-references visual context, making the data self-verifying by design.
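As a sketch of what inline image grounding could look like (the exact VISION message schema is not shown here, so the field names below are assumptions):

```python
import base64
import json

def to_vision_example(transcript, image_bytes, media_type="image/jpeg"):
    # Inline the screenshot so each JSONL line is self-contained:
    # no external image files to ship alongside the training bundle.
    return json.dumps({
        "text": transcript,
        "image": {
            "encoding": "base64",
            "media_type": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    })

line = to_vision_example("Wait - the EBITDA margin...", b"\xff\xd8fake-jpeg-bytes")
restored = base64.b64decode(json.loads(line)["image"]["data"])
```

Round-tripping the bytes back out of the JSONL line confirms the embedding is lossless.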

Compounding domain schema

Each session refines the reasoning format itself — accumulating branching patterns, edge-case heuristics, and decision-point taxonomies that later sessions build upon. Session 200 is structurally richer than session 10. The corpus compounds in depth, not just volume.

HOW EXPERTS EARN

Cement your legacy into an intellectual endowment

Existing compensation architecture for AI training data decouples expert contribution from downstream value creation. In the classic gig work model, an expert's judgment compounds indefinitely inside the models it trains, while the expert captures none of that compounding value.

The heuristics refined across years of practice are an intellectual legacy. Engram exists to cement and preserve it.

By participating in Engram, the industry's most capable practitioners see their knowledge preserved in structured form, and returned to them with every model that learns from it.

Incumbent model
$150/hr
One-time extraction.
No residual claim.
Compensation is indexed to time, not to the epistemic value of the contribution.
Engram model
Royalty
Income compounds with demand.
Compensation is indexed to downstream usage: receive a royalty each time your reasoning is licensed. The income stream scales with demand, aligning financial incentives with the long-term value of your intellectual contribution.

For customers, pricing is straightforward. The royalty structure is how we compensate experts on our side and doesn't create downstream obligations for your team.

Interested in contributing?

Become an Expert
WHO IT'S FOR

The data layer powering AI in finance.

Built for any team whose competitive advantage depends on the quality of financial reasoning their models can produce.

01
Financial intelligence platforms
Your agents retrieve documents. Engram teaches them how to reason through what they find: navigating ambiguity, weighing conflicting signals, and building conviction across multi-step workflows. The training layer between your retrieval stack and your model quality ceiling.
02
PE firms and hedge funds
Your best investors' judgment compounds over decades, but your models train on documents, not on the reasoning that produced them. Engram captures how your senior deal team actually evaluates a CIM, stress-tests a thesis, or revises a model, and structures it as proprietary training data.
03
Strategy and advisory firms
When a senior partner leaves, their pattern recognition leaves with them. Engram captures that reasoning in structured form while it's still accessible, turning institutional expertise into a durable, trainable asset that compounds rather than depreciates.
TEAM

Our team and advisors come from

McKinsey & Company
Harvard University
JPMorgan Chase
MIT
Bain & Company
UC Berkeley
Cornell Tech
Morgan Stanley
Mastercard
Cosmos Institute
GET IN TOUCH

Interested in Engram?

We're working with a select group of design partners. Schedule a conversation to learn more.

Schedule a Conversation · Try a Demo
ENGRAM · New York, NY · 2026 · Privacy · Terms