Engram: The Reasoning Layer for Long-Horizon Finance Agents

HOW IT WORKS

We capture tacit human reasoning for open-ended tasks, starting with finance.

1

Expert records a session

A senior analyst works through a real financial problem, reasoning aloud while their screen is captured.

2

Engram decomposes the reasoning

Cognitive events are detected, timestamped, and cross-referenced against screen behavior.

3

Your models learn how domain experts reason

Training-ready JSONL with temporal signals, screen context, and multi-scale reasoning chains.

1

Expert records a session

A senior analyst works through a real financial problem, reasoning aloud while their screen is captured.

00:23:41 CONCERN conf 0.94

"Wait — the EBITDA margin assumption doesn't hold if you back out the one-time items. Let me go back to the source filing..."

pause: 4.2s · revision_triggered: true

2

Engram decomposes the reasoning

Cognitive events are detected, timestamped, and cross-referenced against screen behavior.

00:24:03 INVESTIGATION conf 0.89

"OK so looking at the 10-K, note 14... yeah, there's a $12M restructuring charge they're adding back. That's aggressive."

pause: 0.8s · evidence_type: "regulatory_filing"

3

Your models learn how domain experts reason

Training-ready JSONL with temporal signals, screen context, and multi-scale reasoning chains.

00:25:18 CONCLUSION conf 0.97

"So the real margin is closer to 18%, not 24%. That changes the entry multiple meaningfully — I'd want to see 6x, not 8x."

judgment_revised: true · delta: -25% margin

            1 reasoning arc · 3 events · 97s elapsed · confidence 0.94 → 0.97
          

THE PROBLEM

Building AI for open-ended domains requires higher-quality training data

01

Errors compound exponentially in multi-step reasoning

Across long tasks, small differences in data accuracy compound into large divergences in overall task success. The gap between 99% and 99.9% accuracy is the gap between a model that gets complex tasks right occasionally (36% success) and one that reliably does (90% success).

P(Success) = Accuracy^Steps

99% per-step accuracy 99.9% per-step accuracy

02

Scaling compute and data volume shows diminishing returns on reasoning tasks

Neural scaling laws (Kaplan et al., 2020) predict smooth performance gains from increasing parameters, data, and compute. Recent empirical results show these gains plateau on tasks requiring multi-step reasoning and domain-specific judgement. Performance on benchmarks like MATH and GPQA saturates well before power-law extrapolation would predict, suggesting the bottleneck has shifted from scale to the nature of the training signal itself.

03

Reinforcement learning cannot solve open-ended reasoning

In formally verifiable domains like code execution and mathematical proof, reinforcement learning from automated feedback can systematically improve model performance. Financial reasoning is ill-defined: there is no ground truth for whether a valuation assumption is reasonable, a risk assessment is appropriately weighted, or a deal thesis will hold over a five-year horizon. These judgements depend on tacit expertise that cannot be reduced to a reward signal.

04

Curation improves signal but cannot generate missing data

Data curation through filtering, deduplication, and quality scoring extracts more value from existing corpora. It operates exclusively on what has already been documented. Every available financial training corpus captures outcomes: the finished memo, the completed model, the final recommendation. The reasoning process that produced those outputs, including hypothesis formation, evidence weighing, and revision under uncertainty, was never recorded.

WHAT ENGRAM PROVIDES

Reasoning traces captured in real time, structured for model training.

Existing data providers ask experts to catalogue their reasoning retrospectively. The result is a reconstruction, shaped by hindsight, narrative instinct, and the impulse to sound coherent. Engram captures reasoning as it happens.

unit_event.json — single cognitive event

{
  "event_type": "CONCERN",
  "start_ts": 1421.3,
  "end_ts": 1438.7,
  "transcript": "Wait — the EBITDA margin assumption doesn't hold if you back out the one-time items...",
  "confidence": 0.94,
  "pause_delay": 4.2,
  "screen_state": "event_1421.300.jpg",
  "revision_triggered": true,
  "label": "EBITDA margin challenge"
}

event_type

Cognitive event — "CONCERN" — classified in real time based on the expert's behavioral cues.

pause_delay: 4.2s

A 4-second pause before speaking. In retrospective annotation, this signal, uncertainty forming before articulation, is invisible.

screen_state

The exact screen the expert was looking at when they spoke, cross-referenced against their voice narration to produce self-verifying data.

revision_triggered

This event caused the expert to revise a prior conclusion. Existing annotation platforms don't capture when and why judgement shifts.

THE TECHNOLOGY

Live expert reasoning and on-screen behavior is captured to generate fine-tuning training bundles

A recorded expert session is decomposed into structured cognitive events — timestamped, cross-referenced against screen behavior, and validated against behavioral signals. Two outputs: explainability dashboards for human review, and SFT bundles, including preference signals, for post-training.

Deterministic Multimodal Reproducible

Local

Expert Session

Voice + screen captured concurrently

›

Event detection Frame extraction Temporal enrichment

1 API Call

Cognitive Event Mapping

Structured event data with temporal signals

Local

Human Dashboards

Explainability & session analytics

Local

SFT Training Bundle

Multi-scale, multimodal JSONL

One LLM call per session. Everything else is local and deterministic.

Codifying wisdom into reasoning trace packages provides three compounding advantages.

Higher training signal

Each session produces three interlocking training signals: event classification (per-event cognitive typing), reasoning chains (multi-step logic reconstruction), and session synthesis (full-session analysis). The deliberative structure — hesitation, revision, hypothesis rejection — is encoded directly in the trace.

Multimodal grounding

Every training example embeds the screenshot the expert was looking at when they spoke — base64-encoded inline via VISION format. The model learns situated reasoning: cognition grounded against the specific financial data on screen. Voice cross-references visual context, making the data self-verifying by design.

Compounding domain schema

Each session refines the reasoning format itself — accumulating branching patterns, edge-case heuristics, and decision-point taxonomies that later sessions build upon. Session 200 is structurally richer than session 10. The corpus compounds in depth, not just volume.

HOW EXPERTS EARN

Cement your legacy into an intellectual endowment

Existing compensation architecture for AI training data decouples expert contribution from downstream value creation. In the classic gig work model, an expert's judgment compounds indefinitely inside the models it trains, while the expert captures none of that compounding value.

The heuristics refined across years of practice are an intellectual legacy. We're aiming to cement and preserve this.

Through participating in Engram, the knowledge of the industry's most capable practitioners is preserved in structured form, and returned to them, with every model that learns from it.

Incumbent model

$150/hr

One-time extraction.
No residual claim.

Compensation is indexed to time, not to the epistemic value of the contribution.

Engram model

Royalty

Income compounds with demand.

Compensation is indexed to downstream usage: receive a royalty each time your reasoning is licensed. The income stream scales with demand, aligning financial incentives with the long-term value of your intellectual contribution.

For customers, pricing is straightforward. The royalty structure is how we compensate experts on our side and doesn't create downstream obligations for your team.

Interested in contributing?

Become an Expert

WHO IT'S FOR

The data layer powering AI in finance.

Built for any team whose competitive advantage depends on the quality of financial reasoning their models can produce.

01

Financial intelligence platforms

Your agents retrieve documents. Engram teaches them how to reason through what they find: navigating ambiguity, weighing conflicting signals, and building conviction across multi-step workflows. The training layer between your retrieval stack and your model quality ceiling.

02

PE firms and hedge funds

Your best investors' judgment compounds over decades, but your models train on documents, not on the reasoning that produced them. Engram captures how your senior deal team actually evaluates a CIM, stress-tests a thesis, or revises a model, and structures it as proprietary training data.

03

Strategy and advisory firms

When a senior partner leaves, their pattern recognition leaves with them. Engram captures that reasoning in structured form while it's still accessible, turning institutional expertise into a durable, trainable asset that compounds rather than depreciates.

TEAM

Our team and advisors come from