Live — The Proof Page | Manthan Intelligence

In production at a venture fund — running its autonomous analytical, research, and operational flows. This is that pod, anonymised. Every number below is dated and sourced; the misses are published alongside the wins.

113,541 Knowledge-graph entities as of 23 Jul · KG

62.8% Weighted backtest accuracy, published including the misses, as of 23 Jul · backtest

2,441 Calls scored against real outcomes as of 23 Jul · scorecards

49 Agents in production · 24.5:1 vs humans as of 23 Jul · registry

High-conviction call reliability by stage

How often the system's highest-conviction calls turned out right, broken down by the company's funding stage at the time of the call.

Seed: 38.5% (5/13)
Series A: 75% (21/28)
Series B: 60.5% (26/43)
Series C: 65.9% (29/44)
Series D: 58.3% (14/24)
Series E: 72.2% (13/18)
Series F: 72.7% (8/11)
Series G+: 50% (2/4)
Growth: 70.5% (31/44)
Pre-IPO: 92.9% (26/28)
Acquired: 34.6% (9/26)

Full 13-stage table — every stage, measured separately

Stage	Calls scored	High-conviction	Reliability	Mean accuracy	Funded up	Stalled/dead
Pre-Seed	8	—	—	81.2%	0%	38.1%
Seed	64	5/13	38.5%	58.2%	11.3%	28.9%
Series A	73	21/28	75%	60.5%	20.3%	23%
Series B	69	26/43	60.5%	55.4%	25.9%	15.6%
Series C	58	29/44	65.9%	58.4%	33.3%	11.4%
Series D	34	14/24	58.3%	50%	26%	6.8%
Series E	25	13/18	72.2%	66.6%	35.7%	11.9%
Series F	13	8/11	72.7%	75.8%	29.6%	11.1%
Series G+	6	2/4	50%	47.5%	23.1%	7.7%
Growth	77	31/44	70.5%	60.7%	26.2%	12.8%
Pre-IPO	38	26/28	92.9%	62.2%	68%	0%
Public	77	46/47	97.9%	57.7%	81%	5.1%
Acquired	201	9/26	34.6%	59.1%	4.2%	2.8%

Each company's stage comes from its real funding history at the time of the call — not a label we picked by hand.
Outcomes aren't always public right away: a fresh raise or a stalled company is easy to see, but a quiet acqui-hire or down-round can take a while to surface — so the late-stage numbers are a floor, not a ceiling.

Source: per_stage_accuracy.py over kg/calibration/scorecards · as of 23 Jul
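The Reliability column is hits divided by high-conviction calls. As a minimal sketch, the snippet below recomputes it from the counts in the table above. The dictionary and function names are illustrative assumptions and are not taken from per_stage_accuracy.py.

```python
# (hits, total) for high-conviction calls, copied from the stage table above.
HIGH_CONVICTION = {
    "Seed": (5, 13),
    "Series A": (21, 28),
    "Series B": (26, 43),
    "Series C": (29, 44),
    "Series D": (14, 24),
    "Series E": (13, 18),
    "Series F": (8, 11),
    "Series G+": (2, 4),
    "Growth": (31, 44),
    "Pre-IPO": (26, 28),
    "Public": (46, 47),
    "Acquired": (9, 26),
}


def reliability(hits: int, total: int) -> float:
    """Share of high-conviction calls that turned out right, in percent."""
    if total == 0:
        raise ValueError("no high-conviction calls for this stage")
    return round(100 * hits / total, 1)


if __name__ == "__main__":
    for stage, (hits, total) in HIGH_CONVICTION.items():
        print(f"{stage:<10} {hits:>2}/{total:<2} {reliability(hits, total):5.1f}%")
```

The denominators matter when reading the column: with only 4 high-conviction calls at Series G+, a single call moves the figure by 25 percentage points.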

The day's heartbeat

Each dot is a task that ran on schedule, with no human starting it, placed by the time of day it fired. The clock is a 24-hour face — the sweep hand is right now, and the threads running into the centre are work routed to the one human.


Legend: Firing now (±15min) · Fired today · Upcoming

Divisions: Data Engineering · Analytical Council · Finance & Capital · Engineering · Product · GTM · Founder's Associate · Consulting
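As a rough sketch of how a firing could be placed on a 24-hour face and sorted into the three legend states: the ±15 minute window comes from the legend, while the function names, the orientation (0 degrees at midnight, clockwise) and the assumption that both timestamps fall on the same day are illustrative, not taken from the page's own code.

```python
from datetime import datetime, timedelta, timezone

FIRING_WINDOW = timedelta(minutes=15)  # the "Firing now (±15min)" band from the legend


def dial_angle(t: datetime) -> float:
    """Degrees clockwise from midnight on a 24-hour face (assumed orientation)."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    return seconds / 86400 * 360


def bucket(fire_time: datetime, now: datetime) -> str:
    """Classify one scheduled firing relative to the sweep hand (same-day times assumed)."""
    if abs(fire_time - now) <= FIRING_WINDOW:
        return "firing now"
    return "fired today" if fire_time < now else "upcoming"


if __name__ == "__main__":
    now = datetime(2026, 7, 23, 6, 0, tzinfo=timezone.utc)
    task = now + timedelta(minutes=10)
    print(dial_angle(task), bucket(task, now))
```

On this mapping one hour spans 15 degrees, so 06:00 sits a quarter of the way round the dial.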

The system grades itself — daily, in public

Accuracy week by week, split by how confident the system was when it made each call. The takeaway: across the snapshots below, the highest-conviction calls stay above 90%, while overall weighted accuracy ranges from about 63% to 68%. Every figure is dated and published with its misses. The three middle columns (High-conviction, Lean-aligned, Declined-call acc.) each give the accuracy within that confidence tier.

66.5% Decline rate (target 60-70%) 4,295 / 6,460

126 Lessons banked in the learning graph as of 23 Jul

85.1% High-conviction call reliability as of 23 Jul

Week	Weighted acc.	High-conviction	Lean-aligned	Declined-call acc.	New cohort
Latest snapshot (11 Jun)	64.6%	93.4%	64.2%	64.2%	68.85%
Prior snapshot (7 Jun)	65.21%	94.7%	56.5%	81.8%	68.85%
31 May	62.98%	96.7%	62.2%	78.1%	64.22%
10 May	63.05%	96%	66%	72%	66.0%
23 Apr (baseline)	67.7%	93.1%	—	76.5%	—

Source: weekly-calibration-sweep (published backtest trend strip) · as of 11 Jun · Weekly snapshots — refreshed by the calibration sweep, not recomputed daily.
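The decline rate quoted above is a plain ratio checked against a target band. A small sketch of that arithmetic, using the published counts; the function names are assumptions for illustration and not those of reliability_metrics.py.

```python
def decline_rate(declined: int, total: int) -> float:
    """Declined calls as a percentage of all calls, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * declined / total, 1)


def within_target(rate: float, low: float = 60.0, high: float = 70.0) -> bool:
    """True when the rate sits inside the stated 60-70% target band."""
    return low <= rate <= high


if __name__ == "__main__":
    rate = decline_rate(4295, 6460)
    print(rate, within_target(rate))  # 66.5 True
```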

Wins and misses, side by side

Last 15 of 2,441 backtest calls on private companies. Each was made from publicly available information only and checked against what actually happened; none are deals the Fund is currently evaluating. The system was right 62.8% of the time across all 2,441. On the live page, outcomes show green when the call was right and red when it missed, with no cherry-picking.

Company	Call	Outcome
Unknown	Lean Aligned	Stalled
Unknown	Lean Decline	Wound down
Unknown	Lean Decline	Stalled
Unknown	Lean Aligned	Stalled
Unknown	Lean Decline	Stalled
Unknown	Lean Decline	Stalled
Unknown	Lean Aligned	Stalled
Unknown	Lean Aligned	Stalled
Unknown	Lean Aligned	Stalled
Unknown	Lean Decline	Stalled
Unknown	High Alignment	Raised up
Unknown	Lean Decline	Stalled
Unknown	Lean Aligned	Raised flat
Unknown	Lean Decline	Stalled
Unknown	Lean Decline	Stalled

What the system has learned from breaking

Recent entries from the failure log. Each is a real bug, missed call, or process regression — captured so the next iteration doesn't repeat it.

2026-07-15 · Classification quality · open

First full-corpus (13,108-entity) deterministic NFX/secondary sweep since priority-head saturation surfaces a…

The priority-head saturation finding (learning-018 followup, confirmed weekly since 28 Jun) was correctly read as 'no more easy positives at the top of the queue' but was NOT sufficient evidence that 'the deterministic method is safe to run at full-corpus scale.' The long tail…

2026-07-04 · Entity quality · fixed

8,073 companies carried un-normalised country/hq_country strings ("United States", "US", "United Kingdom",…

A validation gate (validate_entity()) that is correct in isolation does not retroactively clean a corpus unless something also runs it as a backfill sweep.

2026-06-13 · Entity quality · auto_remediated_aliases_open_followups

Sector-VALUE vocabulary drift: 222 alias sector values (Ecommerce, HealthTech, EdTech, CleanTech, PropTech,…

A validation gate only protects writes that route through it. Sector aliases that the live normalise_sector resolves were still present in 222 entities, proof that a bulk writer skipped the gate.

2026-06-08 · Classification quality · resolved

Root cause corrected: the 'Mental Health' contamination was the sector classifier assigning a sector to…

The 'Mental Health default' was a misdiagnosis; the contamination is the universal failure of classifying from prose-less text, and the durable fix is the SAME prose-density eligibility gate that fixed NFX (learning_020), now applied to sector.

2026-06-08 · Classification quality · resolved_by_gate_stop

The learning-018 topic-adjacency failure also lives on the secondary_sector path: the deterministic…

The 8 Jun kg-nfx-secondary-classify run dry-ran the deterministic classifier (positives-only) over the top-2,000 unclassified companies. It proposed 1 network_effect_type positive and 139 secondary_sector labels.

2026-06-07 · Data quality · open

QA-flag taxonomy drift: enrichment passes write data-quality concerns under >=6 non-canonical key names, but…

A self-healing correction loop is only as good as the agreement between what writers emit and what the scanner reads. When the two vocabularies drift apart, the loop reports CLEAN while real errors accumulate.

Showing 6 of 39 publicly-shareable entries · 126 total in the learning graph
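The 2026-07-04 and 2026-06-13 entries share one lesson: a gate only cleans what passes through it. Below is a minimal sketch of pairing a write-time gate with a backfill sweep. Everything in it is hypothetical: the alias map, the ISO-style target codes, the entity shape and the function names (including this stand-in normalise_country) are assumptions for illustration and do not reproduce the fund's validate_entity().

```python
# Hypothetical alias map; the real vocabulary and canonical codes are not given in the log.
COUNTRY_ALIASES = {
    "united states": "US",
    "us": "US",
    "united kingdom": "GB",
}


def normalise_country(value: str) -> str:
    """Return the canonical code for a known alias, else the stripped input."""
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


def write_entity(store: dict, entity_id: str, entity: dict) -> None:
    """Write-time gate: it protects only the writes that route through it."""
    country = normalise_country(entity.get("hq_country", ""))
    store[entity_id] = dict(entity, hq_country=country)


def backfill(store: dict) -> int:
    """Sweep rows written before the gate existed, or around it; return rows fixed."""
    fixed = 0
    for entity in store.values():
        current = entity.get("hq_country", "")
        canonical = normalise_country(current)
        if canonical != current:
            entity["hq_country"] = canonical
            fixed += 1
    return fixed


if __name__ == "__main__":
    # Rows inserted by a bulk writer that skipped write_entity().
    store = {"a": {"hq_country": "United States"}, "b": {"hq_country": "US"}}
    print(backfill(store), store)
```

Running the sweep a second time should report zero fixes, which makes it a cheap regression check for writers that bypass the gate.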

The Prediction Ledger

Before a company event happens, the system writes down what it expects, time-stamps it, and locks it with a tamper-evident fingerprint (a SHA-256 hash) published here up front. The full prediction stays hidden until the outcome is known, then it is revealed, hit or miss alike. Nothing is ever added after the fact, and misses are published when their window closes.

Next batch seals this week — its hashes appear here that day.

25 Sealed & open

0 Resolved

n/a Verified hit rate

n/a Median lead time

n/a Cohort base rate

Sealed	Hash (SHA-256)	Status	Resolution
2026-07-20	`e6649dad8d15…`	SEALED	6-month window · open to 2027-01-20
2026-07-20	`e00d0e1ff812…`	SEALED	6-month window · open to 2027-01-20
2026-07-20	`7f9cdea873bb…`	SEALED	6-month window · open to 2027-01-20
2026-07-20	`4358c2a4bcf4…`	SEALED	6-month window · open to 2027-01-20
2026-07-13	`b1b177f9955f…`	SEALED	6-month window · open to 2027-01-13
2026-07-13	`a4ee2fd13f86…`	SEALED	6-month window · open to 2027-01-13
2026-07-13	`659d96b92434…`	SEALED	6-month window · open to 2027-01-13
2026-07-13	`0d0370f9b8db…`	SEALED	6-month window · open to 2027-01-13
2026-07-06	`85d1a3142376…`	SEALED	6-month window · open to 2027-01-06
2026-07-06	`82e9809dd912…`	SEALED	6-month window · open to 2027-01-06
2026-07-06	`4444e7b5e9ba…`	SEALED	6-month window · open to 2027-01-06
2026-07-06	`2099555161b6…`	SEALED	6-month window · open to 2027-01-06
2026-06-29	`b160d0fa47a9…`	SEALED	6-month window · open to 2026-12-29
2026-06-29	`69dfe2f5c6df…`	SEALED	6-month window · open to 2026-12-29
2026-06-29	`358c1eba3149…`	SEALED	6-month window · open to 2026-12-29
2026-06-29	`0a2076d5f5b3…`	SEALED	6-month window · open to 2026-12-29
2026-06-22	`a8c732668ca0…`	SEALED	6-month window · open to 2026-12-22
2026-06-22	`5db0c36e2ed6…`	SEALED	6-month window · open to 2026-12-22
2026-06-22	`42a5b0c8fae8…`	SEALED	6-month window · open to 2026-12-22
2026-06-22	`2a7b59113bf0…`	SEALED	6-month window · open to 2026-12-22
2026-06-15	`ec617cb582f7…`	SEALED	6-month window · open to 2026-12-15
2026-06-15	`bfdb86dc526c…`	SEALED	6-month window · open to 2026-12-15
2026-06-15	`6c04088debf1…`	SEALED	6-month window · open to 2026-12-15
2026-06-15	`3a524a408f9f…`	SEALED	6-month window · open to 2026-12-15
2026-06-15	`05a3c0fe5bdb…`	SEALED	6-month window · open to 2026-12-15

Verify independently: the raw feed lives at /live/predictions.json. Each hash is SHA-256 over the prediction's canonical JSON (sorted keys, compact separators, fields beginning with "_" excluded). The git commit and Cloudflare deploy history of that file are the proof-of-date — not our word.
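A minimal sketch of that check, following the recipe as stated (sorted keys, compact separators, fields beginning with "_" excluded). Several details are assumptions because the page does not spell them out: UTF-8 encoding of the canonical string, Python's default ASCII escaping in json.dumps, and applying the "_" exclusion to top-level fields only. The field names in the example record are hypothetical.

```python
import hashlib
import json


def canonical_json(prediction: dict) -> str:
    """Sorted keys, compact separators, top-level fields starting with "_" dropped."""
    public = {k: v for k, v in prediction.items() if not k.startswith("_")}
    return json.dumps(public, sort_keys=True, separators=(",", ":"))


def fingerprint(prediction: dict) -> str:
    """Hex SHA-256 digest of the canonical form (UTF-8 encoding assumed)."""
    return hashlib.sha256(canonical_json(prediction).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical record; the real schema lives in /live/predictions.json.
    record = {"window_months": 6, "call": "example", "_revealed": False}
    print(canonical_json(record))
    print(fingerprint(record))
```

To verify a published entry, compare the start of the computed digest with the truncated hash in the table once the full prediction is revealed. If the digests differ, check the encoding and exclusion assumptions above before concluding the record changed.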

The memory compounds

The knowledge graph is the moat — it gets denser every night. These are live counts, not a brochure.

14,547 Companies tracked

79,628 Relationships modelled

498 Deal postmortems

1,212 Analyses run

326 Reusable insights

Source: kg_stats.json · as of 23 Jul

Our readers include other AIs

Humans land in analytics. Agents leave a different trace — they fetch the machine-readable surfaces (/llms.txt, /agent.json). We count those at the edge, no PII. Last 30 days.

1,582 High-confidence agent reads (agent-only surfaces)

6,393 Classified agent reads (assistants + crawlers)

By vendor

OpenAI 1,349 · Microsoft 858 · Anthropic 633 · Google 603 · Perplexity 164 · Other 2,737

Source: edge agent-traffic counters (counts only, no PII) · as of 23 Jul · human visits live in GA4.
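A hedged sketch of counts-only agent classification at the edge. The two agent-only paths come from the text above. The user-agent substrings are publicly used crawler tokens, but the mapping, the "Other" rule and the function names are assumptions for illustration, not the fund's actual AGENT_COUNTERS logic.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

AGENT_ONLY_PATHS = {"/llms.txt", "/agent.json"}  # surfaces named in the text above

# Illustrative user-agent substrings mapped to vendors (assumed, not exhaustive).
VENDOR_TOKENS = {
    "gptbot": "OpenAI",
    "chatgpt-user": "OpenAI",
    "bingbot": "Microsoft",
    "claudebot": "Anthropic",
    "googlebot": "Google",
    "perplexitybot": "Perplexity",
}


def classify(user_agent: str) -> Optional[str]:
    """Return the vendor for a recognised agent user-agent, else None."""
    ua = user_agent.lower()
    for token, vendor in VENDOR_TOKENS.items():
        if token in ua:
            return vendor
    return None


def count_reads(requests: Iterable[Tuple[str, str]]) -> Tuple[Counter, int]:
    """Tally (path, user_agent) pairs; only counts are kept, no identifiers."""
    by_vendor: Counter = Counter()
    agent_only = 0
    for path, user_agent in requests:
        vendor = classify(user_agent)
        on_agent_surface = path in AGENT_ONLY_PATHS
        if vendor is None and not on_agent_surface:
            continue  # treated as a human visit and left to web analytics
        by_vendor[vendor or "Other"] += 1
        if on_agent_surface:
            agent_only += 1
    return by_vendor, agent_only
```

Unrecognised clients that fetch an agent-only surface are counted under "Other" in this sketch, on the assumption that humans rarely request those paths directly.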

What you're looking at

This is not a mock-up. The clock above is your local browser's clock mapped onto a 24-hour face. Each dot is a scheduled task running inside an actual venture fund. As you read this, an agent on that ring has just fired, or is about to.

The fund is two humans (partner + analyst) and 49 agents across eight divisions — research, deal flow, analytical deliberation, financial modelling, engineering, product, GTM, and consulting — each a small panel of specialised agents that disagree, deliberate, and produce structured outputs. Threads of work pass between them; the output lands on the partner's desk; the next loop starts.

The accuracy figure is the system grading its own analytical history: 2,441 calls scored against real outcomes under strict blind methodology (no outcome data visible when each call was made). It is published with its misses and refreshed in weekly snapshots. The learning entries are the failures the system chose to remember.

24 hours, compressed. Same fund. Same day. Open full Day-Loop page →

Where every number comes from

Section	Source	Refresh
Headline strip	`kg_stats.json + scorecard_feed.json`	daily build
Stage composability	`per_stage_accuracy.py`	daily build (weekly recompute)
Heartbeat	`firings_source.json → firings.json`	weekly (live-feed-refresh)
Reliability + decline	`reliability_metrics.py (SR-33)`	daily build
Accuracy trend	`weekly-calibration-sweep`	weekly
Memory growth	`kg_stats.json`	daily build
Agents reading	`edge AGENT_COUNTERS KV + GA4`	daily build

Feed produced by build_health_feed.py · KG counts refreshed each build · last build 2026-07-23 06:06:42 UTC