Multi-Agentic Research Platform

Answers research questions only with claims it can verify — every stage traced, every citation grounded.

project overview

Multi-Agentic Research Platform is an evidence-grounded AI research system built around a five-stage agent pipeline: Planner, Retriever, Writer, Critic, and Verifier. The platform decomposes research queries into retrieval steps, performs vector search using PostgreSQL + pgvector, generates grounded answers from retrieved evidence, verifies claims against sources, and returns full execution traces with per-stage metrics. The stack is FastAPI, PostgreSQL + pgvector, Gemini embeddings, TypeScript, and Docker, with real-time streaming over SSE.

architecture

The Planner turns the question into a structured retrieval plan made of typed PlanStep objects, each pairing a sub-question with a search query. The Retriever runs cosine-similarity search against PostgreSQL with pgvector, using embeddings generated through Gemini's embedContent API, and returns ranked chunks with source metadata and similarity scores. The Writer drafts from evidence, the Critic challenges the draft, and the Verifier checks claims before release. The Critic→Writer loop repeats until confidence clears the threshold or the loop reaches its hard iteration cap. Every agent emits typed trace events.
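
The Retriever's ranking step can be illustrated in plain Python. This is a minimal sketch of cosine-similarity ranking, not the platform's code: in the real system the comparison runs inside PostgreSQL through pgvector, and the Chunk type, its field names, and rank_chunks are hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stored chunk: text, source metadata, and its embedding."""
    text: str
    source: str
    embedding: tuple[float, ...]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_embedding, chunks, top_k=3):
    """Return up to top_k (score, chunk) pairs, highest similarity first.

    An empty store yields an empty list rather than an error, so callers
    can treat absence of evidence as an ordinary state.
    """
    scored = [(cosine_similarity(query_embedding, c.embedding), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Returning the score alongside each chunk keeps the similarity value available to later stages and to the trace.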

constraints

evidence grounding — no claim ships without a retrieval trail behind it
bounded iteration — the critique loop must converge or stop at a hard cap, never spin
LLM output fragility — structured JSON from a model cannot be assumed valid
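
The bounded-iteration constraint can be sketched as a loop with two exits: confidence reaches the threshold, or the cap is hit. This is an illustrative sketch only. The cap of 3, the threshold of 0.8, and the critic/writer call signatures are assumptions, not values or interfaces taken from the platform.

```python
MAX_ITERATIONS = 3          # assumed hard cap, not the platform's actual value
CONFIDENCE_THRESHOLD = 0.8  # assumed confidence bar, not the platform's actual value


def critique_loop(draft, critic, writer,
                  max_iterations=MAX_ITERATIONS,
                  threshold=CONFIDENCE_THRESHOLD):
    """Revise a draft until the critic is confident or the cap is reached.

    critic(draft) -> (confidence, feedback)   # hypothetical signature
    writer(draft, feedback) -> revised draft  # hypothetical signature

    Returns (final_draft, final_confidence, revisions_made).
    """
    revisions = 0
    confidence, feedback = critic(draft)
    while confidence < threshold and revisions < max_iterations:
        draft = writer(draft, feedback)
        revisions += 1
        confidence, feedback = critic(draft)
    return draft, confidence, revisions
```

Because the revision counter is checked on every pass, a critic that never becomes confident still terminates after max_iterations revisions.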

tradeoffs

five single-responsibility agents over one omnibus prompt: more inference calls per question, but each stage emits typed traces and can be replaced without retraining the others
loop-until-confident over single-pass answers: response latency deliberately spent on claim-level verification, bounded by a hard iteration cap so the spend cannot run away
pgvector inside Postgres over a managed vector service: one database, one operational surface, one failure domain to observe

failure notes

the Planner's JSON parsing can fail on malformed LLM output — it degrades to treating the raw output as a single search query rather than aborting the run
the Retriever returns an empty list gracefully when the vector store has nothing — downstream stages handle absence of evidence as a first-class state
the Retriever currently executes only the first PlanStep of a multi-step plan — a known limit, preserved in the trace rather than papered over
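
The Planner's degradation path might look like the following. This is a minimal sketch under stated assumptions: the PlanStep field names (sub_question, search_query) and the expectation that the model returns a JSON list of objects are guesses made for illustration, not the platform's actual schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    # Field names are assumptions; the source only says each step holds
    # a sub-question plus a search query.
    sub_question: str
    search_query: str


def parse_plan(raw_output: str) -> list[PlanStep]:
    """Parse model output into PlanSteps, degrading instead of aborting.

    If the output is not valid JSON, or is valid JSON in an unexpected
    shape, the raw text is used as a single search query.
    """
    try:
        data = json.loads(raw_output)
        if not isinstance(data, list):
            raise ValueError("expected a JSON list of plan steps")
        return [
            PlanStep(
                sub_question=str(item["sub_question"]),
                search_query=str(item["search_query"]),
            )
            for item in data
        ]
    except (ValueError, KeyError, TypeError):
        # json.JSONDecodeError is a subclass of ValueError, so malformed
        # JSON and wrong-shaped JSON both land here.
        fallback = raw_output.strip()
        return [PlanStep(sub_question=fallback, search_query=fallback)]
```

The fallback always yields exactly one step, so the rest of the run proceeds with a degraded plan rather than none.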

infrastructure

python · postgres + pgvector · gemini embeddings · typescript · docker

engineering reasoning

The platform is designed around single-responsibility agents with typed contracts and execution traces so each stage can be debugged, replaced, or evaluated independently. It prioritizes evidence grounding, bounded iteration, and observable pipeline state over single-pass generation.
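
One way to picture typed trace events with per-stage metrics is a small recorder that wraps each stage. This is a hypothetical sketch: TraceEvent, Trace, and their fields are illustrative names, and the platform's real event schema may differ.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class TraceEvent:
    """Hypothetical trace record for one pipeline stage."""
    stage: str
    status: str          # "ok" or "error"
    duration_ms: float
    details: dict = field(default_factory=dict)


class Trace:
    """Collects one TraceEvent per stage, including stages that fail."""

    def __init__(self):
        self.events: list[TraceEvent] = []

    @contextmanager
    def stage(self, name):
        details = {}
        status = "ok"
        start = time.perf_counter()
        try:
            # The stage body can attach its own metrics to this dict.
            yield details
        except Exception as exc:
            status = "error"
            details["error"] = repr(exc)
            raise
        finally:
            duration_ms = (time.perf_counter() - start) * 1000
            self.events.append(TraceEvent(name, status, duration_ms, details))
```

A stage would then run as `with trace.stage("retriever") as d: d["chunks"] = len(results)`, and a failing stage still leaves an event behind before the exception propagates.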

future work

> execute the full retrieval plan, not just its first step
> confidence calibration against held-out questions
