Agent Wikis

home / why-wikis

Why wikis?

Today an agent answers from two places: what it memorized in training, and what it finds on the live web. On niche, fast-moving domains, both fail — parametric memory knows nothing, and web search hallucinates half the time. We measured it.

89%
vs 48% web search
correct answers with a compiled wiki, on a niche fast-moving domain
7%
vs 48% web search
hallucination rate — unsupported claims per answer
~3×
$0.0016 vs $0.0054 / query
cheaper than web search, and ~3× faster (~1.2s vs ~3.6s)
100%
vs 25% web search
on change-aware questions ("what changed in vX?") — freshness is a guaranteed win

Eval: 27 tasks on Hermes Agent (a ~2-month-old project releasing every 3–5 days), same model and token budget per condition, blind LLM judge, replicated across two model families.

The experiment

We gave the same model the same 27 questions about Hermes Agent under six conditions — from no context at all, to live web search, to retrieval over a compiled wiki. The judge never knows which condition produced an answer.

Condition% correctHallucination
Parametric (no context)4%85%
Web search (the status quo)48%48%
RAG over raw sources63%26%
RAG over the compiled wiki89%7%
Whole curated pages (static)81%22%
Agent navigates the wiki70%30%

Compiling is what does the work

The decisive comparison is row 3 vs row 4: same model, same retrieval, same token budget — the only difference is whether the agent retrieves raw source chunks or pages we compiled and curated. Accuracy jumps from 63% to 89%, and hallucinations fall from 26% to 7%.

That's the product. Not retrieval — maintenance. RAG searches your knowledge. Agent Wikis maintains it.

The wiki is a floor; the web is a coin flip

Wiki-backed conditions hold 75–89% across every domain and model we tested. Web search swings from 89% on well-documented topics down to 48% on niche ones — and re-running the same questions on a different day moves web results by ±10–14 points. Live search is not reproducible; a compiled wiki is deterministic.

On a niche domain, web search scored 0% on synthesis and decision questions — the kind agents actually get asked — while the wiki held 100%.

Honest bounds

We publish the losses too. On mature, exhaustively documented domains, web search wins simple lookups. And every wiki declares its scope — what it covers, what it doesn't, and how fresh it is — so an agent knows when to fall back to the web. That routing ("wiki first, web on gaps") scored 93%, above either source alone.

Built to be read by agents

Every wiki is plain Markdown with an llms.txt index, frontmatter metadata (sources, confidence, freshness, version), and stable URLs. Humans get this site; agents get the same content with zero scraping. Pro wikis add a hosted retrieval endpoint over MCP — the exact configuration that scored 89% above.

Browse the wikis