home / why-wikis
Why wikis?
Today an agent answers from two places: what it memorized in training, and what it finds on the live web. On niche, fast-moving domains, both fail — parametric memory knows nothing, and web search hallucinates half the time. We measured it.
Eval: 27 tasks on Hermes Agent (a ~2-month-old project releasing every 3–5 days), same model and token budget per condition, blind LLM judge, replicated across two model families.
The experiment
We gave the same model the same 27 questions about Hermes Agent under six conditions — from no context at all, to live web search, to retrieval over a compiled wiki. The judge never knows which condition produced an answer.
| Condition | % correct | Hallucination |
|---|---|---|
| Parametric (no context) | 4% | 85% |
| Web search (the status quo) | 48% | 48% |
| RAG over raw sources | 63% | 26% |
| RAG over the compiled wiki | 89% | 7% |
| Whole curated pages (static) | 81% | 22% |
| Agent navigates the wiki | 70% | 30% |
Compiling is what does the work
The decisive comparison is row 3 vs row 4: same model, same retrieval, same token budget — the only difference is whether the agent retrieves raw source chunks or pages we compiled and curated. Accuracy jumps from 63% to 89%, and hallucinations fall from 26% to 7%.
That's the product. Not retrieval — maintenance. RAG searches your knowledge. Agent Wikis maintains it.
The wiki is a floor; the web is a coin flip
Wiki-backed conditions hold 75–89% across every domain and model we tested. Web search swings from 89% on well-documented topics down to 48% on niche ones — and re-running the same questions on a different day moves web results by ±10–14 points. Live search is not reproducible; a compiled wiki is deterministic.
On a niche domain, web search scored 0% on synthesis and decision questions — the kind agents actually get asked — while the wiki held 100%.
Honest bounds
We publish the losses too. On mature, exhaustively documented domains, web search wins simple lookups. And every wiki declares its scope — what it covers, what it doesn't, and how fresh it is — so an agent knows when to fall back to the web. That routing ("wiki first, web on gaps") scored 93%, above either source alone.
Built to be read by agents
Every wiki is plain Markdown with an llms.txt index, frontmatter metadata (sources, confidence, freshness, version), and stable URLs. Humans get this site; agents get the same content with zero scraping. Pro wikis add a hosted retrieval endpoint over MCP — the exact configuration that scored 89% above.
