llama.cpp Knowledge Base

updated: 2026-06-10
Covers: llama.cpp build and setup, inference, GGUF quantization, the server, grammars and function-calling, and troubleshooting.
Not covered: master changes after the date below and hardware-specific benchmarks; use web search for those.
Current as of: 2026-05-30 (master, ~2026-05)

🤖 Agent access: /wiki/llama-cpp/llms.txt /wiki/llama-cpp/llms-full.txt /wiki/llama-cpp/index.json

LLM-maintained research KB on llama.cpp, the C/C++ engine for running LLMs locally. Used as the research backbone for YouTube videos (tutorials, benchmarks, deep dives).

Structure

  • raw/: immutable source documents (doc mirrors, transcripts, discussion/PR dumps, benchmark logs)
  • wiki/: synthesized knowledge (summaries, concepts, entities, syntheses)

Schema and maintenance rules: see CLAUDE.md.

Usage

  • Add new sources: drop them into raw/ and ask the LLM to "ingest" them
  • Ask questions: the LLM reads the wiki to synthesize answers with links
  • Draft video modules: ask "draft module on <topic>" to produce slide + script outlines sourced from the wiki

Version tracking

llama.cpp ships rolling builds (b####) rather than semver releases. Each wiki page records the latest build tag it was verified against in its llama_build frontmatter field.

Latest verified llama.cpp build: (none yet; scaffold)
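
As a rough illustration of how the llama_build field could be read back from wiki pages, the sketch below parses a page's frontmatter and picks the highest build tag. It assumes YAML-style frontmatter delimited by `---` lines with flat `key: value` pairs. The helper names `read_llama_build` and `latest_build` and the tag `b1234` are hypothetical and do not come from the wiki's schema in CLAUDE.md.

```python
import re

# Assumption: rolling build tags are "b" followed by digits (the b#### form).
BUILD_TAG = re.compile(r"^b\d+$")


def read_llama_build(page_text):
    """Return the llama_build tag from a page's frontmatter, or None."""
    lines = page_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # page has no frontmatter block
    for line in lines[1:]:
        if line.strip() == "---":
            break  # reached the end of the frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "llama_build":
            tag = value.strip().strip("\"'")
            # Reject placeholders such as "none" that are not build tags.
            return tag if BUILD_TAG.match(tag) else None
    return None


def latest_build(tags):
    """Return the tag with the highest build number, or None if there is none."""
    valid = [t for t in tags if t and BUILD_TAG.match(t)]
    return max(valid, key=lambda t: int(t[1:]), default=None)


# Hypothetical page content; b1234 is a made-up tag for illustration only.
sample_page = "---\ntitle: GGUF quantization\nllama_build: b1234\n---\nBody text."
print(read_llama_build(sample_page))
```

Comparing tags by their integer part rather than as strings avoids ordering mistakes when tags have different digit counts.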

Based on the llm-wiki template / Karpathy's "LLM Wiki" pattern.