llama.cpp Knowledge Base
Covers: llama.cpp build and setup, inference, GGUF quantization, the server, grammars and function-calling, and troubleshooting.
Not covered: Master changes after the date below and hardware-specific benchmarks; use web search for those.
Current as of: 2026-05-30 (master, ~2026-05)
Agent access: /wiki/llama-cpp/llms.txt, /wiki/llama-cpp/llms-full.txt, /wiki/llama-cpp/index.json
LLM-maintained research KB on llama.cpp, the C/C++ engine for running LLMs locally. Used as the research backbone for YouTube videos (tutorials, benchmarks, deep dives).
Structure
- raw/: immutable source documents (doc mirrors, transcripts, discussion/PR dumps, benchmark logs)
- wiki/: synthesized knowledge (summaries, concepts, entities, syntheses)
Schema and maintenance rules: see CLAUDE.md.
Usage
- Add new sources: drop them into raw/ and ask the LLM to "ingest" them
- Ask questions: the LLM reads the wiki to synthesize answers with links
- Draft video modules: ask "draft module on " to produce slide + script outlines sourced from the wiki
Version tracking
llama.cpp ships rolling builds (b####) rather than semver releases. Each wiki page records the latest build tag it was verified against in its llama_build frontmatter field.
Latest verified llama.cpp build: (none yet; scaffold)
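As a rough illustration of the llama_build convention, the sketch below reads the field from a page's frontmatter. It assumes YAML-style frontmatter delimited by `---` lines at the top of the page and a tag of the form `b` followed by digits; the helper name `read_llama_build`, the sample page, and the tag `b1234` are hypothetical and not taken from this wiki.

```python
import re


def read_llama_build(page_text):
    """Return the llama_build value from a page's frontmatter, or None."""
    lines = page_text.splitlines()
    # Assumption: frontmatter is a block delimited by '---' lines at the top.
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; ignore anything in the page body
        match = re.match(r"^llama_build:\s*[\"']?(b\d+)[\"']?\s*$", line)
        if match:
            return match.group(1)
    return None


# Hypothetical page: the title and the tag b1234 are placeholders.
example_page = """---
title: Example page
llama_build: b1234
---
Body text.
"""

print(read_llama_build(example_page))
```

A page without frontmatter, or one whose frontmatter lacks the field, yields None, which a maintenance script could treat as "not yet verified".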
Based on the llm-wiki template / Karpathy's "LLM Wiki" pattern.
