# vLLM

> Wiki for vLLM: installation, the OpenAI-compatible server, configuration, quantization, distributed serving, and per-model realities.
>
> Covers: vLLM installation (CUDA/ROCm/CPU), offline LLM API and the OpenAI-compatible server, engine args and env vars, quantization methods, parallelism and scaling, CLI, metrics/ops, releases v0.19-v0.22, and tracker-sourced model notes (gpt-oss, Llama, Qwen).
>
> Not covered: Kubernetes/Docker deployment guides, engine internals/design docs, contributing, benchmarking deep-dives, and releases after the date below - use web search.
>
> Current as of: 2026-06-09 (v0.22.1)

- [LLM Wiki](/raw/vllm/README.md)
- [vLLM KB — Master Index](/raw/vllm/wiki/index.md)
- [CLI Reference — vllm {serve,chat,complete,bench,run-batch}](/raw/vllm/wiki/concepts/cli-reference.md)
- [Configuration — Engine Args, Env Vars, Memory](/raw/vllm/wiki/concepts/configuration.md)
- [Installation (GPU / CPU / Platforms)](/raw/vllm/wiki/concepts/install.md)
- [Integrations — Claude Code, Codex, LangChain, LlamaIndex](/raw/vllm/wiki/concepts/integrations-and-clients.md)
- [Models & Support (incl. Transformers Backend)](/raw/vllm/wiki/concepts/models-and-support.md)
- [Multimodal Inputs, LoRA & Prompt Embeddings](/raw/vllm/wiki/concepts/multimodal-and-lora.md)
- [Observability & Ops — Metrics, Reproducibility, Usage Stats](/raw/vllm/wiki/concepts/observability-and-ops.md)
- [OpenAI-Compatible Server](/raw/vllm/wiki/concepts/openai-compatible-server.md)
- [Parallelism & Scaling (TP / PP / DP / EP / CP)](/raw/vllm/wiki/concepts/parallelism-and-scaling.md)
- [Pooling Models — Embeddings, Classify, Score, Reward](/raw/vllm/wiki/concepts/pooling-models.md)
- [Quantization — Methods & When to Use Which](/raw/vllm/wiki/concepts/quantization.md)
- [Quickstart — Offline Inference & Online Serving](/raw/vllm/wiki/concepts/quickstart-and-serving.md)
- [Activity Log](/raw/vllm/wiki/log.md)
- [Release Digest — v0.19.0 → v0.22.1](/raw/vllm/wiki/summaries/release-digest.md)
- [Model Notes from the Tracker — gpt-oss, Llama, Qwen, Gemma & Friends](/raw/vllm/wiki/syntheses/model-notes-from-the-tracker.md)
- [Serving Decisions — Mode, Memory, Scale](/raw/vllm/wiki/syntheses/serving-decisions.md)
- [Troubleshooting Playbook](/raw/vllm/wiki/syntheses/troubleshooting-playbook.md)