---
title: "LoRA, QLoRA & Hyperparameters"
type: concept
tags: [lora, qlora, hyperparameters, rank, learning-rate]
updated: 2026-06-19
confidence: high
sources: [raw/llms_txt_doc-lora-fine-tuning-hyperparameters-guide.md, raw/llms_txt_doc-fine-tuning-llms-guide.md]
---

# LoRA, QLoRA & Hyperparameters

## LoRA and QLoRA

- **LoRA** (Low-Rank Adaptation) trains small adapter matrices injected into the model instead of updating all weights. This keeps the memory footprint small and training fast, and the resulting adapter is a small file (megabytes rather than gigabytes) that you can swap or share.
- **QLoRA** = LoRA on a **4-bit quantized** base model. It needs even less VRAM, which makes it the standard way to fine-tune big models on small GPUs, with Unsloth keeping accuracy high.
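
To make the "small adapter" claim concrete, here is a minimal sketch of the parameter arithmetic for a single linear layer. It assumes the usual LoRA factorization, where a frozen weight of shape `d_out × d_in` gets a trainable update `B @ A` with `A` of shape `r × d_in` and `B` of shape `d_out × r`. The layer size 4096 and the helper names are illustrative assumptions, not values from any particular model.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in one dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters in one LoRA adapter: A is (r, d_in), B is (d_out, r)."""
    return r * d_in + d_out * r


# Hypothetical layer size and rank, chosen only for illustration.
D_IN = D_OUT = 4096
RANK = 16

full = full_param_count(D_IN, D_OUT)
lora = lora_param_count(D_IN, D_OUT, RANK)
adapter_bytes = lora * 2  # assuming the adapter is stored in 16-bit precision

print(full)                    # 16777216
print(lora)                    # 131072
print(f"{lora / full:.2%}")    # 0.78%
print(adapter_bytes / 2**20)   # 0.25 (MiB for this one layer)
```

Summed over every targeted layer, the total stays far below the size of the base model, which is why adapters are cheap to train and store.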

## Key hyperparameters

- **LoRA rank (`r`)**: adapter capacity. Higher is more expressive but costs more memory and raises overfitting risk. Common values are 8–64, with 16 or 32 typical.
- **LoRA alpha**: scaling factor for the adapter update; a common heuristic is `alpha = r` or `alpha = 2×r`.
- **target_modules**: which projections get adapters (attention q/k/v/o + MLP gate/up/down); targeting all linear layers is the strong default.
- **Learning rate**: e.g. ~2e-4 for LoRA (higher than is typical for full fine-tuning); too high a rate destabilizes training.
- **Epochs**: usually 1–3; more risks overfitting on small datasets.
- **Batch size & gradient accumulation**: *effective batch size* = `batch_size × grad_accum × #GPUs`; raise `grad_accum` to simulate a big batch within VRAM limits. The effective batch size directly affects training stability.
- **Sequence length**: set it to match the length of your training examples; Unsloth enables long context efficiently.
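
The hyperparameters above can be collected in one place, as in the sketch below. `TrainConfig` is a hypothetical container written for this page, not an Unsloth or PEFT class. Its field names are illustrative, and the module names (`q_proj`, `gate_proj`, ...) assume a Llama-style architecture. The defaults follow the ranges listed above. The step count is an approximation, since real trainers may treat the last partial batch differently.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hypothetical bundle of the hyperparameters discussed above."""
    lora_rank: int = 16
    lora_alpha: int = 32              # the 2 x r heuristic
    learning_rate: float = 2e-4
    num_epochs: int = 2
    per_device_batch_size: int = 2
    grad_accum_steps: int = 8
    num_gpus: int = 1
    max_seq_length: int = 2048
    # Assumed Llama-style names for attention q/k/v/o and MLP gate/up/down.
    target_modules: tuple = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )

    @property
    def effective_batch_size(self) -> int:
        # batch_size x grad_accum x #GPUs
        return self.per_device_batch_size * self.grad_accum_steps * self.num_gpus

    def optimizer_steps(self, num_examples: int) -> int:
        """Approximate number of optimizer updates over all epochs."""
        steps_per_epoch = math.ceil(num_examples / self.effective_batch_size)
        return steps_per_epoch * self.num_epochs


cfg = TrainConfig()
print(cfg.effective_batch_size)   # 16
print(cfg.optimizer_steps(1000))  # 126

# Same effective batch with half the per-step memory: halve the batch, double accumulation.
low_vram = TrainConfig(per_device_batch_size=1, grad_accum_steps=16)
print(low_vram.effective_batch_size)  # 16
```

Trading `per_device_batch_size` for `grad_accum_steps` keeps the effective batch size fixed while lowering peak VRAM, at the cost of more forward/backward passes per update.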

## Practical guidance

- Start from an Unsloth [notebook](../summaries/model-catalog-and-notebooks.md)'s defaults; change rank/LR/epochs only as needed.
- Watch eval loss for overfitting; if it rises while train loss keeps falling, reduce epochs or rank, or add data.
- Try QLoRA first, since the 4-bit base fits larger models in the same VRAM; move to 16-bit LoRA or full fine-tuning if you have the VRAM and need maximum quality.
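
One way to turn the eval-loss check into something mechanical is sketched below. `first_divergence` is a hypothetical helper, the `patience` threshold is an assumption, and the loss values are made up for illustration. It flags the first evaluation at which eval loss begins a sustained rise while train loss is still falling.

```python
from typing import Optional, Sequence


def first_divergence(
    train_loss: Sequence[float],
    eval_loss: Sequence[float],
    patience: int = 2,
) -> Optional[int]:
    """Return the index where eval loss starts rising for `patience`
    consecutive evaluations while train loss keeps falling, else None."""
    if len(train_loss) != len(eval_loss):
        raise ValueError("train_loss and eval_loss must have the same length")
    streak = 0
    for i in range(1, len(eval_loss)):
        eval_up = eval_loss[i] > eval_loss[i - 1]
        train_down = train_loss[i] < train_loss[i - 1]
        streak = streak + 1 if (eval_up and train_down) else 0
        if streak >= patience:
            return i - patience + 1  # first evaluation of the rising streak
    return None


# Made-up loss curves, one value per evaluation.
train = [2.10, 1.60, 1.20, 0.90, 0.65, 0.45]
evals = [2.15, 1.70, 1.45, 1.40, 1.48, 1.60]

print(first_divergence(train, evals))  # 4
```

In this example the checkpoint just before index 4 has the lowest eval loss, so it would be the natural one to keep.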

Adapters can be **hot-swapped** at inference (multiple LoRAs on one base — see [docs-catalog](../summaries/docs-catalog.md)). After training, [save/export](saving-and-exporting.md).
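
Hot-swapping works because the base weights are never modified; only the small low-rank update changes. The toy sketch below illustrates the idea in pure Python. `LoraLinear` and its methods are hypothetical names for this page, not a real serving API, and it assumes the standard LoRA update scaled by `alpha / r`.

```python
def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoraLinear:
    """Toy linear layer with one frozen base weight and named LoRA adapters."""

    def __init__(self, weight):
        self.weight = weight      # frozen base weight, shape (d_out, d_in)
        self.adapters = {}        # name -> (A, B, scale)
        self.active = None        # None means "base model only"

    def add_adapter(self, name, A, B, alpha):
        rank = len(A)             # A has shape (r, d_in), B has shape (d_out, r)
        self.adapters[name] = (A, B, alpha / rank)

    def set_adapter(self, name):
        if name is not None and name not in self.adapters:
            raise KeyError(name)
        self.active = name

    def forward(self, x):
        y = matvec(self.weight, x)
        if self.active is None:
            return y
        A, B, scale = self.adapters[self.active]
        delta = matvec(B, matvec(A, x))   # low-rank update B @ (A @ x)
        return [yi + scale * di for yi, di in zip(y, delta)]


layer = LoraLinear([[1.0, 0.0], [0.0, 1.0]])
layer.add_adapter("task_a", A=[[1.0, 0.0]], B=[[1.0], [0.0]], alpha=2)
layer.add_adapter("task_b", A=[[0.0, 1.0]], B=[[0.0], [1.0]], alpha=1)

x = [1.0, 2.0]
print(layer.forward(x))       # [1.0, 2.0]  (base only)
layer.set_adapter("task_a")
print(layer.forward(x))       # [3.0, 2.0]
layer.set_adapter("task_b")
print(layer.forward(x))       # [1.0, 4.0]
```

Switching adapters only changes which small `(A, B)` pair is applied, so several task-specific adapters can share one copy of the base model in memory.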
