---
title: "transformers (Python library)"
type: entity
tags: [transformers, pipeline, trainer, autoclass, models]
updated: 2026-06-23
confidence: high
sources: [raw/github_doc-docs-source-en-index-md-2.md, raw/github_doc-docs-source-en-installation-md-2.md, raw/github_doc-docs-source-en-quicktour-md.md]
---

# transformers (Python library)

`transformers` is the model-definition framework for state-of-the-art ML across text, vision, audio, video, and multimodal tasks, for both inference and training. It centralizes model definitions so that a supported model works across the ecosystem: training frameworks (Axolotl, Unsloth, DeepSpeed, FSDP, ...), inference engines (vLLM, SGLang, TGI, ...), and adjacent libraries (llama.cpp, mlx, ...). More than 1M model checkpoints are available on the Hugging Face Hub.

## Install

Tested on Python 3.10+ and PyTorch 2.4+. Install with `pip install transformers` (or `uv pip install transformers`), then verify by running a default pipeline, which downloads a model on first use:

```bash
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('hugging face is the best'))"
# [{'label': 'POSITIVE', 'score': 0.9998704791069031}]
```
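
To check an environment against the tested versions above without downloading a model, a stdlib-only sketch like the following may help. The helper names (`parse_version`, `meets_minimum`, `installed_version`) are illustrative and not part of `transformers`, and the version parsing is deliberately naive: it reads only the leading numeric components.

```python
import re
import sys
from importlib import metadata

# Minimums taken from the "Tested on" line above.
MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 4)


def parse_version(text):
    """Return the leading numeric components, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare only as many components as the minimum specifies."""
    return parse_version(installed)[: len(minimum)] >= minimum


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("python ok:", sys.version_info[:2] >= MIN_PYTHON)
for package, minimum in (("torch", MIN_TORCH), ("transformers", None)):
    version = installed_version(package)
    if version is None:
        print(f"{package}: not installed")
    elif minimum is None:
        print(f"{package}: {version}")
    else:
        print(f"{package}: {version} (ok: {meets_minimum(version, minimum)})")
```

This only reports what is installed; it does not replace the pipeline check, which also confirms that a model can be downloaded and run.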

## Three base classes

Every pretrained model is built from three classes:

- `PreTrainedConfig` — model attributes (attention heads, vocab size, ...).
- `PreTrainedModel` — the architecture (e.g. `LlamaModel` vs `LlamaForCausalLM`).
- Preprocessor — converts raw inputs to tensors (e.g. `PreTrainedTokenizer`, `ImageProcessingMixin`).

The **AutoClass** API (`AutoModelFor*`, `AutoTokenizer`) infers the architecture from the checkpoint name or path, using the checkpoint's configuration file, and loads it with `from_pretrained`. See [[concepts/transformers-basics]].
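
The three classes and the AutoClass API fit together roughly as follows. This is a minimal sketch, not the canonical loading path: the checkpoint name is an assumption (a public DistilBERT sentiment model on the Hub), `load_classifier` and `classify` are hypothetical helper names, and the imports are deferred so that defining the functions downloads nothing.

```python
# Assumed checkpoint: a public sentiment model on the Hub; swap in your own.
CHECKPOINT = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"


def load_classifier(checkpoint=CHECKPOINT):
    """Load the three pieces (config, preprocessor, model) for one checkpoint."""
    # Deferred import: nothing heavy is loaded until the function is called.
    from transformers import (
        AutoConfig,
        AutoModelForSequenceClassification,
        AutoTokenizer,
    )

    config = AutoConfig.from_pretrained(checkpoint)        # model attributes
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)  # preprocessor
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
    return config, tokenizer, model


def classify(text, tokenizer, model):
    """Run one forward pass and return the predicted label name."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")  # raw text -> tensors
    with torch.no_grad():                          # inference only, no gradients
        logits = model(**inputs).logits
    predicted_id = int(logits.argmax(dim=-1))
    return model.config.id2label[predicted_id]
```

Calling `load_classifier()` downloads the checkpoint on first use. `classify("hugging face is the best", tokenizer, model)` then returns a label name from the checkpoint's `id2label` mapping.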

## Two APIs: Pipeline and Trainer

**Pipeline** — simple, optimized inference for many tasks (text generation, image segmentation, ASR, document QA, ...). **Trainer** — a complete PyTorch training/eval loop (mixed precision, torch.compile, FlashAttention, distributed). Provide a model, dataset, preprocessor, and data collator; configure with `TrainingArguments`.

```py
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-rotten-tomatoes",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=2,
    push_to_hub=True,
)
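# `model` and `dataset` are assumed to be loaded and tokenized beforehand.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    # ... remaining arguments (eval dataset, preprocessor, data collator) elided
)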
trainer.train()
trainer.push_to_hub()
```
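
For the Pipeline half of the two APIs, a comparable sketch is below. It reuses the `sentiment-analysis` task from the install check. `confident_label` and `run_sentiment` are hypothetical helpers, the 0.9 threshold is an arbitrary illustration, and the result shape assumed (one `{'label', 'score'}` dict per input) is the one shown in that check's output.

```python
def confident_label(result, threshold=0.9):
    """Return the label if its score clears the threshold, else None."""
    return result["label"] if result["score"] >= threshold else None


def run_sentiment(texts, threshold=0.9):
    """Classify each text with the default sentiment-analysis pipeline."""
    # Deferred import: constructing the pipeline downloads a default model.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    # A list input yields one result dict per text.
    return [confident_label(result, threshold) for result in classifier(texts)]


# Result copied from the install check; no model is needed for this line.
sample = {"label": "POSITIVE", "score": 0.9998704791069031}
print(confident_label(sample))  # POSITIVE
```

Unlike the Trainer setup, the pipeline handles preprocessing and postprocessing internally, so the caller only sees text in and label/score dicts out.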

For LLM/VLM text generation (`generate`, streaming, decoding strategies) see [[concepts/running-llms-with-transformers]].

Related: [[concepts/transformers-basics]], [[entities/huggingface-hub-library]], [[concepts/datasets-basics]], [[syntheses/choosing-local-vs-inference]].