---
title: "Google Vertex AI — Gemini Provider"
type: entity
tags: [provider, gateway, gemini, oauth, config, well-established, intermediate]
created: 2026-07-01
updated: 2026-07-01
sources: ["raw/docs-guides-google-vertex.md", "raw/release-v0.18.0.md"]
confidence: high
hermes_version: "v0.18.0"
---

## Overview

The **Google Vertex AI provider** gives Hermes access to **Gemini models on Google Cloud Vertex AI**, routed over Vertex's **OpenAI-compatible endpoint** — not the native Vertex SDK/protocol. It shipped as a first-class provider in [[entities/version-v0.18.0]] (salvaging and modernizing an earlier community PR, #8427). Provider id: `vertex`. Unlike [[entities/provider-google-ai-studio]] (static `GOOGLE_API_KEY` against `generativelanguage.googleapis.com`), Vertex has **no static API key** for the standard endpoint: Hermes authenticates with OAuth2, minting and auto-refreshing short-lived access tokens. Vertex is the right choice when you want Gemini usage to draw on enterprise-grade rate limits and GCP billing/credits rather than an AI Studio key.

## Characteristics

- **Provider id:** `vertex`
- **No static API key:** every request needs a short-lived OAuth2 access token (≈1 hour TTL, `cloud-platform` scope). Hermes mints and refreshes these tokens automatically. Pasting a temporary token into a custom provider's `api_key` field does not work, because the token expires mid-session.
- **Credential resolution order:** `VERTEX_CREDENTIALS_PATH` → `GOOGLE_APPLICATION_CREDENTIALS` → Application Default Credentials (ADC). Omit both env vars to fall back to ADC.
- **Prerequisites:** a Google Cloud project with the Vertex AI API enabled and billing active; either a service-account JSON key file with the `roles/aiplatform.user` role, or ADC via `gcloud auth application-default login` (or the metadata server on a GCP VM).
- **`google-auth` dependency** — installed automatically the first time you select Vertex (lazy install), or explicitly with `pip install 'hermes-agent[vertex]'`.
- **Token refresh mechanics:** Hermes caches the minted token and refreshes it when it's within 5 minutes of expiry. If a session outlives the token and a request returns `401`, Hermes re-mints the token and retries automatically. On a long-running gateway, if ADC's refresh token has itself expired, Hermes falls back to the service-account JSON when one is configured.
- **Endpoint:** the token is handed to a standard OpenAI client pointed at `https://aiplatform.googleapis.com/v1beta1/projects/{project}/locations/{region}/endpoints/openapi`. That host serves the `global` location; regional locations use a `{region}-aiplatform.googleapis.com` host instead.
- **Model IDs require the `google/` vendor prefix.** Available models: `google/gemini-3.1-pro-preview`, `google/gemini-3-pro-preview`, `google/gemini-3-flash-preview`, `google/gemini-3.1-flash-lite-preview`, `google/gemini-2.5-pro`, `google/gemini-2.5-flash`.
- **`global` region required for Gemini 3.x previews:** the Gemini 3.x preview models are served through the `global` endpoint, and regional endpoints (e.g. `us-central1`) may return 404 for them.
- **Config split by sensitivity:** the credential path is a secret pointer and lives in `~/.hermes/.env`; project ID and region are non-secret routing settings and live in `~/.hermes/config.yaml`. `VERTEX_PROJECT_ID` and `VERTEX_REGION` env vars override `vertex.project_id` / `vertex.region` in `config.yaml` for per-shell overrides.
- **Reasoning/thinking support** — Vertex exposes Gemini's thinking budget through the OpenAI-compatible surface; Hermes maps its `reasoning_effort` setting onto `extra_body.google.thinking_config` automatically, so `reasoning_effort` works the same as on other Gemini surfaces.
- **Diagnostics:** `hermes doctor` reports whether Vertex credentials can be resolved (service-account path or ADC) and whether the provider is configured.
- Released alongside a v0.18.0 provider cleanup that removed the `google-gemini-cli` and `google-antigravity` OAuth providers.
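
The token-refresh behaviour described in the list above (cache the token, refresh within 5 minutes of expiry, re-mint and retry on `401`) can be modelled in a few lines. This is a minimal sketch, not Hermes's actual implementation: `TokenCache`, `call_with_retry`, and the `mint` / `send` callables are hypothetical names introduced for illustration. Only the 5-minute margin and the retry-on-`401` rule come from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Refresh when the token is within 5 minutes of expiry (margin taken from the text).
REFRESH_MARGIN_S = 5 * 60

# Hypothetical minting callable: returns (access_token, expiry as epoch seconds).
MintFn = Callable[[], Tuple[str, float]]


@dataclass
class TokenCache:
    """Illustrative cache; not the class Hermes uses internally."""
    mint: MintFn
    now: Callable[[], float] = time.time
    _token: Optional[str] = None
    _expiry: float = 0.0

    def get(self, force: bool = False) -> str:
        """Return the cached token, re-minting if missing, near expiry, or forced."""
        near_expiry = self._expiry - self.now() <= REFRESH_MARGIN_S
        if force or self._token is None or near_expiry:
            self._token, self._expiry = self.mint()
        return self._token


def call_with_retry(cache: TokenCache, send: Callable[[str], int]) -> int:
    """Send a request once; on HTTP 401, re-mint the token and retry exactly once."""
    status = send(cache.get())
    if status == 401:
        status = send(cache.get(force=True))
    return status
```

Injecting `now` and `mint` keeps the sketch testable without real credentials; a real caller would pass a function that mints a token through `google-auth`.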

## How to Use

```bash
# Option A — service account JSON (recommended for servers / gateways)
echo "VERTEX_CREDENTIALS_PATH=/path/to/service-account.json" >> ~/.hermes/.env

# Option B — Application Default Credentials (good for local dev)
gcloud auth application-default login

# Select Vertex as your provider
hermes model
# → Choose "More providers..." → "Google Vertex AI"
# → Enter your GCP project ID (or leave blank to use the one in your credentials)
# → Choose a region (default: global)
# → Select a Gemini model

# Start chatting
hermes chat
```

`~/.hermes/.env` — credential path (one of these; checked in this order, omit both to use ADC):

```bash
VERTEX_CREDENTIALS_PATH=/path/to/service-account.json
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```
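
The lookup order can be expressed as a small function. This is a sketch of the documented precedence only; `resolve_credential_source` is a hypothetical helper, and treating an empty or whitespace-only value as unset is an assumption rather than confirmed Hermes behaviour.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credential_source(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (source, path) following the documented order.

    Hypothetical helper: checks VERTEX_CREDENTIALS_PATH, then
    GOOGLE_APPLICATION_CREDENTIALS, then falls back to ADC.
    """
    for var in ("VERTEX_CREDENTIALS_PATH", "GOOGLE_APPLICATION_CREDENTIALS"):
        # Assumption: blank values count as "not set".
        path = env.get(var, "").strip()
        if path:
            return (var, path)
    # Neither variable is set: rely on Application Default Credentials.
    return ("ADC", None)
```

For example, with both variables set the function returns the `VERTEX_CREDENTIALS_PATH` entry, and with neither set it returns `("ADC", None)`.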

`~/.hermes/config.yaml` — non-secret routing settings:

```yaml
model:
  default: google/gemini-3-flash-preview
  provider: vertex
vertex:
  project_id: my-gcp-project   # blank → use the project embedded in the credentials
  region: global               # "global" is required for the Gemini 3.x previews
```
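
To make the relationship between these settings and the endpoint concrete, the sketch below combines the env-var overrides with the URL pattern given under Characteristics. `vertex_base_url` is a hypothetical helper, the config is assumed to be already parsed into a dict, and the `global` default mirrors the picker default. The "blank project, use the one embedded in the credentials" case is not modelled here and raises instead.

```python
from typing import Any, Mapping


def vertex_base_url(config: Mapping[str, Any], env: Mapping[str, str]) -> str:
    """Build the OpenAI-compatible base URL from config.yaml values and env overrides.

    Hypothetical helper for illustration; not part of Hermes's public API.
    """
    vertex = config.get("vertex") or {}
    # VERTEX_PROJECT_ID / VERTEX_REGION override the config.yaml values.
    project = env.get("VERTEX_PROJECT_ID") or vertex.get("project_id") or ""
    region = env.get("VERTEX_REGION") or vertex.get("region") or "global"
    if not project:
        # Hermes can read the project from the credentials; this sketch does not.
        raise ValueError("project id not set in env or config")
    # The global location uses the bare host; regional locations prefix the region.
    if region == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{region}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1beta1/projects/{project}"
        f"/locations/{region}/endpoints/openapi"
    )
```

Passing the environment in as an argument makes the per-shell override easy to see: calling the function with `{"VERTEX_REGION": "us-central1"}` switches both the host and the `locations/` path segment without touching `config.yaml`.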

Switch models mid-session with `/model`. This command does not collect new credentials, so configure Vertex with `hermes model` first:

```text
/model google/gemini-3-pro-preview
/model google/gemini-3-flash-preview
```

**Troubleshooting:**
- *"Vertex AI credentials could not be resolved"*: set `VERTEX_CREDENTIALS_PATH` in `~/.hermes/.env`, or run `gcloud auth application-default login`. If the project isn't embedded in the credentials, set `vertex.project_id` in `config.yaml`.
- *`google-auth` not installed*: run `pip install 'hermes-agent[vertex]'` (the package is also lazy-installed on first Vertex selection).
- *404 on Gemini 3.x models*: you're probably on a regional endpoint. Set `region: global` in the `vertex:` section of `config.yaml`, and unset `VERTEX_REGION` if it is set, since the env var overrides `config.yaml`.
- *403 / permission denied*: the service account (or ADC identity) needs the `roles/aiplatform.user` role on the project, and the Vertex AI API must be enabled for that project.
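
When credentials fail to resolve, it can help to check them outside Hermes. The following sketch uses the `google-auth` library directly to mint a token with the `cloud-platform` scope. `check_vertex_credentials` is a hypothetical helper, not a Hermes command, and it is an assumption that this mirrors what Hermes does internally.

```python
from typing import Any, Dict, Optional

CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def check_vertex_credentials(credentials_path: Optional[str] = None) -> Dict[str, Any]:
    """Mint a short-lived token and report the project and expiry.

    Hypothetical diagnostic helper. Requires google-auth (plus `requests`
    for the transport); imports are deferred so the module loads without them.
    """
    import google.auth
    from google.auth.transport.requests import Request
    from google.oauth2 import service_account

    if credentials_path:
        # Service-account JSON key file.
        creds = service_account.Credentials.from_service_account_file(
            credentials_path, scopes=[CLOUD_PLATFORM_SCOPE]
        )
        project = creds.project_id
    else:
        # Application Default Credentials (gcloud login or metadata server).
        creds, project = google.auth.default(scopes=[CLOUD_PLATFORM_SCOPE])

    creds.refresh(Request())  # performs the actual token exchange
    return {
        "project": project,
        "token_present": bool(creds.token),
        "expiry": creds.expiry,
    }
```

A successful call only confirms that a token can be minted. It does not verify that the identity holds `roles/aiplatform.user` or that the Vertex AI API is enabled, so a 403 can still occur afterwards.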

## Related Entities

- [[entities/provider-google-ai-studio]] — sibling Gemini provider; AI Studio uses a static `GOOGLE_API_KEY` against `generativelanguage.googleapis.com`, while Vertex uses OAuth2-minted, auto-refreshed access tokens against Vertex's OpenAI-compatible endpoint, drawing on GCP billing/credits instead of an AI Studio key
- [[entities/version-v0.18.0]] — release that shipped Google Vertex AI as a first-class provider
- [[entities/provider-openrouter]] — sibling aggregator path to Gemini and other frontier models
- [[concepts/model-switching]] — the `hermes model` picker and `/model` mid-session switching used to configure and select Vertex
- [[concepts/configuration-reference]] — `~/.hermes/.env` vs `~/.hermes/config.yaml` split for secrets vs. routing settings
