Overview
The Google Vertex AI provider gives Hermes access to Gemini models on Google Cloud Vertex AI, routed over Vertex's OpenAI-compatible endpoint rather than the native Vertex SDK/protocol. It shipped as a first-class provider in version v0.18.0, salvaging and modernizing an earlier community PR (#8427). Provider id: `vertex`. Unlike provider google ai studio, which uses a static `GOOGLE_API_KEY` against `generativelanguage.googleapis.com`, Vertex has no static API key for the standard endpoint: Hermes authenticates with OAuth2, minting and auto-refreshing short-lived access tokens. Vertex is the right choice when you want Gemini usage to draw on enterprise-grade rate limits and GCP billing/credits rather than an AI Studio key.
Characteristics
- Provider id: `vertex`
- No static API key: every request needs a short-lived OAuth2 access token (≈1 hour TTL, `cloud-platform` scope). Hermes mints and auto-refreshes these tokens automatically; pasting a temporary token into a custom provider's `api_key` field does not work because it expires mid-session.
- Credential resolution order: `VERTEX_CREDENTIALS_PATH` → `GOOGLE_APPLICATION_CREDENTIALS` → Application Default Credentials (ADC). Omit both env vars to fall back to ADC.
- Prerequisites: a Google Cloud project with the Vertex AI API enabled and billing active; either a service-account JSON key file with the `roles/aiplatform.user` role, or ADC via `gcloud auth application-default login` (or the metadata server on a GCP VM).
- `google-auth` dependency: installed automatically the first time you select Vertex (lazy install), or explicitly with `pip install 'hermes-agent[vertex]'`.
- Token refresh mechanics: Hermes caches the minted token and refreshes it when it's within 5 minutes of expiry. If a session outlives the token and a request returns `401`, Hermes re-mints the token and retries automatically. On a long-running gateway, if ADC's refresh token has itself expired, Hermes falls back to the service-account JSON when one is configured.
- Endpoint: the token is handed to a standard OpenAI client pointed at `https://aiplatform.googleapis.com/v1beta1/projects/{project}/locations/{region}/endpoints/openapi`. Regional locations use a `{region}-aiplatform.googleapis.com` host instead.
- Model IDs require the `google/` vendor prefix. Available models: `google/gemini-3.1-pro-preview`, `google/gemini-3-pro-preview`, `google/gemini-3-flash-preview`, `google/gemini-3.1-flash-lite-preview`, `google/gemini-2.5-pro`, `google/gemini-2.5-flash`.
- `global` region required for Gemini 3.x previews: the Gemini 3.x preview models are served through the `global` endpoint; regional endpoints (e.g. `us-central1`) may 404 them.
- Config split by sensitivity: the credential path is a secret pointer and lives in `~/.hermes/.env`; project ID and region are non-secret routing settings and live in `~/.hermes/config.yaml`. `VERTEX_PROJECT_ID` and `VERTEX_REGION` env vars override `vertex.project_id` / `vertex.region` in `config.yaml` for per-shell overrides.
- Reasoning/thinking support: Vertex exposes Gemini's thinking budget through the OpenAI-compatible surface; Hermes maps its `reasoning_effort` setting onto `extra_body.google.thinking_config` automatically, so `reasoning_effort` works the same as on other Gemini surfaces.
- Diagnostics: `hermes doctor` reports whether Vertex credentials can be resolved (service-account path or ADC) and whether the provider is configured.
- Released alongside a v0.18.0 provider cleanup that removed the `google-gemini-cli` and `google-antigravity` OAuth providers.
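The token refresh behaviour described in the list above (cache the token, refresh within 5 minutes of expiry, re-mint and retry once on a `401`) can be sketched roughly as follows. This is an illustration only, not Hermes's actual implementation: `TokenCache`, `call_with_retry`, and the `mint` / `send` callables are hypothetical names, and the real code would mint tokens through `google-auth` rather than an injected function.

```python
import time
from typing import Callable, Optional, Tuple

# From the description above: refresh when within 5 minutes of expiry.
REFRESH_MARGIN_SECONDS = 5 * 60


class TokenCache:
    """Hypothetical helper: caches an access token and re-mints it near expiry."""

    def __init__(
        self,
        mint: Callable[[], Tuple[str, float]],  # returns (token, expires_at as Unix time)
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._mint = mint
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def force_refresh(self) -> str:
        """Mint a new token unconditionally and cache it."""
        self._token, self._expires_at = self._mint()
        return self._token

    def get(self) -> str:
        """Return the cached token, re-minting if it is missing or close to expiry."""
        remaining = self._expires_at - self._clock()
        if self._token is None or remaining <= REFRESH_MARGIN_SECONDS:
            return self.force_refresh()
        return self._token


def call_with_retry(cache: TokenCache, send: Callable[[str], int]) -> int:
    """Call send(token) -> HTTP status; on 401, re-mint the token and retry once."""
    status = send(cache.get())
    if status == 401:
        status = send(cache.force_refresh())
    return status
```

Injecting `mint` and `clock` keeps the sketch free of network calls, so the expiry logic can be exercised with a fake clock.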
How to Use
```bash
# Option A: service account JSON (recommended for servers / gateways)
echo "VERTEX_CREDENTIALS_PATH=/path/to/service-account.json" >> ~/.hermes/.env

# Option B: Application Default Credentials (good for local dev)
gcloud auth application-default login

# Select Vertex as your provider
hermes model
# → Choose "More providers..." → "Google Vertex AI"
# → Enter your GCP project ID (or leave blank to use the one in your credentials)
# → Choose a region (default: global)
# → Select a Gemini model

# Start chatting
hermes chat
```
`~/.hermes/.env`: credential path (one of these; checked in this order, omit both to use ADC):

```
VERTEX_CREDENTIALS_PATH=/path/to/service-account.json
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```
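The lookup order for these two variables, with ADC as the fallback, might look like the sketch below. The function name `resolve_credential_source` and the `"ADC"` marker are hypothetical, and treating an empty value as unset is an assumption rather than documented Hermes behaviour.

```python
import os
from typing import Mapping, Optional, Tuple

# Checked in this order, per the description above.
CREDENTIAL_ENV_VARS = ("VERTEX_CREDENTIALS_PATH", "GOOGLE_APPLICATION_CREDENTIALS")


def resolve_credential_source(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (source, path): the first env var that is set, else ("ADC", None)."""
    env = os.environ if env is None else env
    for var in CREDENTIAL_ENV_VARS:
        path = env.get(var, "").strip()
        if path:  # assumption: an empty or whitespace-only value counts as unset
            return var, path
    return "ADC", None
```

For example, with both variables set, `VERTEX_CREDENTIALS_PATH` wins; with neither set, the caller would fall through to Application Default Credentials.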
`~/.hermes/config.yaml`: non-secret routing settings:

```yaml
model:
  default: google/gemini-3-flash-preview
  provider: vertex
vertex:
  project_id: my-gcp-project   # blank → use the project embedded in the credentials
  region: global               # "global" is required for the Gemini 3.x previews
```
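To show how these routing settings relate to the endpoint, here is a small sketch that applies the `VERTEX_PROJECT_ID` / `VERTEX_REGION` overrides and builds the OpenAI-compatible base URL. The function names are hypothetical, the config is assumed to be already parsed into a dict, and reusing the same URL path on regional hosts is an assumption: the page only states that regional locations use a `{region}-aiplatform.googleapis.com` host.

```python
import os
from typing import Mapping, Optional, Tuple


def effective_routing(
    config: Mapping[str, Mapping[str, str]],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], str]:
    """Apply env-var overrides on top of the vertex section of config.yaml."""
    env = os.environ if env is None else env
    vertex = config.get("vertex", {})
    project = env.get("VERTEX_PROJECT_ID") or vertex.get("project_id") or None
    # Assumption: fall back to "global", the default offered by the model picker.
    region = env.get("VERTEX_REGION") or vertex.get("region") or "global"
    return project, region


def vertex_base_url(project: str, region: str = "global") -> str:
    """Build the OpenAI-compatible base URL for a project and region."""
    if region == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{region}-aiplatform.googleapis.com"
    # Assumption: the path is identical on global and regional hosts.
    return f"https://{host}/v1beta1/projects/{project}/locations/{region}/endpoints/openapi"
```

A `project` of `None` here stands for the "leave blank" case, where the project embedded in the credentials would be used instead.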
Switch models mid-session (this does not collect new credentials; configure Vertex with `hermes model` first):

```
/model google/gemini-3-pro-preview
/model google/gemini-3-flash-preview
```
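Two constraints from this page are easy to trip over when switching models: Vertex model IDs need the `google/` vendor prefix, and the Gemini 3.x previews are served through the `global` region. A pre-flight check along these lines could catch both before a request is sent. `check_model_and_region` is a hypothetical helper, not part of Hermes, and matching previews by the `gemini-3` prefix and `-preview` suffix is an assumption based on the model list given here.

```python
from typing import List


def check_model_and_region(model_id: str, region: str) -> List[str]:
    """Return a list of likely configuration problems (empty if none are found)."""
    problems: List[str] = []
    if not model_id.startswith("google/"):
        problems.append("model ID is missing the google/ vendor prefix")
    # Strip the vendor prefix, if any, to inspect the bare model name.
    name = model_id.split("/", 1)[-1]
    is_gemini3_preview = name.startswith("gemini-3") and name.endswith("-preview")
    if is_gemini3_preview and region != "global":
        problems.append("Gemini 3.x previews need region 'global'; a regional endpoint may return 404")
    return problems
```

Returning a list of messages rather than raising keeps the check advisory, which matches the "may 404" wording: a regional endpoint is not guaranteed to fail.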
Troubleshooting:
- "Vertex AI credentials could not be resolved": set `VERTEX_CREDENTIALS_PATH` in `~/.hermes/.env`, or run `gcloud auth application-default login`. If the project isn't embedded in the credentials, set `vertex.project_id` in `config.yaml`.
- `google-auth` not installed: `pip install 'hermes-agent[vertex]'` (also lazy-installed on first Vertex selection).
- 404 on Gemini 3.x models: you're probably on a regional endpoint; set `region: global` in the `vertex:` section of `config.yaml` (or unset `VERTEX_REGION`).
- 403 / permission denied: the service account (or ADC identity) needs the `roles/aiplatform.user` role on the project, and the Vertex AI API must be enabled for that project.
Related Entities
- provider google ai studio: sibling Gemini provider; AI Studio uses a static `GOOGLE_API_KEY` against `generativelanguage.googleapis.com`, while Vertex uses OAuth2-minted, auto-refreshed access tokens against Vertex's OpenAI-compatible endpoint, drawing on GCP billing/credits instead of an AI Studio key
- version v0.18.0: release that shipped Google Vertex AI as a first-class provider
- provider openrouter: sibling aggregator path to Gemini and other frontier models
- model switching: the `hermes model` picker and `/model` mid-session switching used to configure and select Vertex
- configuration reference: `~/.hermes/.env` vs `~/.hermes/config.yaml` split for secrets vs. routing settings
