AI Models

General Requirements & Compliance
  • All models must comply with company data governance policies and security standards
  • Usage must adhere to vendor-specific terms of service and acceptable use policies
  • PII and sensitive data handling requires additional approval and encryption
  • Production deployments require security review and compliance sign-off
  • Cost monitoring and budget allocation must be configured before use
  • Models with specific legal requirements (shown in red below) require additional documentation

| Vendor | Model (Version) | Access | Status | Context | Output | Knowledge Cutoff | Speed |
|---|---|---|---|---|---|---|---|
| Google | Gemini 2.5 Pro (v2.5) | Google Cloud | Approved | 1M tokens | 8,192 tokens | Jan 2025 | Moderate (3–6s) |
| Google | Gemini 2.5 Flash (v2.5) | Google Cloud | Approved | 1M tokens | 8,192 tokens | Jan 2025 | Fast (1–3s) |
| Amazon Bedrock | Anthropic Claude (3.5 Sonnet) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Apr 2024 | Fast (2–4s) |
| Amazon Bedrock | Anthropic Claude (3.5 Haiku) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Jul 2024 | Very Fast (<1s) |
| Amazon Bedrock | Anthropic Claude (3.7 Sonnet) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Oct 2024 | Fast (2–4s) |
| Amazon Bedrock | Anthropic Claude (Sonnet 4) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Mar 2025 | Fast (2–4s) |
| Amazon Bedrock | Anthropic Claude (Sonnet 4.5) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Jul 2024 | Fast (2–4s) |
| Amazon Bedrock | Anthropic Claude (Haiku 4.5) | AWS Bedrock | Approved | 200K tokens | 8,192 tokens | Feb 2025 | Very Fast (<1s) |
| Amazon Bedrock | Cohere Command (R/R+) | AWS Bedrock | Approved | 128K tokens | 4,096 tokens | Early 2024 | Fast (2–4s) |
| Amazon Bedrock | Cohere Embed (3) | AWS Bedrock | Approved | N/A | Embeddings | Not applicable | Very Fast (<1s) |
| Amazon Bedrock | Cohere Embed (4) | AWS Bedrock | Approved | N/A | Embeddings | Not applicable | Very Fast (<1s) |
| Amazon Bedrock | Cohere Rerank (3.5) | AWS Bedrock | Approved | N/A | Rankings | Not applicable | Very Fast (<1s) |
| Amazon Bedrock | Llama (2) | AWS Bedrock | Approved | 4K tokens | 2,048 tokens | Sep 2022 | Fast (2–4s) |
| Amazon Bedrock | Llama (3) | AWS Bedrock | Approved | 8K tokens | 4,096 tokens | Mar 2023 | Fast (2–4s) |
| Amazon Bedrock | Mistral (7B) | AWS Bedrock | Approved | 32K tokens | 4,096 tokens | Early 2023 | Very Fast (<2s) |
| Amazon Bedrock | Mistral (8x7B) | AWS Bedrock | Approved | 32K tokens | 4,096 tokens | Early 2024 | Very Fast (<2s) |
| GCP | Mistral OCR (25.05) | GCP | Pending | N/A | Text extraction | Not applicable | Fast (1–3s) |
| Amazon Bedrock | OpenAI GPT OSS (20B) | AWS Bedrock | Approved | 64K tokens | 4,096 tokens | Jun 2024 | Fast (2–4s) |
| Amazon Bedrock | OpenAI GPT OSS (120B) | AWS Bedrock | Approved | 64K tokens | 4,096 tokens | Jun 2024 | Slower (5–10s) |
| Amazon Bedrock | DeepSeek (R1) | AWS Bedrock | Approved | 64K tokens | 4,096 tokens | Mid 2024 | Moderate (3–5s) |
| GCP | Imagen (4) | GCP | Approved | N/A | Images | Not applicable | Fast (3–5s) |
| Black Forest Labs | Flux (.1 schnell) | GCP | Requested | N/A | Images | Not applicable | Moderate (4–7s) |
| Krea | Flux (.1 krea) | Direct API | Requested | N/A | Images | Not applicable | Fast (3–5s) |
| GCP | Veo (3, 3 Fast) | GCP | Requested | N/A | Video | Not applicable | Moderate (4–7s) |
| Azure | GPT (5) | Azure | Requested | N/A | Text | Not applicable | Slow (30–60s) |
| Self-hosted | Qwen3-coder (480B-A35B-Instruct) | Self-hosted / Ollama | Approved | 128K tokens | 8,192 tokens | Mid 2024 | Moderate (4–8s) |
| Self-hosted | Qwen3-coder (30B-A3B-Instruct) | Self-hosted / Ollama | Approved | 64K tokens | 4,096 tokens | Mid 2024 | Fast (2–4s) |
| Self-hosted | Codellama (7b, 13b, 34b, 70b) | Self-hosted / Ollama | Pending determination | 16K tokens | 4,096 tokens | Early 2023 | Fast (2–5s) |
| Self-hosted | Codegemma (2b, 7b) | Self-hosted / Ollama | Approved | 8K tokens | 2,048 tokens | Early 2024 | Very Fast (<2s) |
| Self-hosted | Codestral (22b) | Self-hosted / Ollama | Denied | 32K tokens | 4,096 tokens | Mid 2024 | Moderate (3–6s) |
| Self-hosted | DeepSeek-coder (v2) | Self-hosted / Ollama | Approved | 64K tokens | 4,096 tokens | Mid 2024 | Moderate (3–5s) |
| Self-hosted | Granite-code | Self-hosted / Ollama | Requested | 32K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | Llama4Scout | Self-hosted / Ollama | Requested | 32K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | Llama4Maverick | Self-hosted / Ollama | Requested | 32K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | Llama3-gradient | Self-hosted / Ollama | Requested | 16K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | Kimi-K2 | Self-hosted / Ollama | Requested | 200K tokens | 8,192 tokens | TBD | TBD |
| Self-hosted | GPT-OSS (20B) | Self-hosted / Ollama | Requested | 64K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | GPT-OSS (120B) | Self-hosted / Ollama | Requested | 64K tokens | 4,096 tokens | TBD | TBD |
| Azure | GPT-OSS-Safeguard (120B) | Azure | Requested | 128K tokens | 8,192 tokens | TBD | TBD |
| Azure | GPT-OSS-Safeguard (20B) | Azure | Requested | 64K tokens | 4,096 tokens | TBD | TBD |
| Self-hosted | BGE-BASE-EN (1.5) | Self-hosted | Approved | N/A | Embeddings | Not applicable | Very Fast (<1s) |
| Self-hosted | clip-ViT-B-32 (32) | Self-hosted | Denied | N/A | Embeddings | Not applicable | Fast (1–3s) |
| Amazon Bedrock | Anthropic Claude (Opus 4.5) | AWS Bedrock | Approved | 200K tokens | 16,384 tokens | Aug 2024 | Moderate (4–8s) |
| GCP | Gemini (3) | GCP | Approved | 1M tokens | 8,192 tokens | Feb 2025 | Moderate (3–6s) |
| GCP | NanoBanana (1) | GCP | Requested | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
| Self-hosted | HunyuanOCR | Self-hosted | Requested | 8K tokens | 2,048 tokens | TBD | Very Fast (<1s) |
| Self-hosted | SigLIP-base (16-384) | Self-hosted | Approved | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
| Self-hosted | all-MiniLM-L6-v2 (2) | Self-hosted | Approved | 8K tokens | 2,048 tokens | TBD | Very Fast (<1s) |
| Self-hosted | openai/whisper-base (20250625) | Self-hosted | Approved | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
| Self-hosted | Salesforce/blip-image (16-Dec-25) | Self-hosted | Approved | 8K tokens | 2,048 tokens | TBD | Very Fast (<1s) |
| Amazon Bedrock | Nova 2 Lite | AWS Bedrock | Approved | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
| Amazon Bedrock | Nova 2 Pro | AWS Bedrock | Approved | 8K tokens | 2,048 tokens | TBD | Very Fast (<1s) |
| Amazon Bedrock | Nova 2 Omni | AWS Bedrock | Approved | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
| Amazon Bedrock | Nova 2 Sonic | AWS Bedrock | Approved | 8K tokens | 2,048 tokens | TBD | Very Fast (<1s) |
| Amazon Bedrock | Nova Multimodal Embeddings | AWS Bedrock | Approved | 4K tokens | 1,024 tokens | TBD | Very Fast (<1s) |
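
Example Usage

Most of the approved text models above are reached through AWS Bedrock. A minimal sketch of a call through the Bedrock Converse API follows, assuming boto3 credentials are already configured; the region and the exact Bedrock model ID (or inference profile) are illustrative assumptions that depend on your account, not values taken from this catalog.

```python
import boto3

# Assumed region and model ID for the approved Claude 3.5 Sonnet entry;
# confirm the exact ID or inference profile enabled in your account.
REGION = "us-east-1"
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

client = boto3.client("bedrock-runtime", region_name=REGION)

response = client.converse(
    modelId=MODEL_ID,
    messages=[{"role": "user", "content": [{"text": "Summarize our data governance policy in one sentence."}]}],
    # Keep maxTokens at or below the model's output limit listed in the table.
    inferenceConfig={"maxTokens": 1024, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```

Self-hosted entries are served locally through Ollama, which listens on port 11434 by default. The sketch below assumes the approved Codegemma model has been pulled under the tag codegemma:7b; check `ollama list` for the tags actually available on your host.

```python
import requests

# Assumed local tag for the approved Codegemma (7b) entry; adjust to the tag
# your Ollama host actually serves.
payload = {
    "model": "codegemma:7b",
    "messages": [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}],
    "stream": False,
}

resp = requests.post("http://localhost:11434/api/chat", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```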