Live: Models
Open models forproduction agents
One OpenAI-compatible endpoint for reasoning, coding, multimodal, and low-latency workloads.20+ models serverless today. 50+ more as dedicated endpoints.
Showing 80 models
Serverless pricing is the live Token Factory rate card, per million tokens unless otherwise noted, same rates as the Token Factory catalog. Dedicated endpoints are billed per GPU-hour. See pricing or compare options in serverless vs dedicated.
Don't see a model you're interested in?