Start today. Scale on the same account
Start with OpenAI-compatible inference, keep your agent stack portable, and move into dedicated GPUs when traffic proves out.
Built for AI-native startups shipping agents and multimodal products.
Start building in minutes
Point the OpenAI SDK at FlexAI, swap the model, and ship. Free credit covers your first calls.
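A minimal sketch of that switch with the OpenAI Python SDK. It assumes the SDK base URL is the endpoint from the curl example without the `/chat/completions` suffix, and that the key is stored in `FLEXAI_API_KEY`. The helper names `client_kwargs`, `chat_kwargs`, and `main` are illustrative, not part of any FlexAI library.

```python
import os

# Assumption: the SDK base URL is the curl endpoint minus "/chat/completions".
FLEXAI_BASE_URL = "https://tokens.flex.ai/v1"


def client_kwargs(api_key=None):
    """Keyword arguments for openai.OpenAI, pointed at FlexAI."""
    key = api_key or os.environ["FLEXAI_API_KEY"]
    return {"base_url": FLEXAI_BASE_URL, "api_key": key}


def chat_kwargs(model, prompt):
    """Keyword arguments for client.chat.completions.create."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def main():
    # Imported here so the helpers above work without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**client_kwargs())
    response = client.chat.completions.create(
        **chat_kwargs("Meta-Llama-3.1-8B-Instruct-FP8", "Hello from FlexAI")
    )
    print(response.choices[0].message.content)
```

Calling `main()` sends one request. Swapping the model means changing only the first argument to `chat_kwargs`.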
curl https://tokens.flex.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLEXAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3.1-8B-Instruct-FP8",
    "messages": [{"role": "user", "content": "Hello from FlexAI"}]
  }'
Built for what you're shipping
The agents startups ship most, each a multi-model pipeline behind one key.
Coding agents
Agents that generate, review, and repair code across your repo.
- Generate: Qwen3 Coder 30B A3B
- Review & fast edits: GPT-OSS 20B
Start with Qwen3 Coder 30B A3B from $0.063/M
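The two-stage coding pipeline above could be wired up roughly as follows. This is a sketch under assumptions: the model ID strings are hypothetical placeholders (the page gives display names only), and `complete` stands in for any function that sends a prompt to a model and returns its text.

```python
# Hypothetical model IDs: the page lists display names only, so check the
# model catalog for the exact strings before using these.
GENERATE_MODEL = "qwen3-coder-30b-a3b"
REVIEW_MODEL = "gpt-oss-20b"


def generate_and_review(task, complete):
    """Run generate -> review. `complete(model, prompt)` must return text."""
    draft = complete(GENERATE_MODEL, f"Write code for this task:\n{task}")
    review = complete(REVIEW_MODEL, f"Review this code and list problems:\n{draft}")
    return {"draft": draft, "review": review}
```

Passing `complete` in as a parameter keeps the pipeline independent of any one client, so both stages can share a single API key and be tested with a stub.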
See the pipeline
Support agents
Agents that triage, answer, and resolve customer conversations.
- Transcribe: Whisper Large V3 Turbo
- Read attachments: PaddleOCR-VL 1.5
- Retrieve: BGE-M3
- Respond: GPT-OSS 20B
- Speak: Kokoro-82M
Start with GPT-OSS 20B from $0.027/M
See the pipeline
Research agents
Agents that retrieve, reason over, and synthesize large source sets.
- Retrieve: BGE-M3
- Reason: DeepSeek V3.2
- Summarize: DeepSeek V4 Flash
Start with DeepSeek V3.2 from $0.225/M
See the pipeline
Why startups start on FlexAI
Ship on day one, on a platform that stays cheap as you scale.
Serverless, per-token
Start instantly: pay only for the tokens you use, with no provisioning and no minimums.
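As a rough illustration of per-token billing, the sketch below estimates spend from a token count. It reads the "from" prices quoted on this page as USD per million tokens, which is an assumption. The page does not say whether those rates apply to input tokens, output tokens, or both, so treat the result as a lower bound.

```python
from decimal import Decimal

# "From" prices quoted on this page, read as USD per million tokens (assumption).
PRICE_PER_MILLION = {
    "Qwen3 Coder 30B A3B": Decimal("0.063"),
    "GPT-OSS 20B": Decimal("0.027"),
    "DeepSeek V3.2": Decimal("0.225"),
}


def estimate_cost(model, tokens):
    """Estimated USD cost for `tokens` tokens at the model's quoted rate."""
    return PRICE_PER_MILLION[model] * Decimal(tokens) / Decimal(1_000_000)


# Example: 50 million tokens through GPT-OSS 20B.
print(estimate_cost("GPT-OSS 20B", 50_000_000))
```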
Performance
Streaming on every model, with low-latency serving tuned per model. See live per-model performance.
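A sketch of consuming a streamed response, assuming the stream follows the OpenAI chat-completions chunk shape (`chunk.choices[0].delta.content`). The helper name `stream_text` is illustrative.

```python
def stream_text(chunks, on_delta=None):
    """Concatenate text deltas from an OpenAI-style chat completion stream."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some servers send chunks with no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks carry no text
            parts.append(delta)
            if on_delta:
                on_delta(delta)
    return "".join(parts)
```

With the OpenAI Python SDK, `chunks` would be the iterator returned by `client.chat.completions.create` when called with `stream=True`, and `on_delta` could print each piece as it arrives.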
Competitive cost
Competitive per-token pricing on every model, verifiable against the public source. See pricing.
Already shipping agents? Bring them to the Agent SDK (in trial): portable skills and multi-model routing on the same key.
The FlexAI Startup Program
Apply-only, one program, three stages. Start with $1,000 in Token Factory credit, then get 50% off your first 3 months and 30% off the next 3, then up to $20K in savings on dedicated compute as you grow. You're never asked to re-apply or migrate.
Token Factory entry
- $1,000 in credit for Token Factory, our serverless per-token tier
- Then 50% off your first 3 months, 30% off the next 3
- Competitive pricing for the life of your account
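The discount schedule above can be worked through as simple arithmetic. This sketch assumes months are counted from the start of the discount period and leaves the $1,000 credit out, since the page does not specify how credit and discounts interact. Amounts are in integer cents to avoid rounding drift.

```python
def discount_percent(month):
    """Program discount for a 1-indexed month on the Token Factory tier."""
    if 1 <= month <= 3:
        return 50
    if 4 <= month <= 6:
        return 30
    return 0


def discounted_bill_cents(list_price_cents, month):
    """Bill for one month after the program discount, in integer cents."""
    return list_price_cents * (100 - discount_percent(month)) // 100


# Example: $400.00 of list-price usage every month for 8 months.
bills = [discounted_bill_cents(40_000, m) for m in range(1, 9)]
print(sum(bills) / 100)  # 2240.0
```

In this example the 8 months cost $2,240 against $3,200 at list price: three months at $200, three at $280, and two at the full $400.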
Dedicated compute
- As your workload proves out, you qualify for dedicated GPU-hours
- 50% off published dedicated and managed fine-tuning prices for 6 months, up to $20K of savings
- One account and API as you scale
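The interaction between the 50% discount and the $20K savings cap can be sketched as below. The behavior once the cap is reached is an assumption: the sketch bills the remainder at list price and gives a partial discount in the month the cap runs out.

```python
CAP_CENTS = 20_000 * 100  # "up to $20K of savings"
DISCOUNT_MONTHS = 6
DISCOUNT_PERCENT = 50


def dedicated_bills_cents(monthly_list_cents):
    """Apply 50% off for the first 6 months until total savings hit the cap."""
    remaining = CAP_CENTS
    bills = []
    for index, list_cents in enumerate(monthly_list_cents):
        saving = 0
        if index < DISCOUNT_MONTHS:
            # Assumption: the discount shrinks to whatever cap is left.
            saving = min(list_cents * DISCOUNT_PERCENT // 100, remaining)
        remaining -= saving
        bills.append(list_cents - saving)
    return bills


# Example: $10,000 of list-price dedicated usage per month for 6 months.
print(dedicated_bills_cents([1_000_000] * 6))
```

Under these assumptions, a team spending $10,000 per month at list price saves $5,000 in each of the first four months, reaches the $20K cap, and pays list price in months five and six.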
Committed use
- Lock in committed-use pricing when you're ready
- A designed continuation, not a second cliff
- No minimum spend
Apply-only. We look for AI-native teams with a real, continuous workload:
- $1M+ in venture or accelerator funding, or a validated production workload
- AI-native product: inference is central, not a side feature
- At least one ML/AI engineer on staff
- A continuous inference or agent workload in production, not project-based
Not there yet? Explore Token Factory and apply when your workload is ready. Applications are reviewed within 10 business days.
One account, the whole way
No re-platform when your workload grows.
Ready to get started?
Book an intro call to join the Startup Program: $1,000 in Token Factory credit, stacked discounts, and free monthly credits.