All Providers
31 Models
4 Families
1M Max Context
$0.01–$1.75 Input Cost/MTok
$0–$3.60 Output Cost/MTok
Capabilities
Reasoning
Tool Calling
Attachments
Open Weights
Setup
Set the following environment variable to use Nebius Token Factory:
NEBIUS_API_KEY Models (31)
| Model | Model ID | Input Cost | Output Cost | Context | Capabilities |
|---|---|---|---|---|---|
| Qwen3-Embedding-8B text-embedding | Qwen/Qwen3-Embedding-8B | $0.01/MTok | $0/MTok | 32.8K | Open |
| Meta-Llama-3.1-8B-Instruct | meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02/MTok | $0.06/MTok | 128K | Tools Open |
| Gemma-2-2b-it | google/gemma-2-2b-it | $0.02/MTok | $0.06/MTok | 8.2K | Open |
| Nemotron-3-Nano-Omni | nvidia/Nemotron-3-Nano-Omni | $0.06/MTok | $0.24/MTok | 65.5K | Reasoning Tools Open |
| Nemotron-3-Nano-30B-A3B | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | $0.06/MTok | $0.24/MTok | 32K | Tools Open |
| Qwen3-30B-A3B-Instruct-2507 | Qwen/Qwen3-30B-A3B-Instruct-2507 | $0.10/MTok | $0.30/MTok | 128K | Tools Open |
| Qwen3-32B | Qwen/Qwen3-32B | $0.10/MTok | $0.30/MTok | 128K | Tools Open |
| Gemma-3-27b-it | google/gemma-3-27b-it | $0.10/MTok | $0.30/MTok | 110K | Tools Open |
| gpt-oss-120b-fast | openai/gpt-oss-120b-fast | $0.10/MTok | $0.50/MTok | 8K | Reasoning Tools Open |
| Hermes-4-70B | NousResearch/Hermes-4-70B | $0.13/MTok | $0.40/MTok | 128K | Reasoning Tools Open |
| Llama-3.3-70B-Instruct | meta-llama/Llama-3.3-70B-Instruct | $0.13/MTok | $0.40/MTok | 128K | Tools Open |
| Qwen3-Next-80B-A3B-Thinking-fast | Qwen/Qwen3-Next-80B-A3B-Thinking-fast | $0.15/MTok | $1.20/MTok | 8K | Reasoning Tools Open |
| Qwen3-Next-80B-A3B-Thinking | Qwen/Qwen3-Next-80B-A3B-Thinking | $0.15/MTok | $1.20/MTok | 128K | Reasoning Tools Open |
| gpt-oss-120b | openai/gpt-oss-120b | $0.15/MTok | $0.60/MTok | 128K | Reasoning Tools Open |
| INTELLECT-3 | PrimeIntellect/INTELLECT-3 | $0.20/MTok | $1.10/MTok | 128K | Tools Open |
| Qwen3 235B A22B Instruct 2507 qwen | Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.20/MTok | $0.60/MTok | 262.1K | Reasoning Tools |
| Qwen2.5-VL-72B-Instruct | Qwen/Qwen2.5-VL-72B-Instruct | $0.25/MTok | $0.75/MTok | 128K | Tools Open |
| MiniMax-M2.5 | MiniMaxAI/MiniMax-M2.5 | $0.30/MTok | $1.20/MTok | 196.6K | Reasoning Tools Open |
| MiniMax-M2.5-fast | MiniMaxAI/MiniMax-M2.5-fast | $0.30/MTok | $1.20/MTok | 8K | Reasoning Tools Open |
| DeepSeek-V3.2 | deepseek-ai/DeepSeek-V3.2 | $0.30/MTok | $0.45/MTok | 163K | Reasoning Tools Open |
| Nemotron-3-Super-120B-A12B | nvidia/nemotron-3-super-120b-a12b | $0.30/MTok | $0.90/MTok | 256K | Reasoning Tools Open |
| DeepSeek-V3.2-fast | deepseek-ai/DeepSeek-V3.2-fast | $0.40/MTok | $2/MTok | 8K | Reasoning Tools Open |
| Kimi-K2.5 kimi | moonshotai/Kimi-K2.5 | $0.50/MTok | $2.50/MTok | 256K | Reasoning Tools Open |
| Kimi-K2.5-fast kimi | moonshotai/Kimi-K2.5-fast | $0.50/MTok | $2.50/MTok | 256K | Reasoning Tools Open |
| Qwen3-235B-A22B-Thinking-2507-fast | Qwen/Qwen3-235B-A22B-Thinking-2507-fast | $0.50/MTok | $2/MTok | 8K | Reasoning Tools Open |
| Llama-3.1-Nemotron-Ultra-253B-v1 | nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | $0.60/MTok | $1.80/MTok | 128K | Tools Open |
| Qwen3.5-397B-A17B | Qwen/Qwen3.5-397B-A17B | $0.60/MTok | $3.60/MTok | 262.1K | Reasoning Tools Open |
| Qwen3.5-397B-A17B-fast | Qwen/Qwen3.5-397B-A17B-fast | $0.60/MTok | $3.60/MTok | 8K | Reasoning Tools Open |
| GLM-5 | zai-org/GLM-5 | $1/MTok | $3.20/MTok | 200K | Reasoning Tools |
| Hermes-4-405B | NousResearch/Hermes-4-405B | $1/MTok | $3/MTok | 128K | Reasoning Tools Open |
| DeepSeek V4 Pro deepseek-thinking | deepseek-ai/DeepSeek-V4-Pro | $1.75/MTok | $3.50/MTok | 1M | Reasoning Tools Open |