9 Models
5 Families
125K Max Context
$0.01–$0.20 Input Cost/MTok
$0–$0.50 Output Cost/MTok
Capabilities
Reasoning
Tool Calling
Attachments
Open Weights
Setup
Set the following environment variable to use Inference:
INFERENCE_API_KEY

Models (9)
| Model | Family | Model ID | Input Cost | Output Cost | Context | Capabilities |
|---|---|---|---|---|---|---|
| Qwen 3 Embedding 4B | qwen | qwen/qwen3-embedding-4b | $0.01/MTok | $0/MTok | 32K | Open |
| Llama 3.2 1B Instruct | llama | meta/llama-3.2-1b-instruct | $0.01/MTok | $0.01/MTok | 16K | Tools, Open |
| Llama 3.2 3B Instruct | llama | meta/llama-3.2-3b-instruct | $0.02/MTok | $0.02/MTok | 16K | Tools, Open |
| Llama 3.1 8B Instruct | llama | meta/llama-3.1-8b-instruct | $0.03/MTok | $0.03/MTok | 16K | Tools, Open |
| Mistral Nemo 12B Instruct | mistral-nemo | mistral/mistral-nemo-12b-instruct | $0.04/MTok | $0.10/MTok | 16K | Tools, Open |
| Llama 3.2 11B Vision Instruct | llama | meta/llama-3.2-11b-vision-instruct | $0.06/MTok | $0.06/MTok | 16K | Tools, Open |
| Osmosis Structure 0.6B | osmosis | osmosis/osmosis-structure-0.6b | $0.10/MTok | $0.50/MTok | 4K | Tools, Open |
| Google Gemma 3 | gemma | google/gemma-3 | $0.15/MTok | $0.30/MTok | 125K | Tools, Open |
| Qwen 2.5 7B Vision Instruct | qwen | qwen/qwen-2.5-7b-vision-instruct | $0.20/MTok | $0.20/MTok | 125K | Tools, Open |
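
The per-MTok prices in the table can be turned into a rough cost estimate for a given workload. The sketch below is illustrative only: it assumes prices are billed linearly per token, and the helper names `load_api_key` and `estimate_cost` are hypothetical, not part of any Inference SDK. It checks that INFERENCE_API_KEY is set but makes no network calls, since this page does not describe the API endpoint.

```python
import os

# USD per million tokens (input, output), copied from the table above.
PRICES_PER_MTOK = {
    "qwen/qwen3-embedding-4b": (0.01, 0.00),
    "meta/llama-3.2-1b-instruct": (0.01, 0.01),
    "meta/llama-3.2-3b-instruct": (0.02, 0.02),
    "meta/llama-3.1-8b-instruct": (0.03, 0.03),
    "mistral/mistral-nemo-12b-instruct": (0.04, 0.10),
    "meta/llama-3.2-11b-vision-instruct": (0.06, 0.06),
    "osmosis/osmosis-structure-0.6b": (0.10, 0.50),
    "google/gemma-3": (0.15, 0.30),
    "qwen/qwen-2.5-7b-vision-instruct": (0.20, 0.20),
}


def load_api_key(env=None):
    """Return the INFERENCE_API_KEY value, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("INFERENCE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INFERENCE_API_KEY is not set")
    return key


def estimate_cost(model_id, input_tokens, output_tokens):
    """Estimate the USD cost of a request, assuming linear per-token billing."""
    if model_id not in PRICES_PER_MTOK:
        raise KeyError(f"unknown model ID: {model_id}")
    input_price, output_price = PRICES_PER_MTOK[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Mistral Nemo 12B Instruct.
    cost = estimate_cost("mistral/mistral-nemo-12b-instruct", 2_000_000, 500_000)
    print(f"estimated cost: ${cost:.2f}")
```

For the workload in the `__main__` block, the arithmetic is 2 × $0.04 + 0.5 × $0.10, or about $0.13. Actual billing may differ (minimum charges, rounding, or price changes), so treat the result as an estimate.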