9 Models
5 Families
125K Max Context
$0.01–$0.20 Input Cost/MTok
$0–$0.50 Output Cost/MTok

Capabilities

Reasoning
Tool Calling
Attachments
Open Weights

Setup

Set the following environment variable to use Inference:

INFERENCE_API_KEY
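
As a minimal sketch of how an application might pick up this variable, the Python helper below reads `INFERENCE_API_KEY` from the environment and fails early if it is missing. The function name `load_inference_api_key` is hypothetical and is not part of any official SDK. How the key is then passed to the service (header name, endpoint) is not specified here, so the sketch stops at loading it.

```python
import os


def load_inference_api_key(env=os.environ):
    """Return the key stored in INFERENCE_API_KEY.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to test. (Hypothetical helper, not an SDK call.)
    """
    key = env.get("INFERENCE_API_KEY", "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("INFERENCE_API_KEY is not set")
    return key
```

Calling `load_inference_api_key()` once at startup surfaces a missing key before any request is made.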

Models (9)

| Model | Family | Model ID | Input Cost | Output Cost | Context | Capabilities |
|---|---|---|---|---|---|---|
| Qwen 3 Embedding 4B | qwen | qwen/qwen3-embedding-4b | $0.01/MTok | $0/MTok | 32K | Open |
| Llama 3.2 1B Instruct | llama | meta/llama-3.2-1b-instruct | $0.01/MTok | $0.01/MTok | 16K | Tools, Open |
| Llama 3.2 3B Instruct | llama | meta/llama-3.2-3b-instruct | $0.02/MTok | $0.02/MTok | 16K | Tools, Open |
| Llama 3.1 8B Instruct | llama | meta/llama-3.1-8b-instruct | $0.03/MTok | $0.03/MTok | 16K | Tools, Open |
| Mistral Nemo 12B Instruct | mistral-nemo | mistral/mistral-nemo-12b-instruct | $0.04/MTok | $0.10/MTok | 16K | Tools, Open |
| Llama 3.2 11B Vision Instruct | llama | meta/llama-3.2-11b-vision-instruct | $0.06/MTok | $0.06/MTok | 16K | Tools, Files, Open |
| Osmosis Structure 0.6B | osmosis | osmosis/osmosis-structure-0.6b | $0.10/MTok | $0.50/MTok | 4K | Tools, Open |
| Google Gemma 3 | gemma | google/gemma-3 | $0.15/MTok | $0.30/MTok | 125K | Tools, Files, Open |
| Qwen 2.5 7B Vision Instruct | qwen | qwen/qwen-2.5-7b-vision-instruct | $0.20/MTok | $0.20/MTok | 125K | Tools, Files, Open |
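
Because prices are quoted per million tokens (MTok), the cost of a workload is the token count divided by 1,000,000 and multiplied by the listed price, computed separately for input and output. The Python sketch below illustrates that arithmetic for three of the models listed above. The dictionary and the function name `estimate_cost_usd` are illustrative assumptions, and the result ignores any minimums, rounding, or discounts a real invoice might apply.

```python
# (input, output) prices in USD per million tokens, copied from the model table.
PRICES_PER_MTOK = {
    "meta/llama-3.1-8b-instruct": (0.03, 0.03),
    "mistral/mistral-nemo-12b-instruct": (0.04, 0.10),
    "google/gemma-3": (0.15, 0.30),
}


def estimate_cost_usd(model_id, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from per-MTok list prices."""
    input_price, output_price = PRICES_PER_MTOK[model_id]
    # Prices are per 1,000,000 tokens, so scale the weighted sum down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 2M input tokens and 500K output tokens on Mistral Nemo 12B Instruct:
# 2 * $0.04 + 0.5 * $0.10
cost = estimate_cost_usd("mistral/mistral-nemo-12b-instruct", 2_000_000, 500_000)
print(f"${cost:.2f}")  # prints $0.13
```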