Qwen3 VL 8B Instruct
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...
Available Providers (3)
| Provider | Model ID | Input Cost | Output Cost | Context | Max Output | Docs |
|---|---|---|---|---|---|---|
| | qwen3-vl-8b-instruct | $0.08/MTok | $0.50/MTok | 131.1K | 8.2K | |
| | qwen/qwen3-vl-8b-instruct | $0.08/MTok | $0.50/MTok | 131.1K | 32.8K | |
| | qwen/qwen3-vl-8b-instruct | $0.08/MTok | $0.50/MTok | 131.1K | 32.8K | |
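
As a rough illustration of what the listed rates mean per request, here is a minimal Python sketch that turns token counts into an estimated cost. The prices are the ones in the table above ($0.08 and $0.50 per million tokens). The function name `estimate_cost_usd` is hypothetical, and the sketch assumes simple linear per-token billing with image and video input already counted in the input token total. Actual provider billing may differ.

```python
# Rates copied from the provider table (USD per million tokens).
INPUT_USD_PER_MTOK = 0.08
OUTPUT_USD_PER_MTOK = 0.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumes linear per-token pricing (an assumption, not a documented
    billing rule) and that multimodal input is already included in
    input_tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply comes to about $0.009, with most of the cost coming from the input side despite the lower input rate.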
Capabilities
Reasoning
Tool Calling
Attachments
Open Weights
Structured Output