Molmo 2 8B

allenai Attachments

Molmo2-8B is an open vision-language model developed by the Allen Institute for AI (Ai2) as part of the Molmo2 family. It supports image, video, and multi-image understanding and grounding. It is built on Qwen3-8B with SigLIP 2 as its vision backbone. It outperforms other open-weight, open-data models on short-video, counting, and captioning tasks while remaining competitive on long-video tasks.

Providers: 1
Released: Feb 14, 2026
Input Modalities: text, image
Output Modalities: text

Available Providers (1)

Provider  Model ID            Input Cost  Output Cost  Context  Max Output  Docs
NanoGPT   allenai/molmo-2-8b  $0.20/MTok  $0.20/MTok   36.9K    36.9K
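
As a rough illustration of what the listed rates mean in practice, the sketch below estimates the cost of a single request and checks it against the context limit. The function names are hypothetical, "36.9K" is assumed to mean 36,900 tokens, and the assumption that input and output tokens share one context window is a guess based on the Context and Max Output values being equal. It is not confirmed by the listing.

```python
# Rates and limits copied from the provider table; everything else is illustrative.
INPUT_USD_PER_MTOK = 0.20   # listed input cost per million tokens
OUTPUT_USD_PER_MTOK = 0.20  # listed output cost per million tokens
CONTEXT_TOKENS = 36_900     # "36.9K", assuming K means 1,000 tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-MTok rates."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the context limit.

    Assumption: input and output tokens count against the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_TOKENS


if __name__ == "__main__":
    # Example: a 30,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(30_000, 2_000):.4f}")
    print(fits_context(30_000, 2_000))
```

Actual billing may differ, for example if image inputs are tokenized or priced separately, so treat the result as an estimate only.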

Capabilities

Reasoning
Tool Calling
Attachments
Open Weights
Structured Output