
Meta: Llama 3.2 11B Vision Instruct


Llama 3.2 11B Vision is an 11-billion-parameter multimodal model designed for tasks that combine visual and textual data. It excels at image captioning and visual question answering, bridging language generation and visual reasoning. Pre-trained on a large dataset of image-text pairs, it handles complex image-analysis tasks with high accuracy. Because it integrates visual understanding with language processing, it suits applications that need combined visual-linguistic AI, such as content creation, AI-driven customer service, and research. See the [original model card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md) for details. Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Providers: 1
Released: Sep 25, 2024
Input Modalities: text, image
Output Modalities: text

Available Providers (1)

| Provider | Model ID | Input Cost | Output Cost | Context | Max Output | Docs |
|----------|----------|------------|-------------|---------|------------|------|
| Kilo Gateway | meta-llama/llama-3.2-11b-vision-instruct | $0.05/MTok | $0.05/MTok | 131.1K | 16.4K | |
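Gateways of this kind commonly expose an OpenAI-compatible chat completions endpoint. The sketch below shows what a minimal vision request might look like under that assumption; the base URL, the KILO_API_KEY environment variable, and the image URL are illustrative placeholders rather than documented Kilo Gateway values. Only the model ID comes from the table above.

```python
# Minimal sketch of a vision request, assuming an OpenAI-compatible
# /chat/completions endpoint. BASE_URL, KILO_API_KEY, and the image URL
# are hypothetical placeholders, not documented Kilo Gateway values.
import os
import requests

BASE_URL = "https://api.example-gateway.com/v1"  # hypothetical endpoint

payload = {
    "model": "meta-llama/llama-3.2-11b-vision-instruct",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    "max_tokens": 256,
}

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['KILO_API_KEY']}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
# OpenAI-style response shape (also an assumption for this gateway).
print(response.json()["choices"][0]["message"]["content"])
```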

Capabilities

- Reasoning
- Tool Calling
- Attachments
- Open Weights
- Structured Output (see the sketch after this list)
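If the gateway follows the OpenAI `response_format` convention for the Structured Output capability listed above, a constrained-JSON request might look like the following sketch. The endpoint, schema, and field names are assumptions for illustration, not documented Kilo Gateway behavior.

```python
# Illustrative structured-output request, assuming OpenAI-style
# "response_format" with a JSON schema. BASE_URL, KILO_API_KEY, and the
# schema are hypothetical, not documented Kilo Gateway values.
import os
import requests

BASE_URL = "https://api.example-gateway.com/v1"  # hypothetical endpoint

payload = {
    "model": "meta-llama/llama-3.2-11b-vision-instruct",
    "messages": [
        {"role": "user", "content": "List two landmarks visible in Paris."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "landmarks",
            "schema": {
                "type": "object",
                "properties": {
                    "landmarks": {
                        "type": "array",
                        "items": {"type": "string"},
                    }
                },
                "required": ["landmarks"],
            },
        },
    },
}

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['KILO_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
# The content field should be a JSON string conforming to the schema.
print(resp.json()["choices"][0]["message"]["content"])
```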