Gemma 3 27b Vision Test

What Do You See in this Image?

Gemma is a lightweight, family of models from Google built on Gemini technology. The Gemma 3 models are multimodal—processing text and images—and feature a 128K context window with support for over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, they excel in tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.
ollama.com

Testing Gemma3 27b

The simplest way to test Gemma 3 locally is with either Ollama or LMStudio. I like to use Open WebUI with Ollama but for this example I used LMStudio. To test the model I used the above image and a simple initial prompt then a follow up prompt.

Gemma 3 Vision Prompts

Question One

Explain everything you see how many people are in the photo? How many are Male and how many are Female? What is the approximate time of day? What season is it? Explain your reasoning?

Follow Up:

Given the information can you give a more precise location?

This is just a quick initial test but so far I am very impressed with the vision capabilities of the Gemma 3 model.

Gemma 3 27b Vision Test

What Do You See in this Image?

Testing Gemma3 27b

Gemma 3 Vision Prompts

Leave a Reply Cancel reply