Google: Gemini Pro Vision 1.0

Gemini 2.5 Pro Preview

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

Gemini 2.5 Pro Experimental

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

Gemma 3 1B

Gemma 3 1B is the smallest of the new Gemma 3 family. It handles context windows up to 32k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Note: Gemma 3 1B is not multimodal. For the smallest multimodal Gemma 3 model, see Gemma 3 4B.

Gemma 3 4B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.

Gemma 3 12B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second-largest model in the Gemma 3 family, after Gemma 3 27B.

Gemma 3 27B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open-source model and the successor to Gemma 2.

Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro, all at extremely economical token prices.
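For readers comparing models on time to first token, the following is a minimal sketch of how TTFT could be measured over any streamed response. It assumes the response is exposed as an iterable of text chunks; `time_to_first_token` and `fake_stream` are hypothetical names introduced here for illustration and are not part of any Google or OpenRouter API.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def time_to_first_token(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], str]:
    """Consume a stream of text chunks and return (TTFT in seconds, full text).

    TTFT is measured from the call to this function until the first
    non-empty chunk arrives. It is None if no non-empty chunk arrives.
    """
    start = clock()
    ttft: Optional[float] = None
    parts = []
    for chunk in stream:
        # Record the elapsed time only once, at the first non-empty chunk.
        if ttft is None and chunk:
            ttft = clock() - start
        parts.append(chunk)
    return ttft, "".join(parts)


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(delay_s)  # simulated wait before the first token
    yield "Hello"
    yield ", "
    yield "world"


if __name__ == "__main__":
    ttft, text = time_to_first_token(fake_stream())
    print(f"TTFT: {ttft:.3f}s, text: {text!r}")
```

Passing the clock as a parameter keeps the measurement testable; with a real streaming client, the chunk iterator would replace `fake_stream`.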

Gemini 2.0 Flash

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. Together, these advancements deliver more seamless and robust agentic experiences.

Gemini 2.0 Flash Thinking Experimental 01-21

Gemini 2.0 Flash Thinking Experimental (01-21) is a snapshot of Gemini 2.0 Flash Thinking Experimental.

Gemini 2.0 Flash Thinking Mode is an experimental model trained to generate the "thinking process" it goes through as part of its response. As a result, Thinking Mode shows stronger reasoning in its responses than the base Gemini 2.0 Flash model.

Gemini 2.0 Flash Thinking Experimental

Gemini 2.0 Flash Thinking Mode is an experimental model trained to generate the "thinking process" it goes through as part of its response. As a result, Thinking Mode shows stronger reasoning in its responses than the base Gemini 2.0 Flash model.

Gemini 2.0 Flash Experimental

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. Together, these advancements deliver more seamless and robust agentic experiences.

Gemini Experimental 1121

Experimental release (November 21st, 2024) of Gemini.

LearnLM 1.5 Pro Experimental

An experimental version of Gemini 1.5 Pro from Google.

Gemini Experimental 1114

The Gemini 11-14 (2024) experimental model features "quality" improvements.

Gemini 1.5 Flash 8B

Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.


Usage of Gemini is subject to Google's Gemini Terms of Use.

Gemini 1.5 Flash 8B Experimental

Gemini 1.5 Flash 8B Experimental is an experimental, 8B-parameter version of the Gemini 1.5 Flash model.

Usage of Gemini is subject to Google's Gemini Terms of Use.

#multimodal

Note: This model is currently experimental and not suitable for production use cases. It may be heavily rate-limited.

Gemini 1.5 Flash Experimental

Gemini 1.5 Flash Experimental is an experimental version of the Gemini 1.5 Flash model.

Usage of Gemini is subject to Google's Gemini Terms of Use.

#multimodal

Note: This model is experimental and not suitable for production use cases. It may be removed or redirected to another model in the future.

Gemini 1.5 Pro Experimental

Gemini 1.5 Pro Experimental is a bleeding-edge version of the Gemini 1.5 Pro model. Because it's currently experimental, it will be heavily rate-limited by Google.

Usage of Gemini is subject to Google's Gemini Terms of Use.

#multimodal

Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the Gemini models.

Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning.

See the launch announcement for more details. Usage of Gemma is subject to Google's Gemma Terms of Use.

Gemma 2 9B

Gemma 2 9B by Google is an advanced, open-source language model that sets a new standard for efficiency and performance in its size class.

Designed for a wide variety of tasks, it empowers developers and researchers to build innovative applications, while maintaining accessibility, safety, and cost-effectiveness.

See the launch announcement for more details. Usage of Gemma is subject to Google's Gemma Terms of Use.

Gemini 1.5 Flash

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots.

Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves quality comparable to Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter.

Usage of Gemini is subject to Google's Gemini Terms of Use.

#multimodal

Gemini 1.5 Pro

Google's latest multimodal model, which supports image and video[0] in text or chat prompts.

Optimized for language tasks including:

Usage of Gemini is subject to Google's Gemini Terms of Use.

Gemma 7B

Gemma by Google is an advanced, open-source language model family built on decoder-only, text-to-text technology. It offers English language capabilities across text generation tasks like question answering, summarization, and reasoning. The Gemma 7B variant is comparable in performance to leading open-source models.

Usage of Gemma is subject to Google's Gemma Terms of Use.

Gemini Pro 1.0

Google's flagship text generation model. Designed to handle natural language tasks, multiturn text and code chat, and code generation.

See the benchmarks and prompting guidelines from DeepMind.

Usage of Gemini is subject to Google's Gemini Terms of Use.

PaLM 2 Chat 32k

PaLM 2 is a language model by Google with improved multilingual, reasoning, and coding capabilities.

PaLM 2 Code Chat 32k

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.

PaLM 2 Chat

PaLM 2 is a language model by Google with improved multilingual, reasoning, and coding capabilities.

PaLM 2 Code Chat

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.

google/gemini-pro-vision

Providers for Gemini Pro Vision 1.0

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
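The routing behavior described above can be illustrated with a small sketch. This is not OpenRouter's implementation: the `Provider` fields, the list order standing in for "best", and the `send` callback are all assumptions made here for illustration. The sketch filters providers by prompt size and supported parameters, then falls back to the next eligible provider when one fails.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence


@dataclass(frozen=True)
class Provider:
    # Hypothetical provider record; field names are assumptions.
    name: str
    max_prompt_tokens: int
    supported_params: FrozenSet[str]


def eligible_providers(
    providers: Sequence[Provider],
    prompt_tokens: int,
    params: Sequence[str],
) -> List[Provider]:
    """Keep providers that fit the prompt size and support every parameter."""
    wanted = set(params)
    return [
        p
        for p in providers
        if prompt_tokens <= p.max_prompt_tokens and wanted <= p.supported_params
    ]


def route(
    providers: Sequence[Provider],
    prompt_tokens: int,
    params: Sequence[str],
    send: Callable[[Provider], str],
) -> str:
    """Try eligible providers in list order; fall back to the next on failure."""
    errors = []
    for provider in eligible_providers(providers, prompt_tokens, params):
        try:
            return send(provider)
        except Exception as exc:  # any failure triggers the fallback
            errors.append(f"{provider.name}: {exc}")
    detail = "; ".join(errors) if errors else "no eligible provider"
    raise RuntimeError(f"request could not be served ({detail})")
```

Here the input order of `providers` is treated as the preference order; a real router would presumably rank by signals such as price, latency, or recent uptime, none of which are specified in the text above.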
