Gemini 3.5 Flash

google/gemini-3.5-flash
by google · 2026-05-23

Google's efficient multimodal model with a 1M-token context window, up to 65,536 output tokens, and cost-effective pricing via OrcaRouter.

Context: 1.05M tokens
Input: text + image + video + file + audio
Output: text
Input price: $1.50 / 1M tokens
Output price: $9.00 / 1M tokens
p50 TTFT (7d): 10.00 s
p95 TTFT (7d): 10.00 s
Traffic (7d): 4.5M tokens

Model details

What is Gemini 3.5 Flash?

Gemini 3.5 Flash is a large language model developed by Google, fine-tuned for speed and efficiency. It belongs to the Gemini family and is designed to handle multimodal inputs—text, image, video, file, and audio—while delivering fast responses. The model supports a context window of 1,048,576 tokens, enabling it to process very long sequences, such as entire books, hour-long videos, or extensive code repositories. Its maximum output length of 65,536 tokens allows for lengthy generations, including full reports or extended code files. Gemini 3.5 Flash is accessed through OrcaRouter's OpenAI-compatible API, which means you can integrate it into existing applications with minimal code changes.
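The two limits above can be turned into a quick budget check before sending a long prompt. The following is a minimal sketch, assuming (as the FAQ on this page states) that input and output tokens share the same context window. The helper names `max_input_tokens` and `fits` are hypothetical, and token counts are expected to come from whatever tokenizer or usage data you already have.

```python
# Limits quoted on this page.
CONTEXT_WINDOW = 1_048_576  # total tokens in the context window
MAX_OUTPUT = 65_536         # maximum tokens per response


def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return how many input tokens remain once output space is reserved.

    Assumption: input and output tokens count against the same window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def fits(input_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt of input_tokens fits alongside the reserved output."""
    return 0 <= input_tokens <= max_input_tokens(reserved_output)
```

Reserving the full 65,536 output tokens leaves 983,040 tokens for input; reserving less output leaves correspondingly more room for the prompt.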

Who should use Gemini 3.5 Flash?

Gemini 3.5 Flash is ideal for developers and organizations that need a balance between high throughput, low latency, and cost. It is particularly suited for production environments where inference speed matters, such as real-time chatbots, content moderation pipelines, or automated customer support. The generous context window benefits users who need to analyze large datasets, long documents, or extensive conversation histories without chunking. Additionally, teams building multimodal applications—like image captioning, video summarization, or audio transcription—can leverage its native support for multiple input types. If your workload demands extremely high reasoning capability or complex mathematics, consider a more powerful, slower model instead.

What input modalities does Gemini 3.5 Flash support?

Gemini 3.5 Flash accepts five input modalities: text, image, video, file, and audio. Text inputs can be plain strings or structured messages. Images can be passed as base64-encoded data or URLs; the model can interpret visual content like charts, diagrams, or photographs. Video inputs are supported as sequences of frames or compressed video files, allowing the model to analyze motion and temporal changes. File inputs cover common formats such as PDF, DOCX, or code files; the model can extract and reason over their content. Audio inputs can be raw or compressed (e.g., MP3, WAV), enabling speech transcription and sound analysis. All modalities can be combined in a single request, making Gemini 3.5 Flash a versatile tool for multimodal tasks.
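As an illustration of combining modalities in one request, the sketch below builds a single user message that carries both text and a base64-encoded image. It uses the content-part layout of the OpenAI chat format (`text` and `image_url` parts with a data URL); that OrcaRouter accepts exactly this layout for Gemini 3.5 Flash is an assumption, and `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message combining a text prompt and an inline image.

    Assumption: the gateway accepts OpenAI-style content parts with a
    base64 data URL in the image_url field.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

The returned dictionary can be placed in the `messages` list of a chat completions request. An image hosted at a URL can be passed by putting that URL in the `url` field instead of a data URL.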

How is Gemini 3.5 Flash accessed through OrcaRouter?

OrcaRouter exposes Gemini 3.5 Flash via its OpenAI-compatible API. The base URL is https://api.orcarouter.ai/v1, and the specific model ID is "google/gemini-3.5-flash". You can call it using any OpenAI SDK or direct HTTP requests, simply by changing the base URL and model name. Authentication is handled through an API key provided by OrcaRouter. The API supports standard chat completions endpoints, streaming, and optional parameters such as temperature, top_p, and max_tokens. OrcaRouter adds zero markup to the provider rate, so you pay exactly $1.50 per 1M input tokens and $9.00 per 1M output tokens. No additional gateway fees are applied.
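For callers who prefer direct HTTP over an SDK, the request described above can be sketched with the Python standard library alone. The `/chat/completions` path and the `Bearer` authorization scheme follow the OpenAI convention and are assumptions here, not confirmed OrcaRouter specifics; `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.orcarouter.ai/v1"
MODEL_ID = "google/gemini-3.5-flash"


def build_chat_request(api_key: str, prompt: str, **params) -> urllib.request.Request:
    """Build (but do not send) a chat completions request.

    Extra keyword arguments such as temperature, top_p, or max_tokens
    are copied into the JSON body unchanged.
    """
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",  # assumed OpenAI-style path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed Bearer scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send the request:
#   with urllib.request.urlopen(build_chat_request(key, "Hello")) as resp:
#       reply = json.load(resp)
```

Building the request separately from sending it makes the headers and body easy to inspect or log before any tokens are billed.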

Code samples

import os

from openai import OpenAI

# Read the key from the environment rather than hard-coding it.
client = OpenAI(
    base_url="https://api.orcarouter.ai/v1",
    api_key=os.environ["ORCAROUTER_API_KEY"],
)

# Send a single-turn chat request to Gemini 3.5 Flash.
response = client.chat.completions.create(
    model="google/gemini-3.5-flash",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
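Since the API also supports streaming, a streamed variant of the sample may be useful for latency-sensitive interfaces. The sketch below assumes a client object that behaves like the OpenAI Python SDK client (`stream=True` returns an iterable of chunks, each exposing `choices[0].delta.content`); `stream_text` is a hypothetical helper name.

```python
from typing import Iterator


def stream_text(client, prompt: str, model: str = "google/gemini-3.5-flash") -> Iterator[str]:
    """Yield text fragments as they arrive from a streaming chat completion.

    `client` is assumed to be an OpenAI-SDK-style client configured with
    the OrcaRouter base URL and API key.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


# Usage, given a configured client:
#   for piece in stream_text(client, "Hello"):
#       print(piece, end="", flush=True)
```

Yielding fragments from a generator lets the caller decide whether to print them immediately, forward them to a browser, or join them into one string.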

Pricing

Input / 1M tokens: $1.50
Output / 1M tokens: $9.00
Cache read / 1M tokens: $0.150
Cache write / 1M tokens: $0.083
Currency: USD
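The rates above translate into a simple per-request cost estimate. The sketch below is illustrative arithmetic only: it assumes cached input tokens are billed at the cache read rate in place of the standard input rate, and it ignores cache write charges. `estimate_cost_usd` is a hypothetical helper name.

```python
# Rates from the pricing table, in USD per 1M tokens.
INPUT_USD_PER_M = 1.50
OUTPUT_USD_PER_M = 9.00
CACHE_READ_USD_PER_M = 0.150


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached input tokens are billed at the cache read rate
    instead of the standard input rate; cache writes are not modeled.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * INPUT_USD_PER_M
        + cached_input_tokens * CACHE_READ_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000
```

For example, a request with 200,000 input tokens and 10,000 output tokens comes to about $0.39 under these rates; if half of the input were served from cache, the estimate would drop to roughly $0.255.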

Performance

p50 TTFT: 10.00 s
Output speed: 10766 tok/s
p95 TTFT: 10.00 s
Error rate: 0.44%

Public benchmarks

AA Coding: 49.0 (better than 68% of models compared)
AA Intelligence: 47.0 (better than 58% of models compared)
AA Math: 51.0 (better than 27% of models compared)
GPQA Diamond: 45.0 index
MMLU-Pro: 59.0 index
τ²-Bench: 42.0 index
Source: artificialanalysis.ai

FAQ

How much does Gemini 3.5 Flash cost on OrcaRouter?
Input tokens are $1.50 per 1 million tokens; output tokens are $9.00 per 1 million tokens. OrcaRouter bills at the provider rate with zero markup. There are no additional fees.
What is the context window size of Gemini 3.5 Flash?
It supports a context window of 1,048,576 tokens (about 1 million tokens). This includes both input and output tokens combined.
What are the main strengths of Gemini 3.5 Flash?
It is optimized for low latency, high throughput, and cost efficiency. It supports multimodal inputs (text, image, video, file, audio) and a large context window, making it ideal for real-time applications and long-document processing.
How does Gemini 3.5 Flash compare to Gemini 3.5 Pro?
Flash is faster and cheaper but has lower benchmark performance on complex reasoning and mathematical tasks. Pro is more accurate but slower and more expensive. Flash is better for high-volume, latency-sensitive applications.
How is data handled when using Gemini 3.5 Flash via OrcaRouter?
OrcaRouter acts as a proxy and does not store your data. However, Google's data handling policies apply to the underlying model. OrcaRouter recommends reviewing Google's terms for data retention and privacy.
How do I call Gemini 3.5 Flash using an OpenAI-compatible API?
Use base URL https://api.orcarouter.ai/v1, model ID "google/gemini-3.5-flash", and pass an OrcaRouter API key in the Authorization header. The API supports standard chat completions and streaming.
What output length can Gemini 3.5 Flash generate?
It can generate up to 65,536 tokens per response. This is significantly larger than many models, allowing for long-form content, code, or extended reasoning.
Is there any discount for repeated or cached tokens?
Yes. The pricing table lists cache reads at $0.150 per 1M tokens and cache writes at $0.083 per 1M tokens, both well below the standard input rate of $1.50 per 1M tokens. No volume discounts are listed.

Embed this badge

Gemini 3.5 Flash · $1.50/M in · 10000ms p50 · via OrcaRouter
HTML <a href="https://www.orcarouter.ai/models/google/gemini-3.5-flash" target="_blank"> <img src="https://www.orcarouter.ai/embed/google/gemini-3.5-flash.svg" alt="Gemini 3.5 Flash on OrcaRouter" /> </a>
Markdown [![Gemini 3.5 Flash](https://www.orcarouter.ai/embed/google/gemini-3.5-flash.svg)](https://www.orcarouter.ai/models/google/gemini-3.5-flash)