Gemini 3.1 Pro Preview Custom Tools

Name: Google: Gemini 3.1 Pro Preview Custom Tools API
Brand: Google

google/gemini-3.1-pro-preview-customtools

by Google · 2026-02-25

Google Gemini 3.1 Pro Preview Custom Tools – 1M context, 95.6 τ²-Bench, multimodal via OrcaRouter.

Endpoints: /v1/chat/completions, /v1beta/models/{model}:generateContent

Context: 1.05M tokens

Input: text + audio + image + video + file

Output: text

p50 TTFT: 3.80 s

Input price: $4.00 / 1M tokens

Output price: $18.00 / 1M tokens

p50 TTFT: 3.80 s (7d)

p95 TTFT: 5.68 s (7d)

Traffic: 3.1M tokens / 7d


What is Google Gemini 3.1 Pro Preview Custom Tools?

Google Gemini 3.1 Pro Preview Custom Tools is a preview-stage large language model developed by Google. It is designed for long-form reasoning over large contexts and for integration with external tools. The model accepts text, audio, image, video, and file inputs and returns text, so it covers multimodal understanding while generating text only. Through OrcaRouter, you can call it via an OpenAI-compatible API at base URL https://api.orcarouter.ai/v1 with the model ID "google/gemini-3.1-pro-preview-customtools", which simplifies integration for teams already using the OpenAI SDK or similar clients. As a preview model, its availability and performance may be less predictable than those of stable releases.

Who is this model intended for?

This model is suited to developers, data scientists, and enterprise teams who need to process very long documents (up to roughly 1 million tokens) or combine multiple input modalities (text, audio, image, video, files) in a single reasoning step. It is particularly valuable for custom tool use, where the model must decide when and how to call external functions or APIs. Teams working on research, legal analysis, media processing, or advanced automation will find the large context and strong benchmark performance useful. Because it is a preview, it is better suited to prototyping and evaluation than to production systems that require guaranteed uptime or latency.
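The tool-use loop described above can be sketched as follows. This is a minimal illustration, not OrcaRouter's documented behaviour: it assumes the endpoint accepts the standard OpenAI `tools` parameter and returns `tool_calls` in the usual shape, and the `get_order_status` tool, its canned data, and the helper names are all hypothetical.

```python
import json
import os

MODEL_ID = "google/gemini-3.1-pro-preview-customtools"

# Hypothetical tool, described in the OpenAI "tools" schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of an order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]


def get_order_status(order_id: str) -> dict:
    """Stand-in for a real lookup; returns canned data."""
    return {"order_id": order_id, "status": "shipped"}


LOCAL_FUNCTIONS = {"get_order_status": get_order_status}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local function the model asked for and return JSON text."""
    if name not in LOCAL_FUNCTIONS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = LOCAL_FUNCTIONS[name](**json.loads(arguments_json))
    return json.dumps(result)


def ask_with_tools(question: str) -> str:
    """One round trip: the model may request a tool, we run it, the model answers."""
    from openai import OpenAI  # imported lazily so the helpers above work offline

    client = OpenAI(
        base_url="https://api.orcarouter.ai/v1",
        api_key=os.environ["ORCAROUTER_API_KEY"],
    )
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(
        model=MODEL_ID, messages=messages, tools=TOOLS
    )
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # the model answered without calling a tool

    # Echo the assistant turn, then attach one "tool" message per requested call.
    messages.append(reply)
    for call in reply.tool_calls:
        messages.append(
            {
                "role": "tool",
                "tool_call_id": call.id,
                "content": dispatch_tool_call(
                    call.function.name, call.function.arguments
                ),
            }
        )
    final = client.chat.completions.create(
        model=MODEL_ID, messages=messages, tools=TOOLS
    )
    return final.choices[0].message.content
```

Calling `ask_with_tools("Where is order A1?")` needs a valid `ORCAROUTER_API_KEY` in the environment. `dispatch_tool_call` runs offline, so the local tool wiring can be checked before any tokens are spent.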

Key features at a glance

The model offers a context window of 1,048,576 tokens and a maximum output of 65,536 tokens. Input modalities cover text, audio, image, video, and file uploads. The headline benchmark score is 95.6 on τ²-Bench, a test of tool‑use reasoning. Pricing is $4.00 per 1M input tokens and $18.00 per 1M output tokens, with zero markup when accessed through OrcaRouter. The API is OpenAI‑compatible, and the model ID is "google/gemini-3.1-pro-preview-customtools". As a preview, it reflects the latest capabilities but may be subject to change.

Code samples

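import os

from openai import OpenAI

# Read the key from the environment; a literal "$ORCAROUTER_API_KEY" string
# would be sent as-is and rejected.
client = OpenAI(
    base_url="https://api.orcarouter.ai/v1",
    api_key=os.environ["ORCAROUTER_API_KEY"],
)

# Send a single-turn chat request and print the model's text reply.
response = client.chat.completions.create(
    model="google/gemini-3.1-pro-preview-customtools",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)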

Pricing

Input / 1M tokens	$4.00
Output / 1M tokens	$18.00
Cache read / 1M	$0.400
Currency	USD
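The rates above translate into per-request cost as in this rough sketch. It assumes cached input tokens are billed at the cache read rate instead of the input rate, which this page does not confirm. The function name and the example token counts are illustrative.

```python
# Rates from the pricing table, in USD per 1M tokens.
INPUT_USD_PER_M = 4.00
OUTPUT_USD_PER_M = 18.00
CACHE_READ_USD_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from cache
    and is charged at the cache read rate rather than the input rate.
    """
    uncached = input_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    total = (
        uncached * INPUT_USD_PER_M
        + cached_tokens * CACHE_READ_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000


# A 200,000-token prompt with a 2,000-token answer, without and with caching.
print(f"{estimate_cost_usd(200_000, 2_000):.3f}")                         # 0.836
print(f"{estimate_cost_usd(200_000, 2_000, cached_tokens=150_000):.3f}")  # 0.296
```

Under that assumption, long prompts dominate the bill, so re-sending the same large document is where a cache hit saves the most.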

Performance

Last 7 days

p50 TTFT	3.80 s
p95 TTFT	5.68 s
Output speed	215 tok/s
Error rate

Public benchmarks

Last evaluated 2026-02-19

AA Coding	55.5	Better than 75% of models compared
AA Intelligence	57.2	Better than 80% of models compared
GPQA Diamond	94.1
Humanity's Last Exam	44.7
IFBench	77.1
Long-Context Recall	72.7
SciCode	58.9
TerminalBench Hard	53.8
τ²-Bench	95.6

Source: artificialanalysis.ai

More from Google

Gemini 3.1 Pro Preview (Flagship)
google/gemini-3.1-pro-preview
$2.00 in · $12.00 out / 1M
1.05M ctx · quality 10/10

Gemini 3 Flash Preview (Cheapest)
google/gemini-3-flash-preview
$0.50 in · $3.00 out / 1M
1.05M ctx · quality 9/10

Gemini 3.5 Flash
google/gemini-3.5-flash
$1.50 in · $9.00 out / 1M
1.05M ctx · quality 9/10

FAQ

What is the cost to use Google Gemini 3.1 Pro Preview Custom Tools?

Pricing is $4.00 per 1 million input tokens and $18.00 per 1 million output tokens. These are billed at the provider rate with zero markup when accessed through OrcaRouter.

What is the context window size?

The context window is 1,048,576 tokens (approximately 1 million tokens). Maximum output is 65,536 tokens per request.

What are the model's main strengths?

It excels at tasks requiring tool use (scored 95.6 on τ²-Bench), has a very large context window, and accepts multimodal input (text, audio, image, video, file).

How does it compare to Gemini 1.5 Pro?

This preview model has a higher τ²-Bench score and is optimised for custom tool use. It is also more expensive than Gemini 1.5 Pro, so Gemini 1.5 Pro may be the better fit if you do not need the latest tool-use performance.

How can I call this model via an OpenAI‑compatible API?

Set base URL to https://api.orcarouter.ai/v1, model ID to google/gemini-3.1-pro-preview-customtools, and use your OrcaRouter API key. The API follows OpenAI's chat completions format.

What input modalities does it support?

It supports text, audio, image, video, and file inputs. These can be combined in a single request for multimodal reasoning.
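A combined text-and-image request can be sketched as below, using the OpenAI content-part message format with a base64 data URL. This page does not document how OrcaRouter expects images, audio, video, or files to be encoded, so treat the `image_url` part as an assumption and check the documentation for the other modalities. The helper names are illustrative.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as an OpenAI-style image content part (data URL)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def build_multimodal_message(prompt: str, images: list[bytes]) -> dict:
    """Build one user message holding a text part followed by image parts."""
    parts = [{"type": "text", "text": prompt}]
    parts.extend(image_part(img) for img in images)
    return {"role": "user", "content": parts}
```

The returned dictionary can be passed as an element of `messages` in `client.chat.completions.create(...)`, in place of the plain-string `content` shown in the code sample.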

How does data handling work?

This page does not specify data retention or privacy policies. Consult OrcaRouter's terms of service and Google's data usage policies for details on how your data is handled.

Is there any caching or prompt caching available?

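The pricing table lists a cache read rate of $0.40 per 1M tokens, which indicates that cached input is billed at a lower rate than the $4.00 input rate. This page does not describe how caching is enabled or when it applies, so check OrcaRouter's documentation before relying on it to reduce costs for repeated inputs.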

What is the expected latency?

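Over the last 7 days, OrcaRouter measured a p50 time to first token (TTFT) of 3.80 s, a p95 TTFT of 5.68 s, and an output speed of 215 tok/s. Latency generally grows with prompt size, so very long contexts may respond more slowly than these figures suggest. Test with your own workloads to determine performance.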
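As a rough back-of-the-envelope estimate, total response time can be approximated as TTFT plus output tokens divided by output speed, using the 7-day figures from the Performance section. This simple model is an assumption: it ignores network overhead and any dependence on prompt length. The function name is illustrative.

```python
# 7-day figures from the Performance section.
P50_TTFT_S = 3.80
P95_TTFT_S = 5.68
OUTPUT_TOK_PER_S = 215.0


def estimate_response_seconds(output_tokens: int, ttft_s: float = P50_TTFT_S) -> float:
    """Approximate wall-clock time: time to first token plus streaming time."""
    return ttft_s + output_tokens / OUTPUT_TOK_PER_S


# A 1,000-token answer at the median and at the 95th-percentile TTFT.
print(f"{estimate_response_seconds(1_000):.2f} s")              # 8.45 s
print(f"{estimate_response_seconds(1_000, P95_TTFT_S):.2f} s")  # 10.33 s
```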

Can I use this model for production?

It is a preview model, so it may have lower reliability or uptime guarantees compared to stable releases. Use it for prototyping and evaluation; consider stable models for production systems.

Embed this badge

Paste into your blog post

Google: Gemini 3.1 Pro Preview Custom Tools • $4.00/M in • 3800ms p50 • via OrcaRouter

HTML:
<a href="https://www.orcarouter.ai/models/google/gemini-3.1-pro-preview-customtools" target="_blank"> <img src="https://www.orcarouter.ai/embed/google/gemini-3.1-pro-preview-customtools.svg" alt="Google: Gemini 3.1 Pro Preview Custom Tools on OrcaRouter" /> </a>

Markdown:
[![Google: Gemini 3.1 Pro Preview Custom Tools](https://www.orcarouter.ai/embed/google/gemini-3.1-pro-preview-customtools.svg)](https://www.orcarouter.ai/models/google/gemini-3.1-pro-preview-customtools)

