Successor to GLM-4.5 with the context window extended to 200K, in-thinking tool calls, and stronger…
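import os

from openai import OpenAI

# Read the API key from the environment; a literal "$ORCAROUTER_API_KEY" string is not expanded by Python.
client = OpenAI(
    base_url="https://orcarouter.ai/v1",
    api_key=os.environ["ORCAROUTER_API_KEY"],
)

# Send a single-turn chat request to GLM-4.6.
response = client.chat.completions.create(
    model="z-ai/glm-4.6",
    messages=[{"role": "user", "content": "Hello"}],
)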
print(response.choices[0].message.content)

| Item | Price |
| --- | --- |
| Input / 1M tokens | $0.600 |
| Output / 1M tokens | $2.20 |
| Cache read / 1M tokens | $0.110 |
| Currency | USD |
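
The rates above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost_usd` is hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate. The listing does not state that billing rule, so check it against an actual invoice.

```python
# Rates in USD per 1M tokens, copied from the pricing table.
INPUT_PER_M = 0.600
OUTPUT_PER_M = 2.20
CACHE_READ_PER_M = 0.110


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the listing): cached_tokens is a subset of
    input_tokens and is charged at the cache-read rate instead of the input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Example: 10,000 input tokens (4,000 of them served from cache), 2,000 output tokens.
print(f"${estimate_cost_usd(10_000, 2_000, cached_tokens=4_000):.6f}")
```

In practice the token counts would come from the `usage` field of the API response rather than being typed in by hand.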