Compact MoE sibling of GLM-4.5: 106B total / 12B active. Same hybrid-reasoning and tool-calling stac…
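import os

from openai import OpenAI

# Point the OpenAI SDK at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://orcarouter.ai/v1",
    # Read the key from the environment; the literal string
    # "$ORCAROUTER_API_KEY" is not expanded by Python.
    api_key=os.environ["ORCAROUTER_API_KEY"],
)
response = client.chat.completions.create(
    model="z-ai/glm-4.5-air",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

| Input / 1M tokens | $0.200 |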
| Output / 1M tokens | $1.10 |
| Cache read / 1M | $0.030 |
| Currency | USD |
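
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, and the billing model is an assumption: cached prompt tokens are charged at the cache-read rate instead of the input rate, and everything else is charged at the listed input and output rates. Actual billing may differ.

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from the pricing table above.
INPUT_PER_M = Decimal("0.200")
OUTPUT_PER_M = Decimal("1.10")
CACHE_READ_PER_M = Decimal("0.030")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the pricing table): cached_tokens is the
    part of input_tokens served from cache, billed at the cache-read rate
    rather than the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION


if __name__ == "__main__":
    # 10k prompt tokens (4k of them cached) and 2k completion tokens.
    print(estimate_cost_usd(10_000, 2_000, cached_tokens=4_000))
```

If the endpoint returns OpenAI-style usage data, the token counts could be taken from `response.usage.prompt_tokens` and `response.usage.completion_tokens`. `Decimal` is used here so that small per-request amounts do not pick up floating-point rounding noise.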