
Grok 3

LLM Models

xAI's February 2025 model with a 1M-token context window, beating GPT-4o and Claude 3.5 Sonnet on AIME and GPQA and scoring a 1402 Arena Elo.

Grok 3, released by xAI in February 2025, features a 1 million token context window and was trained on the Colossus supercluster using 10x the compute of the previous state of the art. That computational investment translated into significant performance gains across multiple benchmarks.

The model beats GPT-4o, Claude 3.5 Sonnet, and DeepSeek V3 on challenging benchmarks including AIME (mathematics), GPQA (graduate-level science), and LiveCodeBench (coding). It achieves a Chatbot Arena Elo rating of 1402, placing it competitively among frontier models based on human preference evaluations.
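An Arena Elo rating is only meaningful relative to other models' ratings: the gap between two scores maps to an expected head-to-head win rate under the standard Elo formula. The sketch below shows that mapping; the 1402 figure is from this entry, while the 1352 comparison rating is a hypothetical example, not any specific model's score.

```python
def elo_win_prob(r_a: float, r_b: float) -> float:
    """Expected win rate of model A over model B on an Elo scale
    (a 400-point gap corresponds to roughly 10:1 odds)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# A model rated 1402 vs. a hypothetical model rated 1352:
# a 50-point Elo gap implies about a 57% expected win rate.
p = elo_win_prob(1402, 1352)
print(f"{p:.3f}")  # ~0.571
```

The nonlinearity is the point: small Elo gaps between frontier models correspond to near-coin-flip human preferences, while a 100+ point gap indicates a clear majority preference.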

Grok 3's development demonstrated the effectiveness of scaling compute, and of the Colossus supercluster in particular, for training frontier models. The 1M context window combined with strong reasoning and coding performance makes it suitable for applications that require both extensive context and high-quality outputs, and its competitive showing against established models from OpenAI and Anthropic highlighted xAI's rapid progress in the frontier model space.

Last updated: February 22, 2026