Elon Musk's xAI Announces Grok-1.5 With 128K Context Length


To showcase Grok-1.5's problem-solving capability, xAI has benchmarked the model on popular tests. On MMLU, Grok-1.5 scored 81.3% (5-shot), higher than Mistral Large and Claude 3 Sonnet. On the MATH benchmark, it scored 50.6% (4-shot), again beating Claude 3 Sonnet. On GSM8K, it scored an impressive 90%, albeit with 8-shot prompting. Finally, on the HumanEval coding benchmark, Grok-1.5 scored 74.1% zero-shot.
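To make those shot counts concrete, here is a minimal Python sketch of what k-shot prompting means: the model is shown k worked examples before the actual question. The example problems and the `build_k_shot_prompt` helper are hypothetical, purely for illustration, and are not drawn from the actual GSM8K set or xAI's evaluation harness.

```python
# Minimal sketch of k-shot prompting: the model sees k worked examples
# before the real question. The examples below are hypothetical
# stand-ins, not actual GSM8K items.

FEW_SHOT_EXAMPLES = [
    ("Tom has 3 apples and buys 2 more. How many apples does he have?",
     "Tom starts with 3 apples and buys 2 more, so 3 + 2 = 5. The answer is 5."),
    ("A book costs $12 and a pen costs $3. What do both cost together?",
     "The book is $12 and the pen is $3, so 12 + 3 = 15. The answer is 15."),
]

def build_k_shot_prompt(question: str, k: int) -> str:
    """Prepend k solved examples to the test question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in FEW_SHOT_EXAMPLES[:k]]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# k=0 yields a zero-shot prompt; k=2 prepends both worked examples.
print(build_k_shot_prompt("If 4 pencils cost $2, how much do 10 cost?", k=2))
```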
xAI has also increased the context length from 8K tokens to 128K tokens in the Grok-1.5 model. To evaluate its retrieval capability, the company ran the Needle in a Haystack (NIAH) test, and the model achieved perfect results.
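For context, a NIAH test hides a single fact (the "needle") somewhere inside a long stretch of filler text (the "haystack") and checks whether the model can retrieve it at various depths. The sketch below shows the general idea only; the filler text, the needle, and the `query_model` function are hypothetical placeholders, not xAI's actual test harness.

```python
# Minimal sketch of a Needle-in-a-Haystack (NIAH) test: hide one fact
# at a chosen relative depth inside long filler text, then ask the
# model to retrieve it. All values here are illustrative.

NEEDLE = "The magic number for the evaluation is 48721."
FILLER = "The quick brown fox jumps over the lazy dog."

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call."""
    raise NotImplementedError("plug in an actual model client here")

def build_haystack(total_sentences: int, needle_depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    sentences = [FILLER] * total_sentences
    position = int(needle_depth * total_sentences)
    sentences.insert(position, NEEDLE)
    return " ".join(sentences)

def run_niah_trial(needle_depth: float) -> bool:
    """Return True if the model retrieves the hidden fact."""
    haystack = build_haystack(total_sentences=5000, needle_depth=needle_depth)
    prompt = f"{haystack}\n\nWhat is the magic number mentioned above?"
    return "48721" in query_model(prompt)
```

A full evaluation would sweep `needle_depth` across the context window (for example, 0.0 to 1.0 in steps of 0.1) at several context lengths and record the retrieval rate at each point.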
As this is an incremental model, xAI has not disclosed the parameter count. However, for an overview, Grok-1 has 314 billion parameters, making it one of the largest open-source models out there. It's also based on the Mixture-of-Experts (MoE) architecture. xAI released the model weights and architecture under the Apache 2.0 license, which is great.
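To illustrate what an MoE layer does, here is a minimal NumPy sketch of top-k expert routing: a gating network scores the experts for each token, and only the top-scoring few process it. The dimensions, expert count, and gating setup below are illustrative assumptions and do not reflect Grok-1's actual configuration.

```python
# Minimal sketch of Mixture-of-Experts (MoE) top-k routing. Sizes and
# the gating scheme are illustrative, not Grok-1's real configuration.

import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

# Each "expert" is a simple feed-forward weight matrix.
experts = [rng.normal(size=(d_model, d_model)) * 0.02 for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts."""
    logits = x @ gate_w                    # one score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_layer(token).shape)  # (64,)
```

The appeal of this design is that only the selected experts run per token, so a model can hold far more parameters than it activates on any single forward pass.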
Recently, Anthropic launched its family of Claude 3 models, which have shown great promise; in many cases, the largest Opus model has already outranked OpenAI's GPT-4. OpenAI is said to be working on an intermediate GPT-4.5 Turbo model, and GPT-5 is also on the cards, possibly launching in the summer of 2024. Google's Gemini 1.5 Pro model has also demonstrated impressive multimodal capabilities over a long context window. Going by its benchmark numbers, xAI's Grok-1.5 sits somewhere in the middle of these powerful proprietary models. We will have to wait and see how well it does on reasoning tests. Anyway, what do you think about the Grok-1.5 model? Let us know in the comments below.