Meta’s New Llama 3.3 70B Model Matches 405B’s Performance at a Lower Cost

According to Meta, the new Llama 3.3 70B model nearly matches the performance of the larger Llama 3.1 405B model. That’s a notable improvement, since the model is much smaller and can be served at a much lower cost. However, it doesn’t outright beat the larger 405B model in every benchmark.
The Llama 3.3 70B model scores 86.0 and 88.4 in MMLU and HumanEval benchmarks, respectively. The 405B model does slightly better and achieves 88.6 and 89.0 in the same set of tests. That said, the Llama 3.3 70B model scores better in MATH and GPQA Diamond.
In short, Meta is saying that if you have text-only applications, you should use the new Llama 3.3 70B model rather than the 405B model. Due to its smaller size, it costs just $0.1 / $0.4 for 1 million input/output tokens. The larger 405B model costs $1.0 / $1.8 for 1 million input/output tokens.
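To put that price gap in concrete terms, here is a minimal Python sketch that estimates the bill for both models using the per-million-token prices quoted above. The model keys, the `estimate_cost` helper, and the sample workload (2 million input tokens, 500,000 output tokens) are illustrative assumptions and not part of any Meta or provider API.

```python
# Prices in USD per 1 million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {
    "llama-3.3-70b": {"input": 0.1, "output": 0.4},
    "llama-3.1-405b": {"input": 1.0, "output": 1.8},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD for a given token volume."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload, chosen only for illustration.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

for model in PRICES:
    print(f"{model}: ${estimate_cost(model, **workload):.2f}")
```

For this assumed workload, the 70B model comes to about $0.40 and the 405B model to about $2.90, so the smaller model is roughly seven times cheaper. The exact ratio depends on your mix of input and output tokens.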
As for language support, the Llama 3.3 70B model supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Its knowledge cutoff date is December 2023 and the context length is up to 128K tokens. You can chat with the new Llama 3.3 70B model on HuggingChat for free.