
Llama 3.3 70B Instruct: Pricing, Context Window & Benchmarks

70B parameters · by Meta

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

Input Price: $0.58/1M tokens
Output Price: $0.71/1M tokens
Intelligence: 14.2
Coding: 10.7

What you can do with Llama 3.3 70B Instruct

Everyday Q&A and clear explanations

Writing help (emails, posts, summaries)

Idea generation and brainstorming

Learning support with step-by-step guidance

Composite Indices

Intelligence, Coding, Math

Standard Benchmarks

Academic and industry benchmarks

Benchmark Highlights

6 tests

| Benchmark | Score |
| --- | --- |
| GPQA | 49.8% |
| MMLU Pro | 71.3% |
| LiveCodeBench | 28.8% |
| Math 500 | 77.3% |
| AIME 2025 | 7.7% |
| HLE | 4.0% |
| Metric | Value |
| --- | --- |
| Provider | Meta |
| Context Window | 131,072 tokens |
| Input Price | $0.58/1M tokens |
| Output Price | $0.71/1M tokens |
| Release Date | Dec 6, 2024 |
| Modalities | text |
| Capabilities | N/A |
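To make the 131,072-token context window figure concrete, here is a minimal sketch of how you might budget a request against it. The helper function and token counts are illustrative assumptions, not part of any official API; real token counts depend on the model's tokenizer.

```python
CONTEXT_WINDOW = 131_072  # Llama 3.3 70B Instruct context window, in tokens


def remaining_output_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the model's response
    after the prompt is accounted for (never negative)."""
    return max(context_window - prompt_tokens, 0)


# A hypothetical 1,000-token prompt leaves 130,072 tokens for output.
print(remaining_output_budget(1_000))
```

In practice you would measure `prompt_tokens` with the model's actual tokenizer rather than estimating it.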

Compare Llama 3.3 70B Instruct to other models

See how it stacks up on price, quality, and overall performance.

Frequently asked questions

What is Llama 3.3 70B Instruct good for?

Use Llama 3.3 70B Instruct for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does Llama 3.3 70B Instruct cost?

Pricing is usage-based: $0.58 per million input tokens and $0.71 per million output tokens.
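As a rough illustration of those rates, the cost of a single request can be estimated like this. The function name and the example token counts are assumptions for illustration only; the per-million-token rates are the ones listed above.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: float = 0.58,   # USD per 1M input tokens
    output_rate: float = 0.71,  # USD per 1M output tokens
) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens
# (5800 + 1420) / 1e6 = $0.00722
print(round(request_cost(10_000, 2_000), 5))
```

So a fairly large request still costs well under a cent at these rates.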

Can I try Llama 3.3 70B Instruct for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does Llama 3.3 70B Instruct support images or audio?

No. Llama 3.3 70B Instruct is a text-only model (text in/text out) and does not process images or audio.

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.