DeepSeek AI Models
Chinese AI lab producing high-performance open-weight models. Strong in coding and mathematical reasoning.
V4 Pro
deepseek
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window.
V4 Flash
deepseek
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window.
V4 Pro (Reasoning, High Effort)
DeepSeek
DeepSeek V4 Pro with reasoning enabled at high effort.
V4 Flash (Reasoning, High Effort)
DeepSeek
DeepSeek V4 Flash with reasoning enabled at high effort.
V4 Pro (Non-reasoning)
DeepSeek
DeepSeek V4 Pro with reasoning disabled.
V4 Flash (Non-reasoning)
DeepSeek
DeepSeek V4 Flash with reasoning disabled.
V3.2
deepseek
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.
V3.2 Exp
deepseek
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures.
V3.1 Terminus
deepseek
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing user-reported issues, including language consistency and agent capabilities. It also further optimizes the model's performance in coding and search agents.
V3.1
deepseek
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates.
R1 0528
deepseek
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens.
V3 0324
deepseek
DeepSeek V3, a 685B-parameter mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team.
R1 Distill Qwen 32B
deepseek
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1).
R1 Distill Llama 70B
deepseek
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1).
R1
deepseek
DeepSeek R1 delivers performance on par with [OpenAI o1](/openai/o1), but is open-sourced and has fully open reasoning tokens.
DeepSeek V3
deepseek-ai
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions.