DeepSeek R1 VRAM Calculator
Calculate VRAM for DeepSeek R1 (671B MoE) and its distilled variants (1.5B–70B). The full R1 requires massive multi-GPU setups; distilled versions run on consumer hardware.
DeepSeek R1: Full Model vs Distilled Variants
DeepSeek R1 is a reasoning-first LLM from DeepSeek AI, trained using reinforcement learning to excel at math, code, and logical reasoning. The full 671B MoE model rivals GPT-4o, but the distilled variants are what most engineers will actually run.
VRAM by Model Size at INT4
| Model | Params | VRAM (INT4) | Minimum GPU |
|---|---|---|---|
| R1 full | 671B MoE | ~335 GB | 8× A100 80GB |
| R1-Distill-70B | 70B dense | ~38 GB | A100 40GB (tight) |
| R1-Distill-32B | 32B dense | ~18 GB | RTX 4090 24GB (tight) |
| R1-Distill-14B | 14B dense | ~8 GB | RTX 3070 8GB (tight) |
| R1-Distill-7B | 7B dense | ~4.5 GB | Any 6GB+ GPU |
| R1-Distill-1.5B | 1.5B dense | ~1 GB | CPU-only feasible |
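As a rough sanity check on the table, the sketch below compares each row with a weights-only estimate. It assumes INT4 stores exactly 4 bits per weight and that 1 GB is 10^9 bytes; real quantization formats also store scales and other metadata, and the runtime needs room for the KV cache, so treat the difference as an indication of overhead rather than a measurement. The function name `weights_gb` is illustrative.

```python
# Weights-only estimate: parameters x bits per weight / 8 bits per byte.
# Parameter counts (in billions) and table figures (in GB) are copied
# from the table above.
TABLE = {
    "R1 full": (671, 335),
    "R1-Distill-70B": (70, 38),
    "R1-Distill-32B": (32, 18),
    "R1-Distill-14B": (14, 8),
    "R1-Distill-7B": (7, 4.5),
    "R1-Distill-1.5B": (1.5, 1),
}


def weights_gb(params_billion: float, bits: int = 4) -> float:
    """Memory for the weights alone, in GB (assumes 1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


for name, (params_b, table_gb) in TABLE.items():
    w = weights_gb(params_b)
    # A positive difference is what the table leaves for quantization
    # metadata and runtime overhead; the full-R1 row is simply rounded.
    print(f"{name}: weights {w:.2f} GB, table {table_gb} GB, "
          f"difference {table_gb - w:+.2f} GB")
```

The distilled rows sit 0.25 to 3 GB above the weights-only figure, which is why a card whose capacity equals the table value should be read as a tight fit.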
Why the Distilled Models Are Remarkable
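The distilled variants are smaller dense models (built on Qwen and Llama bases) fine-tuned on reasoning traces generated by R1, so they inherit its chain-of-thought reasoning style. In DeepSeek's reported results, R1-Distill-Qwen-7B outperforms GPT-4o on math reasoning benchmarks such as AIME 2024 and MATH-500, while fitting on a consumer RTX 3070.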
MoE Memory Note
The full R1 671B is a Mixture of Experts (MoE) model: only ~37B parameters are active per token, but all 671B parameters must be resident in VRAM, because the router can select different experts for each token. INT4 cuts the weight footprint from ~1.3 TB (FP16) to ~335 GB, which still requires serious multi-GPU hardware.
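The arithmetic behind those figures can be sketched as follows. This is a weights-only calculation under the assumption of exactly 16 or 4 bits per parameter and 1 GB = 10^9 bytes; the helper name `weights_gb` is illustrative.

```python
def weights_gb(params_billion: float, bits: int) -> float:
    """Memory for the weights alone, in GB (assumes 1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


TOTAL_B = 671   # all experts, must be resident in VRAM
ACTIVE_B = 37   # parameters actually used for any single token

fp16_total = weights_gb(TOTAL_B, 16)   # 1342.0 GB, i.e. ~1.3 TB
int4_total = weights_gb(TOTAL_B, 4)    # 335.5 GB
int4_active = weights_gb(ACTIVE_B, 4)  # 18.5 GB touched per token

print(f"FP16 total:  {fp16_total:.1f} GB")
print(f"INT4 total:  {int4_total:.1f} GB")
print(f"INT4 active: {int4_active:.1f} GB per token")
```

The gap between the last two numbers is the MoE trade-off: compute per token scales with the ~37B active parameters, while memory capacity scales with all 671B.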
Frequently Asked Questions
How much VRAM does DeepSeek R1 671B need?
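The full DeepSeek R1 is a 671B Mixture of Experts model. At INT4, it needs ~335 GB of VRAM for the weights, which calls for 8× A100 80GB (640 GB total). 5× H100 80GB (400 GB total) covers the weights on paper but leaves little room for the KV cache and activations. In practice, most users run the distilled variants (7B–70B), which offer strong reasoning on consumer hardware.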
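A minimal sketch of how a GPU count can be derived from the ~335 GB figure. The 20% headroom for KV cache and activations is an illustrative assumption, not a measured value, and rounding up to a power of two reflects the common practice of choosing tensor-parallel sizes that divide the model evenly. The function `gpus_needed` and its parameters are hypothetical names.

```python
import math


def gpus_needed(model_gb: float, gpu_gb: float = 80.0,
                headroom: float = 0.0, power_of_two: bool = False) -> int:
    """Smallest GPU count whose combined VRAM holds model_gb plus headroom.

    headroom is a fraction of model_gb reserved for KV cache and
    activations (an assumption to tune per workload).
    """
    required = model_gb * (1 + headroom)
    n = math.ceil(required / gpu_gb)
    if power_of_two:
        # Round up to the next power of two (1, 2, 4, 8, ...).
        n = 1 << (n - 1).bit_length()
    return n


print(gpus_needed(335))                                   # weights only
print(gpus_needed(335, headroom=0.2))                     # with headroom
print(gpus_needed(335, headroom=0.2, power_of_two=True))  # TP-friendly
```

With no headroom the count is 5 cards; adding headroom and rounding to a power of two gives 8, which is one way to read the two configurations quoted above.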
What are the DeepSeek R1 distilled models?
DeepSeek released smaller models distilled from R1: R1-Distill-Qwen-1.5B, 7B, 14B, and 32B, plus R1-Distill-Llama-8B and 70B. The 7B distill runs at GGUF Q4 on any 8GB GPU. The 70B distill runs at INT4 on an A100 40GB (a tight fit at ~38 GB), with reasoning quality that approaches the full 671B model.
Is DeepSeek R1 better than GPT-4 for reasoning?
In DeepSeek's published results, R1 matches or exceeds GPT-4o on the AIME 2024, Codeforces, and MATH benchmarks, at a fraction of the reported training cost. For local deployment of open-weight models, R1-Distill-70B at INT4 is among the strongest reasoning options that run for under roughly $2/hr in cloud GPU cost.
How do I run DeepSeek R1 locally?
Use Ollama: `ollama run deepseek-r1:7b` (the 7B distill, ~4.5GB VRAM) or `ollama run deepseek-r1:70b` (the 70B distill at Q4, roughly 38–40GB VRAM). The full 671B model needs a multi-GPU cluster; vLLM with tensor parallelism is one way to serve it.
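As a hedged illustration, the helper below picks the largest distilled variant that fits a given amount of free VRAM and builds the matching `ollama run` command. Only the `7b` and `70b` tags appear in the text above; the other tags are assumed to follow the same `deepseek-r1:<size>` pattern and should be checked against the Ollama model library. The VRAM figures are this page's approximate INT4 numbers, and `reserve_gb` is a hypothetical safety margin for context.

```python
# (tag, approximate VRAM in GB), ordered from smallest to largest.
# Tags other than 7b and 70b are assumptions based on the naming pattern.
VARIANTS = [
    ("deepseek-r1:1.5b", 1.0),
    ("deepseek-r1:7b", 4.5),
    ("deepseek-r1:14b", 8.0),
    ("deepseek-r1:32b", 18.0),
    ("deepseek-r1:70b", 40.0),
]


def pick_ollama_command(free_vram_gb: float, reserve_gb: float = 1.0):
    """Return the argv list for the largest variant that fits, or None."""
    budget = free_vram_gb - reserve_gb
    fitting = [tag for tag, need_gb in VARIANTS if need_gb <= budget]
    if not fitting:
        return None
    return ["ollama", "run", fitting[-1]]


print(pick_ollama_command(8))    # an 8GB card
print(pick_ollama_command(24))   # a 24GB card
print(pick_ollama_command(48))   # a 48GB card
```

The returned list can be passed to `subprocess.run` on a machine where Ollama is installed; the helper itself only builds the command and does not start anything.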