DeepSeek-OCR2 - CrispEmbed GGUF

GGUF conversion of deepseek-ai/DeepSeek-OCR-2 for use with CrispEmbed.

Architecture

SAM-ViT-B (12L, 768d) → Qwen2 encoder (24L, 896d, bidirectional) → Linear projector (896→1280) → DeepSeek-V2 MoE decoder (12L, 1280d, 64 experts top-6 + 2 shared, layer 0 dense) → lm_head
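
To illustrate what "64 experts top-6 + 2 shared" means for a single token, here is a minimal pure-Python sketch of top-k expert routing. Only the three counts come from the architecture line. The softmax gate, the renormalisation of the kept weights, and the toy logits are assumptions made for illustration; they are not taken from the CrispEmbed or DeepSeek source.

```python
import math

NUM_ROUTED = 64   # routed experts per MoE layer (from the architecture line)
TOP_K = 6         # routed experts selected per token
NUM_SHARED = 2    # shared experts that run for every token


def route(router_logits):
    """Return [(expert_index, weight)] for the TOP_K highest-scoring experts."""
    assert len(router_logits) == NUM_ROUTED
    # Softmax over all routed experts, shifted by the max for numerical stability.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the TOP_K most probable experts.
    chosen = sorted(range(NUM_ROUTED), key=lambda i: probs[i], reverse=True)[:TOP_K]
    # Assumption: renormalise the kept weights so they sum to 1.
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]


logits = [0.1 * i for i in range(NUM_ROUTED)]  # toy scores, not real model output
selected = route(logits)
active_experts = len(selected) + NUM_SHARED    # experts evaluated for this token
```

Under these assumptions each token in an MoE layer evaluates 8 expert MLPs (6 routed plus 2 shared) out of 66. Gathering only the selected expert weights is the pattern that the ggml_mul_mat_id operation listed under Performance features is designed for.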

Models

File                       Quant   Size      Description
deepseek-ocr2-f16.gguf     F16     6.4 GB    Full precision
deepseek-ocr2-q8_0.gguf    Q8_0    ~3.4 GB   Best quality/size balance
deepseek-ocr2-q4_k.gguf    Q4_K    ~2.0 GB   Smallest, good quality
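
The listed sizes can be sanity-checked from the F16 file size and the bits-per-weight of each ggml block format. The sketch below assumes every tensor is stored in the named quant type, which is a simplification; the function name and dictionary are illustrative only.

```python
F16_SIZE_GB = 6.4  # size of the F16 file, from the table above

# Bits per weight, derived from the ggml block layouts:
#   Q8_0: 32 int8 values + one fp16 scale = 34 bytes per 32 weights
#   Q4_K: 144 bytes per 256-weight super-block
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,    # 8.5
    "Q4_K": 144 * 8 / 256,  # 4.5
}


def estimate_size_gb(quant):
    """Scale the F16 file size by the ratio of bits per weight."""
    return F16_SIZE_GB * BITS_PER_WEIGHT[quant] / 16.0


estimates = {q: round(estimate_size_gb(q), 1) for q in BITS_PER_WEIGHT}
```

The Q8_0 estimate (3.4 GB) matches the table. The Q4_K estimate (1.8 GB) is slightly below the listed ~2.0 GB. One plausible explanation, not confirmed by the source, is that some tensors are kept at higher precision in the Q4_K file.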

Performance features

  • Per-row embedding dequant (saves ~655 MB peak RSS vs full table expansion)
  • MoE decoder on Metal via ggml_mul_mat_id
  • SAM patch-embed + neck on Metal via ggml_conv_2d
  • Qwen2 encoder on Metal graph
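
The idea behind per-row embedding dequantization can be sketched as follows: instead of expanding the whole quantized embedding table to floats up front, only the rows for the token ids actually in use are dequantized. The quantization scheme here (one scale per row, int8 values) and all names are hypothetical simplifications; this is neither the real GGUF block layout nor CrispEmbed's code.

```python
def quantize_row(values):
    """Toy int8 quantisation with one scale per row (not the real GGUF layout)."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid a zero scale
    return scale, [round(v / scale) for v in values]


def dequantize_row(scale, quants):
    """Expand one quantized row back to floats."""
    return [scale * q for q in quants]


def embed(table, token_ids):
    """Dequantise only the rows the input actually uses."""
    return [dequantize_row(*table[t]) for t in token_ids]


# Hypothetical 4-token vocabulary with 3-dimensional embeddings.
rows = [
    [0.0, 0.5, -1.0],
    [1.0, 1.0, 1.0],
    [0.25, -0.25, 0.0],
    [2.0, 0.0, -2.0],
]
table = [quantize_row(r) for r in rows]  # stays quantized in memory
vectors = embed(table, [3, 1])           # floats for two rows only
```

With this approach, peak float memory scales with the number of tokens looked up rather than with the vocabulary size, which is where a saving like the one quoted above would come from.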

Converted with models/convert-deepseek-ocr2-to-gguf.py from CrispEmbed.

GGUF metadata

Model size: 3B params
Architecture: deepseek_ocr2