DeepSeek-OCR2 - CrispEmbed GGUF
GGUF conversion of deepseek-ai/DeepSeek-OCR-2 for use with CrispEmbed.
Architecture
SAM-ViT-B (12L, 768d) → Qwen2 encoder (24L, 896d, bidirectional) → Linear projector (896→1280) → DeepSeek-V2 MoE decoder (12L, 1280d, 64 experts top-6 + 2 shared, layer 0 dense) → lm_head
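The "64 experts top-6 + 2 shared" part of the decoder can be illustrated with a small routing sketch. This is not CrispEmbed's code: the softmax gate without renormalization, the function names `route_top_k` and `moe_forward`, and the representation of experts as plain Python callables are all assumptions made for illustration.

```python
import math

N_EXPERTS = 64  # routed experts per MoE layer (from the architecture line)
TOP_K = 6       # routed experts activated per token
N_SHARED = 2    # shared experts that process every token


def route_top_k(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts for one token.

    Assumption: gate weights are plain softmax probabilities and are
    not renormalized over the selected experts.
    """
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return ids, [probs[i] for i in ids]


def moe_forward(x, router_logits, routed_experts, shared_experts):
    """Combine the selected routed experts (weighted) with the shared experts."""
    ids, weights = route_top_k(router_logits)
    out = [0.0] * len(x)
    for i, w in zip(ids, weights):
        y = routed_experts[i](x)  # only the selected experts are evaluated
        out = [o + w * v for o, v in zip(out, y)]
    for expert in shared_experts:
        y = expert(x)             # shared experts always run, unweighted
        out = [o + v for o, v in zip(out, y)]
    return out
```

Only 6 of the 64 routed experts are evaluated per token, which is the kind of indexed matrix multiplication that `ggml_mul_mat_id` is used for in the Metal path listed under performance features.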
Models
| File | Quant | Size | Description |
|---|---|---|---|
| deepseek-ocr2-f16.gguf | F16 | 6.4 GB | Unquantized (16-bit float) |
| deepseek-ocr2-q8_0.gguf | Q8_0 | ~3.4 GB | Best quality/size balance |
| deepseek-ocr2-q4_k.gguf | Q4_K | ~2.0 GB | Smallest, good quality |
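As a rough sanity check on the sizes above, file size can be estimated from bits per weight. The sketch below uses the standard ggml block layouts (Q8_0: 34 bytes per 32 weights, Q4_K: 144 bytes per 256 weights) and assumes, hypothetically, that every tensor is quantized to the same type; the helper name `estimate_size_gb` is made up for this example.

```python
F16_BITS = 16.0

# Effective bits per weight, derived from ggml block sizes.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,    # fp16 scale + 32 int8 values per block = 8.5
    "Q4_K": 144 * 8 / 256,  # 144-byte super-block per 256 weights = 4.5
}


def estimate_size_gb(f16_size_gb, quant):
    """Scale the F16 file size by the ratio of bits per weight.

    Assumption: all tensors use the same quantization type.
    """
    return f16_size_gb * BITS_PER_WEIGHT[quant] / F16_BITS


for quant in ("Q8_0", "Q4_K"):
    print(f"{quant}: ~{estimate_size_gb(6.4, quant):.1f} GB")
```

The Q8_0 estimate lands on the listed ~3.4 GB. The Q4_K estimate comes out slightly below the listed ~2.0 GB, which would be consistent with some tensors being stored at higher precision, though the table does not say which.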
Performance features
- Per-row embedding dequant (saves ~655 MB peak RSS vs full table expansion)
- MoE decoder on Metal via ggml_mul_mat_id
- SAM patch-embed + neck on Metal via ggml_conv_2d
- Qwen2 encoder on Metal graph
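The per-row embedding dequantization mentioned in the first bullet can be sketched as follows. This is a pure-Python illustration of the idea for a Q8_0 table (one fp16 scale followed by 32 int8 values per block), not the CrispEmbed implementation; the function names and the flat `bytes` table layout are assumptions. Only the rows for the requested token ids are expanded to floats, so the full float table is never materialized.

```python
import struct

QK8_0 = 32               # weights per Q8_0 block
BLOCK_BYTES = 2 + QK8_0  # fp16 scale + 32 int8 values = 34 bytes


def quantize_row_q8_0(row):
    """Quantize one float row to Q8_0 bytes (simplified, for the demo only)."""
    assert len(row) % QK8_0 == 0
    out = bytearray()
    for i in range(0, len(row), QK8_0):
        block = row[i:i + QK8_0]
        d = max(abs(x) for x in block) / 127.0
        # Round-trip the scale through fp16 so we quantize against the stored value.
        d = struct.unpack("<e", struct.pack("<e", d))[0]
        inv = 1.0 / d if d else 0.0
        qs = [max(-127, min(127, round(x * inv))) for x in block]
        out += struct.pack("<e", d) + struct.pack("<32b", *qs)
    return bytes(out)


def dequantize_row(table, row_idx, n_embd):
    """Expand a single embedding row from the quantized table to floats."""
    row_bytes = n_embd // QK8_0 * BLOCK_BYTES
    start = row_idx * row_bytes
    out = []
    for off in range(start, start + row_bytes, BLOCK_BYTES):
        (d,) = struct.unpack_from("<e", table, off)
        qs = struct.unpack_from("<32b", table, off + 2)
        out.extend(d * q for q in qs)
    return out


def embed_tokens(table, token_ids, n_embd):
    """Look up embeddings by dequantizing only the rows that are needed."""
    return [dequantize_row(table, t, n_embd) for t in token_ids]


# Tiny demo: a 4-row table with 64-dimensional rows (hypothetical sizes).
rows = [[(r + 1) * (j - 32) / 64.0 for j in range(64)] for r in range(4)]
table = b"".join(quantize_row_q8_0(r) for r in rows)
vectors = embed_tokens(table, [2, 0], n_embd=64)
```

Peak memory in this scheme scales with the number of tokens looked up rather than with the vocabulary size, which is where the saving over full table expansion comes from.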
Converted with models/convert-deepseek-ocr2-to-gguf.py from CrispEmbed.