PaddleOCR-VL-0.9B: CrispEmbed GGUF
CrispEmbed-native GGUF quantizations of PaddlePaddle/PaddleOCR-VL.
End-to-end VLM-based OCR covering text recognition, table extraction, formula recognition, and chart understanding in 109 languages.
Files
| File | Size | Description |
|---|---|---|
| paddleocr-vl-0.9b-q4_k.gguf | 1.3 GB | 4-bit K-quant, smallest |
| paddleocr-vl-0.9b-q8_0.gguf | 1.4 GB | 8-bit quantization, recommended |
| paddleocr-vl-0.9b-f16.gguf | 2.3 GB | fp16 reference |
Model
- Architecture: NaViT-style ViT (27L, 1152d, SigLIP 2D RoPE + learned position embeddings)
- Projector (pre-norm → 2×2 spatial merge → MLP)
- ERNIE-4.5-0.3B LLM decoder (18L, 1024d, 16/2 GQA, MRoPE, SwiGLU)
- Parameters: ~0.9B total
- Languages: 109 (multilingual)
- Tasks: OCR, Table Recognition, Formula Recognition, Chart Recognition
- License: Apache 2.0
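
Two of the figures in the list above can be made concrete with a little arithmetic. The sketch below is illustrative only: it reads "16/2 GQA" as 16 query heads sharing 2 key/value heads, assumes query heads are grouped contiguously, and uses a hypothetical `patch_size=14` that is not stated in this model card.

```python
# Illustrative arithmetic only. patch_size=14 and the contiguous head
# grouping are assumptions, not values taken from this model card.

def merged_token_count(height_px, width_px, patch_size=14, merge=2):
    """Number of visual tokens left after a merge x merge spatial merge."""
    grid_h = height_px // patch_size
    grid_w = width_px // patch_size
    if grid_h % merge or grid_w % merge:
        raise ValueError("patch grid must be divisible by the merge factor")
    # Each merge x merge block of patches becomes one token for the projector MLP.
    return (grid_h // merge) * (grid_w // merge)


def kv_head_for_query_head(q_head, n_q_heads=16, n_kv_heads=2):
    """Index of the key/value head that a given query head attends with."""
    group_size = n_q_heads // n_kv_heads  # 8 query heads per KV head
    return q_head // group_size


if __name__ == "__main__":
    # A 448x448 input gives a 32x32 patch grid, merged down to 16x16 tokens.
    print(merged_token_count(448, 448))
    print([kv_head_for_query_head(h) for h in range(16)])
```

Under these assumptions the 2×2 merge cuts the number of visual tokens passed to the decoder by a factor of four, and the decoder stores keys and values for 2 heads instead of 16.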
Usage with CrispEmbed
# OCR
./crispembed -m paddleocr-vl-0.9b-q8_0.gguf --ocr document.png
# With specific prompt
./crispembed -m paddleocr-vl-0.9b-q8_0.gguf --ocr-prompt "Table Recognition:" table.png
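
For scripting, the two invocations above could be wrapped in Python roughly as follows. This is a minimal sketch: it uses only the `-m`, `--ocr`, and `--ocr-prompt` flags shown above, and it assumes the binary sits at `./crispembed` and writes the recognized text to stdout, which this card does not confirm. The helper names `build_ocr_command` and `run_ocr` are hypothetical.

```python
import subprocess


def build_ocr_command(image, model="paddleocr-vl-0.9b-q8_0.gguf",
                      prompt=None, binary="./crispembed"):
    """Build the argv list for one crispembed OCR call."""
    cmd = [binary, "-m", str(model)]
    if prompt is None:
        # Plain OCR, as in the first command above.
        cmd += ["--ocr", str(image)]
    else:
        # Task-specific prompt, e.g. "Table Recognition:".
        cmd += ["--ocr-prompt", prompt, str(image)]
    return cmd


def run_ocr(image, **kwargs):
    """Run crispembed and return its stdout (assumed to hold the OCR text)."""
    result = subprocess.run(build_ocr_command(image, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `run_ocr("table.png", prompt="Table Recognition:")` would then mirror the second command above; `check=True` makes a non-zero exit status raise `subprocess.CalledProcessError` instead of silently returning empty text.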
Conversion
git clone https://github.com/CrispStrobe/CrispEmbed
cd CrispEmbed
python models/convert-paddleocr-vl-to-gguf.py \
  --model PaddlePaddle/PaddleOCR-VL \
  --output paddleocr-vl-0.9b-f16.gguf --dtype f16
./build/crispembed-quantize paddleocr-vl-0.9b-f16.gguf paddleocr-vl-0.9b-q8_0.gguf q8_0
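
After converting or quantizing, a quick sanity check is to read the fixed-size GGUF header of the output file. The sketch below is not part of CrispEmbed; `read_gguf_header` is a hypothetical helper that assumes the little-endian GGUF version 2 or later layout (4-byte magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata key/value count).

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 KV count.
    version, n_tensors, n_kv = struct.unpack("<IQQ", header[4:24])
    return version, n_tensors, n_kv
```

For example, `read_gguf_header("paddleocr-vl-0.9b-q8_0.gguf")` should report the same tensor count as the f16 file it was quantized from, since quantization changes tensor data types rather than the set of tensors.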
License
Apache 2.0, same as the base model.