Mistral OCR 4 is an AI productivity tool: an AI-powered OCR system that extracts structured text from documents in 170 languages. Key features include Bounding Box Extraction, Block Classification, and Confidence Scores. It is best suited to legal professionals, data scientists and analysts, and scientists and researchers.
About Mistral OCR 4
Key Features
<strong>Bounding Box Extraction.</strong> Mistral OCR 4 returns paragraph-level bounding boxes that tell you exactly where each text block lives on the page, making it easy to highlight content and build reliable data pipelines.
<strong>Block Classification.</strong> The tool automatically classifies content into typed blocks like titles, tables, equations, signatures, and more, so you know what role each element plays in your document.
<strong>Confidence Scores.</strong> Get per-word and per-page confidence scores that help you flag sections needing human review and route low-confidence regions to verifiers while auto-approving the rest.
<strong>170-Language Support.</strong> Process documents in 170 languages across 10 language groups, with strong performance on rare and low-resource languages that other systems struggle with.
<strong>Self-Hosted Deployment.</strong> Run Mistral OCR 4 in a single container on your own infrastructure to keep sensitive documents within your environment for data residency, sovereignty, and compliance requirements.
<strong>RAG Pipeline Integration.</strong> The structured markdown output with classified blocks slots directly into retrieval-augmented generation pipelines and works as an ingestion component for enterprise search and domain-specific retrieval.
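The confidence-based review routing described under Confidence Scores can be sketched as follows. This is a minimal illustration, not the Mistral OCR 4 response schema: the `Word` and `Page` classes, their field names, the 0-to-1 confidence scale, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass, field


# Hypothetical shapes: the real API response may use different names and scales.
@dataclass
class Word:
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]


@dataclass
class Page:
    number: int
    confidence: float  # assumed page-level score in [0.0, 1.0]
    words: list[Word] = field(default_factory=list)


def route_pages(pages, page_threshold=0.90, word_threshold=0.80):
    """Split pages into auto-approved page numbers and (page, weak words) review items."""
    auto_approved, needs_review = [], []
    for page in pages:
        # Collect the words a human verifier should look at on this page.
        weak_words = [w.text for w in page.words if w.confidence < word_threshold]
        if page.confidence < page_threshold or weak_words:
            needs_review.append((page.number, weak_words))
        else:
            auto_approved.append(page.number)
    return auto_approved, needs_review


pages = [
    Page(1, 0.98, [Word("Invoice", 0.99)]),
    Page(2, 0.95, [Word("t0tal", 0.42)]),
    Page(3, 0.71),
]
approved, review = route_pages(pages)
print("auto-approved:", approved)
print("needs review:", review)
```

The thresholds here are placeholders. In practice they would be tuned against a sample of documents that has already been reviewed by hand.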
Frequently Asked Questions
What is Mistral OCR 4?
Mistral OCR 4 is an optical character recognition API from Mistral AI that extracts structured text from documents. Unlike basic OCR, it returns bounding boxes, block classifications, and confidence scores alongside the text. It supports 170 languages and common formats such as PDF, DOC, PPT, and OpenDocument.
How much does Mistral OCR 4 cost?
Mistral OCR 4 costs $4 per 1,000 pages through the API. The Batch API gives a 50% discount, bringing the cost down to $2 per 1,000 pages. The Document AI version in Mistral Studio is priced at $5 per 1,000 pages.
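As a rough worked example of these rates, the sketch below estimates the cost of a job at each price point. The tier labels (`api`, `batch`, `studio`) are names chosen for this example, and it assumes cost scales linearly with page count, ignoring any minimums, rounding rules, or volume terms that actual billing may apply.

```python
# USD per 1,000 pages, as quoted in the pricing answer; the keys are illustrative labels.
RATES_PER_1000 = {"api": 4.00, "batch": 2.00, "studio": 5.00}


def estimate_cost(pages: int, tier: str = "api") -> float:
    """Estimate cost in USD, assuming simple pro-rata pricing per page."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return round(pages / 1000 * RATES_PER_1000[tier], 2)


for tier in RATES_PER_1000:
    print(f"{tier}: ${estimate_cost(250_000, tier):,.2f}")
```

Under these assumptions, a 250,000-page archive comes to $1,000 through the standard API, $500 through the Batch API, and $1,250 in Mistral Studio.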
How accurate is Mistral OCR 4?
Independent annotators preferred Mistral OCR 4 over every competing system tested, with a 72% average win rate across 600+ real-world documents in 12+ languages. It also scored 85.20 on the public OlmOCRBench leaderboard, the top score among tested models.
Can Mistral OCR 4 be self-hosted?
Yes, Mistral OCR 4 can run in a single container for fully self-hosted deployments. This option is available to enterprise customers and lets you keep document data within your own infrastructure for data residency, sovereignty, and compliance needs.