
How many hours does your team spend retyping invoices, quotes or delivery notes into a spreadsheet? On June 23, 2026, France-based Mistral AI launched OCR 4, an AI document extraction model that reads those documents for you, in 170 languages, and can run on your own servers. For an SME, it is a chance to automate a tedious task without handing sensitive documents to a foreign cloud.
In brief
- Mistral OCR 4 launched on June 23, 2026: an AI model that turns a scanned document (PDF, photo, contract) into structured text and data (source: Mistral AI).
- It covers 170 languages and costs $4 per 1,000 pages on the standard API, $2 in batch mode (source: Mistral AI).
- It scores 85.20 on the public OlmOCRBench benchmark and was preferred in 72% of blind tests against leading competitors (source: Mistral AI).
- Key point for Europe: it can be deployed on your own infrastructure, so your documents never leave your servers, which is a regulatory advantage (GDPR, sovereignty).
- For an SME, the real value is not the technical feat but the time saved on entering invoices, quotes, contracts and forms.
What OCR is, and why this version matters
OCR (optical character recognition) is the technology that turns a document image into usable text. It has existed for decades, but it stumbled on real-world cases: misaligned tables, handwriting, multiple columns, mixed languages. As a result, many businesses kept retyping their documents by hand.
What is new with models like OCR 4 is that they do not just read characters. They understand the structure of a document: they spot a title, a table, a signature, an amount, and return them cleanly. This shift from plain reading to structured understanding is what finally makes extraction reliable for professional use.
Key takeaway
A modern OCR does more than read text: it identifies what each block represents (title, table, amount, signature) and returns it in a format your tools can use directly.
What Mistral OCR 4 can do
According to Mistral AI, OCR 4 handles PDFs, Word documents, presentations and OpenDocument files. For each page, it returns:
- structured text in Markdown, ready to copy or feed into software;
- bounding boxes that locate each element on the page (useful to check where a value came from);
- a block classification: title, table, equation, signature, and so on;
- confidence scores per page and per word, flagging uncertain passages to review.
That last point is underrated. A confidence score lets you automatically route doubtful documents to human review and let the rest pass without intervention. That is exactly what you need to automate without losing quality control.
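As a rough illustration of that routing, here is a minimal Python sketch. The `PageResult` shape, the 0 to 1 score scale and both thresholds are assumptions made for the example; they are not Mistral's actual response schema, so adapt them to whatever the API really returns.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical shape for one extracted page (not Mistral's actual schema)."""
    document_id: str
    page_number: int
    page_confidence: float         # assumed to lie between 0.0 and 1.0
    word_confidences: list[float]  # assumed per-word scores, same scale


def needs_review(page: PageResult,
                 page_threshold: float = 0.90,
                 word_threshold: float = 0.60) -> bool:
    """Flag a page if its overall score is low or any single word is very uncertain."""
    if page.page_confidence < page_threshold:
        return True
    return any(score < word_threshold for score in page.word_confidences)


def route(pages: list[PageResult]) -> tuple[list[PageResult], list[PageResult]]:
    """Split pages into two queues: (automatic, human_review)."""
    automatic: list[PageResult] = []
    human_review: list[PageResult] = []
    for page in pages:
        (human_review if needs_review(page) else automatic).append(page)
    return automatic, human_review


# Illustrative values only, not real extraction results.
pages = [
    PageResult("invoice-001", 1, 0.98, [0.99, 0.97, 0.95]),
    PageResult("invoice-002", 1, 0.95, [0.99, 0.41, 0.96]),  # one doubtful word
    PageResult("invoice-003", 1, 0.72, [0.80, 0.75, 0.70]),  # weak page overall
]
automatic, human_review = route(pages)
print("automatic:", [p.document_id for p in automatic])
print("human review:", [p.document_id for p in human_review])
```

The thresholds are starting points to tune on your own documents: set them too low and errors slip through, too high and everything lands in the review queue.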
The real argument for a European SME: sovereignty
Most AI extraction tools run through a US cloud: your invoices, contracts and customer files travel to servers outside Europe. For many business owners that is a blocker, especially for sensitive or GDPR-bound documents.
Mistral, a French company under EU jurisdiction, offers OCR 4 as a self-hosted container: the model runs on your own infrastructure and your documents never leave your premises. For an accounting firm, a law firm, a clinic or an industrial SME, that is a first-rate compliance argument, as the EU AI Act becomes generally applicable on August 2, 2026 (source: European Commission).
Standard cloud OCR
Self-hosted OCR 4
Cost and performance: what the numbers say
Mistral quotes $4 per 1,000 pages on the standard API, and $2 in batch mode (bulk processing that is less time-sensitive). At that rate, processing 5,000 single-page invoices a month costs about $20 on the standard API, or $10 in batch mode: negligible next to the staff time that manual data entry takes.
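To redo that arithmetic for your own volume, a few lines of Python are enough. The two prices are the ones Mistral quotes; the function name and the pages-per-document figures are assumptions for illustration, since multi-page documents raise the bill proportionally.

```python
STANDARD_PRICE_PER_1000_PAGES = 4.0  # USD, as quoted by Mistral AI
BATCH_PRICE_PER_1000_PAGES = 2.0     # USD, as quoted by Mistral AI


def monthly_cost(documents_per_month: int,
                 pages_per_document: float,
                 price_per_1000_pages: float) -> float:
    """Estimated monthly API cost in USD for a given document volume."""
    pages = documents_per_month * pages_per_document
    return pages / 1000 * price_per_1000_pages


# 5,000 invoices a month, assuming one page each.
standard = monthly_cost(5000, 1.0, STANDARD_PRICE_PER_1000_PAGES)
batch = monthly_cost(5000, 1.0, BATCH_PRICE_PER_1000_PAGES)

# Same volume, assuming an average of 2.5 pages per document (hypothetical).
longer_documents = monthly_cost(5000, 2.5, STANDARD_PRICE_PER_1000_PAGES)

print(f"standard: ${standard:.2f}, batch: ${batch:.2f}")
print(f"standard at 2.5 pages per document: ${longer_documents:.2f}")
```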
On performance, Mistral says it compared OCR 4 to more complex document parsers on a chart-heavy financial dataset: at equivalent accuracy, OCR 4 reportedly shows a cost around 8 times lower and latency 17 times lower (source: Mistral AI). In other words, by Mistral's own account: just as accurate, but faster and cheaper.
One honest caveat, and Mistral raises it itself: these aggregate scores carry biases (reference annotation errors, equivalent notations scored as wrong, reading-order assumptions on multi-column documents). The company recommends evaluating the model on your own documents rather than taking the averages at face value. That is good practice for any AI tool.
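One simple way to run that evaluation is to hand-check a small sample and measure, field by field, how often the extraction matches. The sketch below assumes you have already turned each document into a dictionary of fields; the field names and sample values are invented for the example. The `normalize` step is a deliberately light guard against scoring equivalent notations as wrong.

```python
def normalize(value: str) -> str:
    """Collapse whitespace and case so trivially equivalent strings compare equal."""
    return " ".join(value.split()).casefold()


def field_accuracy(extracted: list[dict[str, str]],
                   reference: list[dict[str, str]]) -> dict[str, float]:
    """Share of documents in which each field matches the hand-checked reference."""
    if len(extracted) != len(reference):
        raise ValueError("extracted and reference must cover the same documents")
    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for got, expected in zip(extracted, reference):
        for field, expected_value in expected.items():
            totals[field] = totals.get(field, 0) + 1
            # A field missing from the extraction counts as a miss.
            if normalize(got.get(field, "")) == normalize(expected_value):
                hits[field] = hits.get(field, 0) + 1
    return {field: hits.get(field, 0) / totals[field] for field in totals}


# Hand-checked values for four sample invoices (invented for the example).
reference = [
    {"invoice_number": "F-2026-001", "total": "1250.00"},
    {"invoice_number": "F-2026-002", "total": "89.90"},
    {"invoice_number": "F-2026-003", "total": "432.10"},
    {"invoice_number": "F-2026-004", "total": "75.00"},
]
# What the extraction returned for the same invoices (also invented).
extracted = [
    {"invoice_number": "f-2026-001", "total": "1250.00"},  # case differs only
    {"invoice_number": "F-2026-002", "total": "89.90"},
    {"invoice_number": "F-2026-003", "total": "482.10"},   # misread digit
    {"invoice_number": "F-2026-004", "total": " 75.00 "},  # stray whitespace
]
scores = field_accuracy(extracted, reference)
for field, score in scores.items():
    print(f"{field}: {score:.0%}")
```

A per-field breakdown is more useful than a single average: an error on a total amount usually matters far more than one on a free-text description.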
How an SME can actually use it
You do not need to be a Fortune 500 company. Here is a realistic path to bring AI document extraction into an SME.
- Target a case
- Test on a sample
- Wire the output
- Filter by confidence
- Measure the gain
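For the "wire the output" step, the Markdown returned by the model has to end up in a format your accounting or ERP tool can import. Here is a minimal sketch that pulls the first table out of a Markdown page and converts it to CSV. The sample page is invented, the helper names are hypothetical, and the parser is deliberately naive: it does not handle escaped pipes or empty trailing cells.

```python
import csv
import io


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Return the first pipe table found in `markdown` as a list of row dicts."""
    table_lines: list[str] = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            table_lines.append(stripped)
        elif table_lines:
            break  # first non-table line after the table: stop reading
    if len(table_lines) < 2:
        return []

    def cells(row: str) -> list[str]:
        # Naive split: assumes no escaped "|" inside a cell.
        return [cell.strip() for cell in row.strip("|").split("|")]

    header = cells(table_lines[0])
    # table_lines[1] is the |---|---| separator row, so data starts at index 2.
    return [dict(zip(header, cells(line))) for line in table_lines[2:]]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize row dicts to CSV text, using the first row's keys as columns."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented example of a Markdown page, not real model output.
sample_page = """Invoice F-2026-001

| Description | Qty | Unit price |
|---|---|---|
| Paper A4 | 10 | 4.50 |
| Toner | 2 | 59.00 |

Total: 163.00
"""
rows = parse_markdown_table(sample_page)
csv_text = to_csv(rows)
print(csv_text)
```

In practice, an automation tool or a dedicated Markdown library may do this more robustly; the point is only that structured Markdown is a short step away from a spreadsheet import.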
| Use case | Document type | Expected benefit |
|---|---|---|
| Accounts payable | PDF invoices | Automated entry, fewer errors |
| Sales | Quotes and purchase orders | Faster follow-up, easier reminders |
| HR | Contracts, forms | Structured, searchable archiving |
| Logistics | Delivery notes | Automatic order matching |
FAQ
What exactly is Mistral OCR 4?
It is an AI model, released on June 23, 2026 by France-based Mistral AI, that reads a scanned or digital document (PDF, photo, contract) and turns it into structured text and data. It covers 170 languages and can run on the client company's own servers.
How much does it cost for an SME?
Mistral quotes $4 per 1,000 pages on the standard API and $2 in batch mode. Self-hosted deployment adds infrastructure costs, to weigh against your volume and confidentiality requirements.
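To weigh the two options on cost alone, you can estimate the monthly volume at which a self-hosted setup would break even with the API. In the sketch below, the $200 monthly infrastructure figure is purely hypothetical (Mistral's self-hosting terms are not covered here), and the calculation ignores any licence fees, maintenance time and the non-financial value of confidentiality.

```python
def break_even_pages(monthly_infrastructure_cost: float,
                     api_price_per_1000_pages: float) -> float:
    """Monthly page volume at which API spend equals the fixed self-hosting cost."""
    return monthly_infrastructure_cost / api_price_per_1000_pages * 1000


HYPOTHETICAL_INFRA_COST = 200.0  # USD per month, an assumption for illustration

versus_standard = break_even_pages(HYPOTHETICAL_INFRA_COST, 4.0)  # $4 per 1,000 pages
versus_batch = break_even_pages(HYPOTHETICAL_INFRA_COST, 2.0)     # $2 per 1,000 pages

print(f"break-even vs standard API: {versus_standard:,.0f} pages a month")
print(f"break-even vs batch mode: {versus_batch:,.0f} pages a month")
```

Below that volume, self-hosting is justified by confidentiality rather than by price.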
Is it GDPR-compliant?
The self-hosted option keeps documents on your own infrastructure, which supports GDPR compliance and professional secrecy. Final compliance always depends on your organization and deployment, not on the tool alone.
Do you need technical skills to use it?
For simple use through the cloud API, a provider or an automation tool (Make, n8n) is enough. Deploying on your own infrastructure does require technical support.
Conclusion
Mistral OCR 4 is not yet another revolution to watch from afar. It is a concrete, affordable tool to remove a chore every SME knows: retyping documents. Its differentiator, sovereign self-hosting, arrives right as the EU AI Act deadlines approach. The right approach stays the same: target a specific case, test on your real documents, measure the gain.
To go further, see our guide to the European AI Act for SMEs and our case studies of AI automation in SMEs.


