Does it work on handwritten text?

It works, but with lower accuracy than on printed text: roughly 70-90% on legible handwriting and 40-60% on scribbles. The cleaner the handwriting, the better. Cursive is harder than print. Always plan to review the result.

Can OCR handle multilingual PDFs?

The model expects one language at a time. If your PDF is mixed, split by language and process separately. Alternatively, use the dominant language and manually fix the rest.

Is there a page limit?

The free plan has monthly operation limits, but each operation can handle long PDFs. For documents of 100+ pages, consider the paid plan for faster processing.

Is the extracted text private?

Yes. Processing happens on our servers with no third parties involved, files are deleted within minutes of download, and nothing is used for AI training. Full policy at /privacy.

Does OCR preserve table layout?

Plain text (.txt) loses the layout. A searchable PDF keeps the original page appearance. To export tables in an editable format (Excel, Google Sheets), use the PDF→Excel tool after OCR.

Does it work on phone photos (not scans)?

Yes, but quality depends heavily on the photo. Uneven lighting, shadows, and tilted perspective degrade results. Scanner apps such as Google Drive's scanner or CamScanner straighten and clean up the photo, so run it through one of those before uploading.


How to extract text from a scanned PDF (OCR)

Got a scanned doc and can't copy a single word out of it? OCR fixes that. Convert it to searchable PDF or plain text in seconds.

4 min read · Updated on April 25, 2026

You receive a scanned contract and try to highlight a clause to paste into an email — nothing happens. The cursor passes right over the text without reacting. That's because the "text" in the PDF is actually a photograph of a piece of paper. To any program, you're just selecting a JPG.

OCR (Optical Character Recognition) is the technology that lets a machine "read" those images and convert them into real text. The result: a document you can search with Cmd/Ctrl-F, copy passages from, edit in Word, index for search, or feed to an LLM.

How to tell if a PDF is scanned

3-second test: open the PDF and try to select a word with your mouse. If it highlights cleanly word by word, the PDF has real text and doesn't need OCR. If you can only draw a rectangle on top of it, it's an image — you need OCR.

Another tell: hit Cmd/Ctrl-F and search for a word you KNOW is in the document. If it doesn't find it, the content is an image.
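If you have a whole folder of files to triage, the same check can be automated. The following is a minimal sketch, assuming the open-source pypdf library is installed. It treats a PDF as scanned when most pages yield almost no extractable text. The function names and both thresholds (`min_chars`, `max_text_ratio`) are illustrative assumptions, not part of our tool.

```python
from typing import List


def looks_scanned(page_texts: List[str], min_chars: int = 20,
                  max_text_ratio: float = 0.2) -> bool:
    """Guess whether a PDF is image-only from the text extracted per page.

    A page counts as "has text" when it yields at least min_chars
    non-whitespace characters. The document is flagged as scanned when the
    share of such pages is at or below max_text_ratio.
    Both thresholds are illustrative assumptions.
    """
    if not page_texts:
        return True  # nothing extractable at all
    pages_with_text = sum(
        1 for text in page_texts
        if len("".join(text.split())) >= min_chars
    )
    return pages_with_text / len(page_texts) <= max_text_ratio


def extract_page_texts(path: str) -> List[str]:
    """Read each page's text layer with pypdf (third-party, assumed installed)."""
    from pypdf import PdfReader  # imported here so looks_scanned works without it

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]
```

A typical call would be `looks_scanned(extract_page_texts("contract.pdf"))`, where the file name is a placeholder. A result of `True` suggests the file needs OCR.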

When OCR is the right call

Old digitized contracts — to extract clauses, dates, amounts
Scanned receipts — to populate expense spreadsheets
Books and academic papers — to cite passages, translate
HR documents — IDs, payslips, certificates for record-keeping
Medical history — to digitize old patient records
Field research — partially handwritten survey forms

Step-by-step: extract text with OCR

1. Upload the scanned file

Works with PDF, JPG, PNG and TIFF. You can upload multi-page documents — OCR processes everything in one shot and preserves the order.

2. Pick the content language

We support English, Portuguese, and Spanish. OCR uses different models per language — picking the wrong one tanks accuracy. If your document is multilingual (a bilingual report, for instance), process each part separately.
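To plan the split for a multilingual document, it helps to list which page ranges belong to which language. Here is a minimal sketch, assuming you already know (or have noted by hand) the language of each page. The function name and the language codes are illustrative.

```python
from itertools import groupby
from typing import List, Tuple


def language_runs(page_languages: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse a per-page language list into (language, first_page, last_page) runs.

    Page numbers are 1-based and inclusive.
    """
    runs = []
    page = 1
    for language, group in groupby(page_languages):
        count = len(list(group))
        runs.append((language, page, page + count - 1))
        page += count
    return runs


# Hypothetical bilingual report: pages 1-2 English, 3-4 Portuguese, 5 English.
print(language_runs(["en", "en", "pt", "pt", "en"]))
# → [('en', 1, 2), ('pt', 3, 4), ('en', 5, 5)]
```

Each run can then be exported as its own file and processed with the matching language setting.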

3. Choose output format

Searchable PDF — keeps the original appearance but adds an invisible text layer on top. You can search with Cmd/Ctrl-F, and copy and paste normally.
Plain text (.txt) — just the extracted content, no formatting. Great for spreadsheets, importing into systems, feeding to AI.
Word (.docx) — converts with basic formatting preserved (paragraphs, alignment). Good for editing.

4. Process and download

OCR is slower than other conversions (each page takes 2-10 seconds depending on resolution). When it finishes, you download the file in your chosen format.
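For a rough idea of the wait, multiply the page count by the per-page range above. This is a back-of-the-envelope sketch that assumes pages are processed one after another, which may not match how the service actually schedules work. The function name is illustrative.

```python
from typing import Tuple


def estimate_seconds(pages: int, min_per_page: float = 2.0,
                     max_per_page: float = 10.0) -> Tuple[float, float]:
    """Return the (best case, worst case) processing time in seconds.

    Defaults use the 2-10 seconds per page quoted in this guide.
    """
    if pages < 0:
        raise ValueError("pages must not be negative")
    return pages * min_per_page, pages * max_per_page


low, high = estimate_seconds(100)
print(f"{low / 60:.1f} to {high / 60:.1f} minutes")
# → 3.3 to 16.7 minutes
```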

How to improve OCR quality

OCR accuracy depends heavily on the source image quality. Some tips:

Scan at 300 DPI minimum — below that, small letters blur
Make sure the page is straight — tilted scans and angled shots confuse the recognition
Clean stains and folds before scanning — marks become random characters
Prefer white background and black ink — high contrast = better reading
Avoid screen captures — moiré and pixelization hurt accuracy
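If you only have an image file and are unsure whether it meets the 300 DPI guideline, you can work it out from the pixel width and the physical paper width. A minimal sketch follows; the function names are illustrative, and the example assumes a US Letter page, which is 8.5 inches wide.

```python
def effective_dpi(width_px: int, page_width_in: float) -> float:
    """Dots per inch implied by an image's pixel width and the paper's width."""
    if page_width_in <= 0:
        raise ValueError("page_width_in must be positive")
    return width_px / page_width_in


def meets_minimum(width_px: int, page_width_in: float,
                  minimum_dpi: float = 300.0) -> bool:
    """True when the scan reaches the recommended minimum resolution."""
    return effective_dpi(width_px, page_width_in) >= minimum_dpi


# A US Letter page (8.5 in wide) scanned 2550 pixels wide is exactly 300 DPI.
print(effective_dpi(2550, 8.5))   # → 300.0
# The same page at 1275 pixels wide is only 150 DPI, so rescan it.
print(meets_minimum(1275, 8.5))   # → False
```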

Bonus: OCR + other tools

OCR unlocks several follow-ups:

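OCR + Compress — a searchable PDF still carries the page images, so compress it afterward to cut the file size; plain-text output is already far lighter, since text weighs much less than an image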
OCR + Word — export to .docx for editing and review
OCR + Excel — if the document is tabular, run the PDF→Excel tool after OCR to split the content into columns
OCR + ChatGPT — paste the extracted text into an AI assistant to summarize, translate, or analyze it

Frequently asked questions

How accurate is OCR?

On printed, clean, high-resolution text: 98-99%. On legible handwriting: 70-90%. On scribbles: 40-60%. Always proofread when accuracy matters (contracts, accounting data).
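To see why proofreading still matters at 98-99%, it helps to turn the percentages into a character count. The sketch below assumes the figures are per-character accuracy and uses a hypothetical page of 2,000 characters. The function name is illustrative.

```python
from typing import Tuple


def expected_errors(char_count: int, accuracy_low: float,
                    accuracy_high: float) -> Tuple[int, int]:
    """Return the (fewest, most) misread characters expected for a text.

    Accuracies are fractions between 0 and 1, treated as per-character rates.
    """
    fewest = round(char_count * (1 - accuracy_high))
    most = round(char_count * (1 - accuracy_low))
    return fewest, most


# Hypothetical 2,000-character page of clean printed text (98-99%).
print(expected_errors(2000, 0.98, 0.99))  # → (20, 40)
# The same page in legible handwriting (70-90%).
print(expected_errors(2000, 0.70, 0.90))  # → (200, 600)
```

Even in the best case, a single page can carry a couple dozen wrong characters, and one of them may land in an amount or a date.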
