PdfWox

Make a scanned PDF searchable (OCR)

Run OCR on a scanned PDF in your browser to add an invisible text layer. You can then select, copy, and search the text.

Files are processed entirely in your browser. Nothing is uploaded to any server.

About this tool

A scanned PDF is just a collection of images — every page is a photograph of a document rather than actual text. You can't select a word, search for a phrase, or copy a paragraph because there is no text in the file, only pixels. OCR PDF fixes this by running optical character recognition on each page and embedding the recognised text as an invisible layer beneath the page image — a format known as a sandwich PDF.

The visible page is completely unchanged: the same scanned image appears when you open the result in any PDF reader. But behind it, a transparent text layer now exists that your reader uses for search, selection, and copy-paste. Ctrl+F finds your keywords. You can highlight a sentence. Assistive technology can read the document aloud. The OCR engine is Tesseract.js, running in WebAssembly inside your browser tab.
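To make the idea of an invisible text layer concrete, here is a minimal sketch of how one recognised word could be written into a PDF page so that it is searchable but never painted. It is an illustration of the general sandwich-PDF technique, not PdfWox's actual code. The function names, the `box` and `page` object shapes, and the `/F1` font resource name are all assumptions made for the example. The one fixed fact it relies on is that PDF text rendering mode 3 (`3 Tr`) draws text with neither fill nor stroke.

```javascript
// Escape the characters that are special inside a PDF literal string: \ ( )
function escapePdfString(text) {
  return text.replace(/([\\()])/g, '\\$1');
}

// Build content-stream operators that place one recognised word invisibly.
// box:  word bounding box in image pixels, origin at the top-left (assumed shape).
// page: { heightPt, dpi } for the page holding the scan (assumed shape).
function invisibleWordOps(word, box, page) {
  const scale = 72 / page.dpi;                   // image pixels -> PDF points
  const x = box.left * scale;
  const y = page.heightPt - box.bottom * scale;  // PDF origin is bottom-left
  const fontSize = (box.bottom - box.top) * scale;
  return [
    'BT',                                  // begin text object
    '3 Tr',                                // text rendering mode 3: invisible
    `/F1 ${fontSize.toFixed(2)} Tf`,       // /F1 is a hypothetical font resource
    `${x.toFixed(2)} ${y.toFixed(2)} Td`,  // move to the lower-left of the word box
    `(${escapePdfString(word)}) Tj`,       // show the text
    'ET',                                  // end text object
  ].join('\n');
}
```

For a word whose box runs from pixel 300 to 600 horizontally and 270 to 300 vertically, on a 792 pt tall page scanned at 300 DPI, this places the text 72 pt from the left edge and 720 pt from the bottom. Using the bottom of the box as the baseline is a simplification; a production tool would usually also stretch the text horizontally to match the width of the scanned word.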

Accuracy depends on scan quality. Clean, high-contrast, straight pages typically reach 95% or better. Tilted, faded, or low-resolution scans are recognised less accurately. For the best results, scan at 300 DPI or higher with good lighting, and make sure the page is flat and square to the scanner or camera. The default model is English, so documents in other languages may see reduced accuracy.
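A quick bit of arithmetic shows why resolution matters: the fewer dots per inch, the fewer pixels the engine has to work with for each character. The sketch below uses two hypothetical helper names, `scanPixels` and `glyphHeightPx`, and relies only on the fact that one typographic point is 1/72 inch.

```javascript
// Pixel dimensions of a page scanned at a given resolution.
function scanPixels(widthIn, heightIn, dpi) {
  return {
    width: Math.round(widthIn * dpi),
    height: Math.round(heightIn * dpi),
  };
}

// Approximate height in pixels of type set at a given point size
// (1 pt = 1/72 inch), before any skew or blur in the scan.
function glyphHeightPx(pointSize, dpi) {
  return (pointSize / 72) * dpi;
}

const letterAt300 = scanPixels(8.5, 11, 300);   // US Letter: 2550 x 3300 pixels
const tenPointAt300 = glyphHeightPx(10, 300);   // roughly 42 pixels tall
const tenPointAt100 = glyphHeightPx(10, 100);   // roughly 14 pixels tall
```

At 100 DPI, 10-point text gets about a third of the pixel height it has at 300 DPI, which leaves far less detail to tell similar letters apart.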

How it works

  1. Upload PDF

    Drop the scanned PDF.

  2. We OCR each page

    Tesseract.js runs in your browser; an invisible text layer is added behind the page image.

  3. Download searchable PDF

    The output looks identical but is now selectable and searchable.

Frequently asked questions

Will the page look different?
No — visually it's the same. We add an invisible text layer underneath.
Does this run on your server?
No. The OCR runs in your browser via Tesseract.js + WebAssembly. The first run downloads a ~3 MB English model; this is a one-time download, so subsequent runs start faster.
Is the text editable?
Searchable, not editable. To edit, use PDF to Text and re-create the document.
How accurate is OCR?
Typically 90%+ on clean scans, less for low-resolution or skewed images.
How long does it take?
About 5–10 seconds per page on a modern laptop, plus the one-time model download.
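To turn the per-page figure into a rough estimate for a whole document, multiply it by the page count. The helper below is a hypothetical illustration; its defaults are the 5 and 10 second figures quoted above, and it leaves out the one-time model download.

```javascript
// Rough OCR time range for a document, in seconds.
// The defaults are the per-page figures quoted in the FAQ answer;
// the one-time model download is not included.
function estimateOcrSeconds(pageCount, secondsPerPage = { min: 5, max: 10 }) {
  return {
    min: pageCount * secondsPerPage.min,
    max: pageCount * secondsPerPage.max,
  };
}

const twentyPages = estimateOcrSeconds(20);  // { min: 100, max: 200 }
```

So a 20-page scan would take somewhere around two to three and a half minutes on a modern laptop, and longer on slower hardware.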

Related tools

PDF to Text

Extract clean text — works on scans too.

Sign PDF

Add your signature to any PDF.

Fill PDF

Type into PDF form fields and download a filled copy.

Embed this tool

Let your visitors use OCR PDF without leaving your site. Paste the snippet below into any HTML page. Files stay private — everything runs in the visitor's browser.

<iframe
  src="https://pdfwox.com/embed/ocr-pdf"
  width="100%"
  height="600"
  style="border:none;border-radius:8px"
  title="ocr-pdf tool"
  allow="downloads"
  loading="lazy"
></iframe>
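<script>
// Resize the embed when it reports its content height via postMessage.
window.addEventListener('message', function (e) {
  var f = document.querySelector('iframe[src="https://pdfwox.com/embed/ocr-pdf"]');
  // Ignore messages that did not come from the embed's own window.
  if (!f || e.source !== f.contentWindow) return;
  if (e.data && e.data.type === 'privpdf-resize') {
    f.style.height = e.data.height + 'px';
  }
});
</script>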

The embed runs entirely in the visitor's browser — no files are uploaded. The iframe resizes automatically to fit its content via postMessage.
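If you would rather keep the iframe within fixed bounds on your page, you can validate and clamp the reported height before applying it. The sketch below is optional and not part of the official snippet. The helper name `resizeHeightFrom` and the 200 and 4000 pixel limits are assumptions; the `privpdf-resize` type and `height` field are taken from the snippet above.

```javascript
// Return a safe pixel height from a resize message, or null if the
// message is not a usable 'privpdf-resize' message.
// minPx and maxPx are arbitrary example limits; pick values that suit your layout.
function resizeHeightFrom(data, minPx = 200, maxPx = 4000) {
  if (!data || data.type !== 'privpdf-resize') return null;
  const height = Number(data.height);
  if (!Number.isFinite(height)) return null;
  return Math.min(maxPx, Math.max(minPx, Math.round(height)));
}
```

Inside the message listener you would call `resizeHeightFrom(e.data)` and set the iframe's `style.height` only when the result is not `null`.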
