Extract text from a PDF
Pull clean plain text out of any PDF in your browser. Text PDFs extract instantly; scanned PDFs go through in-browser OCR.
Files are processed entirely in your browser. Nothing is uploaded to any server.
About this tool
PDF to Text pulls the words out of a PDF and returns them as a plain text file you can open in any editor, copy into another document, or feed into any workflow that expects plain text rather than a PDF. Text-based PDFs (those exported from Word, Google Docs, or another authoring tool) already contain a text layer, so they are parsed instantly with no additional processing. Scanned PDFs, which typically contain only page images, go through an OCR step automatically.
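One way to picture the choice between direct parsing and OCR is a per-page check on how much text the PDF's text layer yields. The sketch below is an illustration only: the function names and the 20-character threshold are assumptions, not the tool's documented behaviour.

```javascript
// Hypothetical heuristic: a page whose text layer yields almost no visible
// characters is treated as a scan. The threshold is an illustrative assumption.
const MIN_CHARS_PER_PAGE = 20;

// Returns true when a page's extracted text is too sparse to trust.
function needsOcr(pageText) {
  const visible = pageText.replace(/\s+/g, '');
  return visible.length < MIN_CHARS_PER_PAGE;
}

// Maps each page's extracted text to the method that would handle it.
function planExtraction(pageTexts) {
  return pageTexts.map((text, index) => ({
    page: index + 1,
    method: needsOcr(text) ? 'ocr' : 'text-layer',
  }));
}
```

A per-page decision like this would also cope with mixed documents, where only some pages are scans.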
When OCR is needed, Tesseract.js runs the recognition directly in your browser using WebAssembly. A small English language model (around 3 MB) is downloaded the first time and then cached. Each page of the scanned PDF is rendered as an image and fed through the OCR engine, which returns a text transcript. You can review and edit the extracted text in the page before downloading the final file.
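The page-by-page OCR loop described above can be sketched roughly as follows. This assumes the pages have already been rendered to images and that `worker` exposes `recognize(image)` resolving to `{ data: { text } }`, which is the shape a Tesseract.js worker returns. The function name `ocrPages` and the blank-line page separator are assumptions made for illustration.

```javascript
// Sketch: OCR a list of already-rendered page images, one page at a time.
// `worker` is assumed to behave like a Tesseract.js worker. In a browser with
// a recent Tesseract.js release you might obtain one with:
//   const worker = await Tesseract.createWorker('eng');
async function ocrPages(pageImages, worker) {
  const pageTexts = [];
  for (const image of pageImages) {
    // recognize() resolves to an object whose data.text holds the transcript.
    const result = await worker.recognize(image);
    pageTexts.push(result.data.text.trim());
  }
  // Separate pages with a blank line so page boundaries stay visible.
  return pageTexts.join('\n\n');
}
```

Processing pages sequentially keeps memory use predictable in the browser. A real implementation would also call `worker.terminate()` once it is done.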
Plain text loses the visual layout — columns merge, tables flatten, and whitespace-based alignment disappears. If you need the text to stay searchable inside the original PDF rather than extracted separately, use the OCR PDF tool instead, which embeds the recognised text as a hidden layer under the original page image.
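To see why tables flatten, consider how positioned text fragments become lines of plain text. The sketch below assumes each fragment is a hypothetical `{ text, x, y }` object. Real PDF parsers expose positions differently, and the tool's own ordering logic is not documented here.

```javascript
// Sketch: turn positioned fragments into plain-text lines.
// Fragments on the same rounded y coordinate form one line, ordered by x.
function flattenToLines(fragments) {
  const rows = new Map();
  for (const f of fragments) {
    const key = Math.round(f.y);
    if (!rows.has(key)) rows.set(key, []);
    rows.get(key).push(f);
  }
  return [...rows.entries()]
    .sort((a, b) => b[0] - a[0]) // PDF y grows upward, so larger y comes first
    .map(([, row]) =>
      row
        .sort((a, b) => a.x - b.x)
        .map((f) => f.text)
        .join(' ') // the horizontal gap collapses to a single space
    )
    .join('\n');
}
```

A two-column table row such as "Name" at x=50 and "Qty" at x=200 comes out as `Name Qty`: the column gap is gone, which is the alignment loss described above.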
How it works
1. Upload PDF: drop or pick the PDF.
2. We extract text: text-based PDFs are parsed instantly. Scanned PDFs go through OCR in your browser.
3. Edit & download: clean up artifacts if you want, then download a .txt file.
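If you would rather script the cleanup than edit by hand, common extraction artifacts can be handled with a few regular expressions. This is a minimal sketch under assumptions: `cleanExtractedText` is a hypothetical helper, not part of the tool. Its hyphen rule will also join genuine compounds that happen to break at a line end (for example "well-" / "known"), so review the result.

```javascript
// Sketch: tidy common artifacts in extracted or OCR'd text.
function cleanExtractedText(raw) {
  return raw
    .replace(/\r\n?/g, '\n')          // normalise line endings
    .replace(/(\w)-\n(\w)/g, '$1$2')  // rejoin words hyphenated across a line break
    .replace(/[ \t]+/g, ' ')          // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, '\n')         // drop spaces around line breaks
    .replace(/\n{3,}/g, '\n\n')       // keep at most one blank line between paragraphs
    .trim();
}
```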
Frequently asked questions
Is my file uploaded?
How does OCR work in the browser?
Will it work on a poorly scanned PDF?
Max file size?
Will it preserve layout?
Embed this tool
Let your visitors use PDF to Text without leaving your site. Paste the snippet below into any HTML page. Files stay private — everything runs in the visitor's browser.
<iframe
src="https://pdfwox.com/embed/pdf-to-text"
width="100%"
height="600"
style="border:none;border-radius:8px"
title="pdf-to-text tool"
allow="downloads"
loading="lazy"
></iframe>
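<script>
// Resize the iframe when the embedded tool reports its content height.
window.addEventListener('message', function (e) {
  var f = document.querySelector('iframe[src="https://pdfwox.com/embed/pdf-to-text"]');
  // Ignore messages that did not come from the embedded tool's frame.
  if (!f || e.source !== f.contentWindow) return;
  if (e.data && e.data.type === 'privpdf-resize') {
    f.style.height = e.data.height + 'px';
  }
});
</script>
The embed runs entirely in the visitor's browser, and no files are uploaded. The iframe resizes automatically to fit its content via postMessage.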
Deeper guide