
Tutorial – Converting DJVU Files and Creating Clean OCR PDF with Custom Res.

Blog: https://philosphicalrealms.wordpress.com

You will need the following (some are proprietary):
Scan Tailor
Adobe Acrobat
ABBYY FineReader (for converting DjVu to PDF)

Task:
Preliminary step: Open the DjVu file with ABBYY FineReader. It will start the recognition process automatically. You can stop it, because recognition is only needed if you OCR with ABBYY FineReader, which we are not doing. (We will OCR with Adobe Acrobat, which has a much better OCR engine.)
Now click Edit Image, then Resolution on the right. Choose Other and enter 600 dpi. Under Selection, click All Pages, then click Apply. This takes a little time.
When that is done, click File->Save Document As->PDF Document.
Purpose: This upscales the document to 600 dpi (or any custom resolution) and converts the DjVu file to PDF.
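To get a feel for what the 600 dpi setting means in pixels, here is a minimal Python sketch of the arithmetic. The function names are hypothetical, and the US Letter page size and 300 dpi source resolution are example values assumed for illustration, not taken from any particular file.

```python
def pixels_at_dpi(width_in, height_in, dpi):
    """Return (width_px, height_px) for a page of the given physical size in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def upscale_factor(source_dpi, target_dpi=600):
    """Return how much each pixel dimension grows when resampling to target_dpi."""
    return target_dpi / source_dpi


# Assumed example: a US Letter page (8.5 x 11 in) scanned at 300 dpi.
print(pixels_at_dpi(8.5, 11, 600))  # (5100, 6600)
print(upscale_factor(300))          # 2.0
```

Doubling the resolution quadruples the pixel count per page, which is one reason the Apply step takes a while on long documents.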
1.) Now convert the PDF pages to TIFF in Acrobat: File->Save As->Save As Type->TIFF. Then create a new project in Scan Tailor and select the directory where the TIFF files are located.
2.) Process the pages with Scan Tailor. You must go through all of its processing stages.
3.) In Acrobat, click Create->Combine Files into a Single PDF and add the TIFF files output by Scan Tailor.
4.) Finally, apply OCR to the combined PDF in Adobe Acrobat. In the OCR settings, set PDF Output Style to Searchable Image and Downsample To to 600 dpi. Click OK and you are done.
This is how I do it.
Note: You can also convert the DjVu file directly to TIFF files and then proceed to Step 2.
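The note does not name a tool for the direct DjVu-to-TIFF route. One possibility, and this is an assumption rather than the method used above, is the ddjvu command-line program from DjVuLibre. The Python sketch below renders one TIFF per page with it; the function names, the page_0001.tif naming scheme, and the example paths are all hypothetical. It does not change the resolution of the pages.

```python
import shutil
import subprocess
from pathlib import Path


def build_ddjvu_command(djvu_path, out_dir, page):
    """Build the ddjvu command line that renders one page to a TIFF file."""
    out_file = Path(out_dir) / f"page_{page:04d}.tif"  # hypothetical naming scheme
    return [
        "ddjvu",
        "-format=tiff",   # write TIFF output
        f"-page={page}",  # render only this page
        str(djvu_path),
        str(out_file),
    ]


def convert_pages(djvu_path, out_dir, first_page, last_page):
    """Render pages first_page..last_page (inclusive) to separate TIFF files."""
    # Assumption: DjVuLibre is installed and ddjvu is on the PATH.
    if shutil.which("ddjvu") is None:
        raise RuntimeError("ddjvu not found; install DjVuLibre first")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for page in range(first_page, last_page + 1):
        subprocess.run(build_ddjvu_command(djvu_path, out_dir, page), check=True)
```

For example, convert_pages("book.djvu", "tiff_out", 1, 10) would write ten TIFF files into tiff_out, which can then be opened as a Scan Tailor project. The page range is passed in by hand here to keep the sketch short.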