Batch Create and Optimize PDFs

Page history last edited by Monica 2 years, 2 months ago

 


 

Use Case 

  • Adobe Acrobat can easily create a single PDF from multiple source files (e.g. TIFs), or batch create one PDF per image for many images, but it has no easy way to batch create multiple multi-page PDFs, one per group of source files. This guide outlines methods for combining different tools to create multiple PDFs across different folders.
  • Optimizing PDFs for web display improves the user experience through faster downloads and faster online rendering of files.

 

Batch create PDFs (IM)

 

Notes

  • This method automates the creation of image-only PDFs by using the ImageMagick convert command to combine multiple files into a single PDF.
  • Files produced using this method will be about the same size as their source images. This step therefore temporarily doubles storage usage until the PDFs have been optimized.
  • This process takes a lot of processing power. We recommend running the job overnight or on a dedicated workstation (e.g. Indus workstation).
  • This process automatically embeds the filename in the PDF Title metadata field.
  • Note: In recent versions of ImageMagick you have to add magick before convert, e.g. magick convert
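Because the image-only PDFs are about as large as their sources, it can help to check free disk space before starting. The following is a minimal sketch, not part of the original workflow; the function names `total_tiff_bytes` and `has_room_for_pdfs` are hypothetical, and it assumes the source images use a .tif or .tiff extension.

```python
from pathlib import Path

# Assumption: source images are identified by extension only.
TIFF_SUFFIXES = {".tif", ".tiff"}


def total_tiff_bytes(root: Path) -> int:
    """Sum the sizes of all TIFF files under `root`, including subfolders."""
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in TIFF_SUFFIXES
    )


def has_room_for_pdfs(root: Path, free_bytes: int) -> bool:
    """True if `free_bytes` can hold PDFs roughly as large as the source TIFFs."""
    return free_bytes >= total_tiff_bytes(root)
```

The free space for a drive can be read with `shutil.disk_usage(path).free` from the Python standard library and passed in as `free_bytes`.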

 

Basic command

 

convert *.tif out.pdf
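The same command can be assembled from a script, which avoids relying on wildcard expansion and makes the page order explicit. This is a sketch under stated assumptions: the helper names are hypothetical, pages are ordered by filename (so page numbers should be zero-padded), and ImageMagick 7 is assumed to accept `magick` on its own in place of `magick convert`.

```python
import shutil
import subprocess
from pathlib import Path


def imagemagick_prefix(which=shutil.which) -> list[str]:
    """Pick the ImageMagick entry point that is available on PATH."""
    if which("magick"):
        return ["magick"]  # ImageMagick 7
    if which("convert"):
        # Caution: on Windows, convert.exe is also an unrelated system utility.
        return ["convert"]  # legacy ImageMagick 6 command
    raise FileNotFoundError("ImageMagick not found on PATH")


def build_command(folder: Path, out_pdf: Path, prefix: list[str]) -> list[str]:
    """Return the argument list that combines every TIF in `folder` into `out_pdf`."""
    pages = sorted(
        p for p in folder.iterdir() if p.suffix.lower() in {".tif", ".tiff"}
    )
    if not pages:
        raise ValueError(f"no TIF files in {folder}")
    return [*prefix, *map(str, pages), str(out_pdf)]


def run(folder: Path, out_pdf: Path) -> None:
    """Run ImageMagick and raise if it exits with an error."""
    subprocess.run(build_command(folder, out_pdf, imagemagick_prefix()), check=True)
```

Calling `run(Path("obj001"), Path("obj001.pdf"))` would combine the TIFs in a folder named obj001; both names here are placeholders.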

 

Example worksheet and IM commands

In the example below, individual TIF files are arranged into subfolders, where each subfolder corresponds to one item (digital object). The example uses Excel formulas to build the IM (ImageMagick) commands. The next step is to copy these commands into a *.bat file and run it (see How to Write a Batch File, http://www.wikihow.com/Write-a-Batch-File).

 

In this example, each folder name is the digital object identifier, which is also used as the PDF filename.
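As an alternative to building the commands with Excel formulas, a short script can generate one command per subfolder and write them to a batch file. This is only a sketch: the function names, the output folder, and the choice of `magick` as the executable are assumptions, and it relies on ImageMagick expanding the *.tif wildcard itself, as the basic command does.

```python
from pathlib import Path


def batch_lines(root: Path, out_dir: Path, exe: str = "magick") -> list[str]:
    """Build one ImageMagick command per subfolder of `root`.

    Each subfolder name is treated as the digital object identifier
    and reused as the PDF filename.
    """
    lines = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Skip subfolders that contain no .tif files.
        if not any(folder.glob("*.tif")):
            continue
        source = folder / "*.tif"
        target = out_dir / (folder.name + ".pdf")
        lines.append(f'{exe} "{source}" "{target}"')
    return lines


def write_bat(lines: list[str], bat_path: Path) -> None:
    """Write the commands to a batch file with Windows line endings."""
    with open(bat_path, "w", newline="\r\n") as f:
        f.write("\n".join(lines) + "\n")
```

Run on the workstation that holds the images, this produces lines of the form `magick "<subfolder>\*.tif" "<output folder>\<subfolder name>.pdf"`, which can be reviewed before the batch file is executed.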

 

 

 

Batch OCR PDFs (Acrobat)

 

Notes

  • This process takes a lot of processing power. We recommend running the job overnight or on a dedicated workstation (e.g. Indus workstation).
  • It's important to run OCR before optimizing or reducing file size, because OCR quality is higher on the unoptimized images.
  • If your document consists only of handwritten text, skip this step. TIP: You can manually OCR individual pages if your document contains a mix of text styles.

 

This action performs text recognition with no page auto-straighten (*) and overwrites the source file in the same location. The default settings can be modified.

 

One time setup 

  1. Download Action 
  2. Double click to import
  3. You will get a message “Do you want to import this action to Acrobat?”
  4. Press Import
  5. You will then get a message “The action has been imported.”

 

To Run action

  1. In Acrobat > Tools > Action Wizard > Action List > Batch OCR PDFs
  2. Select files or folder
  3. (Optional) Change the save location by pressing the Save option and selecting a different location. 
  4. Press Start
    While the action is running, you will see the message "Converting scanned page to Searchable Image"

 

Modify Settings

  • To change the file naming or OCR language, you must edit the action before running it. Go to Tools > Action Wizard > Manage Actions > Select action > Edit
  • Do not select PDF Optimizer (use the options below instead) 
  • (*) Do not change the page auto-straighten setting. This action sets the Output to Searchable Image (Exact). If you choose Searchable Image as the Output instead, Acrobat will auto-straighten pages and reduce file size.

 

 

Prepare PDF for web display

 

Option 1: Batch Reduce PDF File Size (Acrobat)

NOTE: Reduce File Size produces the smallest files. This option works best for primarily text documents such as books, theses and dissertations, journal articles, newsletters, and magazines.

 

  • In Acrobat > Tools > Optimize PDF > Reduce File Size ...

  • Select Apply to Multiple Files

  • (Optional) Modify filenames or change save location
  • Press OK

 

 

Option 2: Batch Optimize PDF (Acrobat)

NOTE: The Optimize PDF option works best for archival documents that are historical in nature, such as handwritten letters; for items with many visual elements, such as scrapbooks, photo albums, and glossy magazines; and for items with small font sizes (e.g. newspapers). This option produces larger files than the Reduce File Size method.

 

One time setup

  1. Download Action 
  2. Double click to import
  3. You will get a message “Do you want to import this action to Acrobat?”
  4. Press Import
  5. You will then get a message “The action has been imported.”

 

To Run action

  1. In Acrobat > Tools > Action Wizard > Action List > Batch Optimize PDFs
  2. Select files or folder
  3. (Optional) Change the save location by pressing the Save option and selecting a different location. 
  4. Press Start
    While the action is running, you will see the message "Optimizing..."

 

Settings

  1. Optimization Options
    • Check Apply Adaptive Compression
    • Set Color/Grayscale = JPEG2000
    • Adjust the Quality slider within the range 50%-80%
  2. Filters: Deskew defaults to On, which straightens page images; you may toggle it to Off if you do not wish to straighten them.
  3. Uncheck Recognize Text

 

For a single file: Go to Tools > Optimize PDF > Enhance Scanned PDF

 

 

Advanced Optimization

 

You may further customize settings as needed. Example options are:

  • Reduce size by removing layers and file attachments, but keep document information and metadata (for preservation purposes).
  • Choose a down-sampling ppi and format.
  • Discard hidden layer content and flatten visible layers, etc.

 

Go to Tools > Optimize PDF > Advanced Optimization.

 

Assessment Tips

After producing PDFs, confirm their quality by sampling representative PDFs. For example:

 

  1. Open the PDF in Acrobat, zoom to 50-100%, and visually inspect the page. Is the text or image on the page blurred or pixelated? If yes, the PDF has likely been over-optimized and you should regenerate it using different optimization levels.
  2. Export the PDF to plain text and compare the output to the PDF page. Make sure the text is clean and readable. (Some garbled text may appear if the source is not clear or is non-textual, e.g. handwritten text, photos, etc.)
  3. Embed metadata in PDFs only after optimizing the file. Attempting to embed metadata in overly large files (such as image-only PDFs) can cause errors on rare occasions. 
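The sampling and text checks above can be partly scripted. The sketch below is an illustration only: the function names are hypothetical, the sample is a simple random draw, and the word test is a rough heuristic that assumes English text in plain ASCII letters. It does not replace visual inspection.

```python
import random
import re
from pathlib import Path

# A token counts as "word-like" if it is letters (with optional apostrophes
# or hyphens) followed by at most one trailing punctuation mark.
WORD = re.compile(r"[A-Za-z][A-Za-z'\-]*[.,;:!?]?")


def pick_sample(folder: Path, k: int = 5, seed: int = 0) -> list[Path]:
    """Choose up to `k` PDFs under `folder` at random for manual review."""
    pdfs = sorted(folder.rglob("*.pdf"))
    return random.Random(seed).sample(pdfs, min(k, len(pdfs)))


def word_like_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that look like plain words."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if WORD.fullmatch(t)) / len(tokens)
```

A low ratio on text exported from a typed document may point to poor OCR, while a low ratio is expected for handwritten or photographic pages. Any cut-off value would need to be chosen by comparing against files already known to be good.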

 

Related resources

Steps to batch move files or folders https://digitalriceprojects.pbworks.com/w/page/61452473/Steps%20to%20batch%20move%20files

 

Embedded Image and PDF Metadata

https://digitalriceprojects.pbworks.com/w/page/50636422/Embedded%20Image%20Metadata

 

How to Write a Batch File

http://www.wikihow.com/Write-a-Batch-File

 

Acrobat 8 PDF Optimizer Review, http://www.websiteoptimization.com/speed/tweak/pdf/optimizer.html

Note: This covers an older version but provides a detailed explanation of all the sundry options.

 


Comments (1)

Monica said

at 10:25 am on Apr 18, 2018

Tiny url for this page: https://tinyurl.com/y7whamxb
