This guideline covers key tasks such as PDF creation, quality review, progress tracking and file management. It is recommended to perform these tasks on a weekly basis.
Software/Tools Needed:
Basic command line (DOS), ExifTool, ImageMagick, Adobe Acrobat, Adobe Bridge, Adobe XML template, Microsoft Excel
QC Review
Check List
- TIFFs are scanned at project specs
- No distortion of images has occurred
- Images follow the project filenaming convention
- Completeness (all pages and all programs for the period are included)
Update Tracking Spreadsheet
- Open the Google tracking worksheet and go to the second tab, SCAN STATUS
- Calculate the scan rate per batch by entering the number of TIFFs in the current batch (Windows Explorer\PROCESS folder) and the hours worked (per Google Calendar)
- Enter the total GBs for TIFF files (this helps monitor the total space used by TIFF files on the local workstation)
- If any volumes were completed, update Date Completed in the INVENTORY tab
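The two figures entered in the SCAN STATUS tab can be cross-checked with a short script. This is a minimal sketch, not part of the project's tooling: the function names are hypothetical, and it assumes TIFFs end in .tif or .tiff and that the count should include files already moved into subfolders.

```python
from pathlib import Path

TIFF_SUFFIXES = {".tif", ".tiff"}  # assumption: both spellings may occur


def scan_rate(tiff_count, hours_worked):
    """Return TIFFs scanned per hour for one batch."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return tiff_count / hours_worked


def tiff_totals(folder):
    """Count TIFFs under folder (subfolders included) and sum their size in GB."""
    files = [p for p in Path(folder).rglob("*")
             if p.is_file() and p.suffix.lower() in TIFF_SUFFIXES]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / 1024 ** 3
```

For example, `tiff_totals("PROCESS")` returns the count and size to enter, and `scan_rate(count, hours)` gives the rate, with the hours still taken from the calendar.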
Visual Image Review
- Open Adobe Bridge
- Navigate to the RAW folder and confirm it is empty (if any TIFFs remain, investigate why; they may need to be cropped or may simply need to be moved)
- Navigate to the PROCESS folder
- Visually scan through the images, checking for any blank, distorted or cropped pages
- If any errors are found, move all related pages of that pamphlet to the RAW folder (to be processed at the next session) and/or rescan any missing pages as necessary
Extract exifdata (Exiftool)
Capture a snapshot of images before any further edits. This provides an inventory of digital assets with key characteristics (resolution, size, etc.) and a quick check of filenaming conventions.
Example exif.bat
ref: ISO Date and Time Formats (W3C-DTF) http://www.w3.org/TR/NOTE-datetime/ (YYYY-MM-DD)
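As a rough illustration of what such a snapshot step might look like, the sketch below builds an ExifTool command that writes one CSV row per TIFF and names the output with a W3C-DTF date. It is not the project's exif.bat: the tag selection, the output filename pattern and the function names are all assumptions. The `-csv` and `-ext` options are standard ExifTool options.

```python
import subprocess
from datetime import date

# Assumption: these standard ExifTool tags cover the "key characteristics".
TAGS = ["FileName", "ImageWidth", "ImageHeight",
        "XResolution", "YResolution", "FileSize"]


def exif_command(folder, tags=TAGS):
    """Build the ExifTool argument list for a CSV inventory of TIFFs."""
    return ["exiftool", "-csv", "-ext", "tif",
            *[f"-{tag}" for tag in tags], folder]


def snapshot_name(day):
    """Hypothetical output name stamped with a YYYY-MM-DD date."""
    return f"exif_{day.isoformat()}.csv"


def write_snapshot(folder, day=None):
    """Run ExifTool (must be on PATH) and save its CSV output to a dated file."""
    out = snapshot_name(day or date.today())
    with open(out, "w", encoding="utf-8") as fh:
        subprocess.run(exif_command(folder), stdout=fh, check=True)
    return out
```

Calling `write_snapshot("PROCESS")` would produce a dated CSV that opens directly in Excel.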
Check filename syntax
- Open the exif output file in MS Excel (tip: double click on file)
- Parse filenames into prefix and suffix (page number) sections (tip: use the text-to-columns function, delimiter = underscore)
- Check filename length (per filenaming conventions, filenames should be 14 characters). Tip: insert a column to the right of the filename and use the =LEN() function. The FILTER function gives a quick view of filename lengths. Investigate any lengths not equal to 14.
- Create a Pivot Table to summarize Prefix ID by filenames (see steps below)
- Investigate any Prefix IDs with more than 10 associated files (this may indicate that more than one performance occurred on the same day and the filenames do not reflect that, e.g. a missing alpha character)
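The same checks can be sketched outside Excel. The example below is an illustration only: the function name is hypothetical, and it assumes the 14-character rule applies to the name without its extension and that the prefix is everything before the first underscore.

```python
from collections import Counter
from pathlib import PureWindowsPath


def check_filenames(names, expected_len=14, max_per_prefix=10):
    """Return (names with unexpected length, prefixes with too many files)."""
    # Assumption: length is measured on the stem, i.e. without ".tif".
    stems = [PureWindowsPath(name).stem for name in names]
    wrong_length = [s for s in stems if len(s) != expected_len]
    # Mirrors text-to-columns on "_": the first column is the prefix.
    counts = Counter(s.split("_", 1)[0] for s in stems)
    crowded = {p: c for p, c in counts.items() if c > max_per_prefix}
    return wrong_length, crowded
```

Both return values are lists of items to investigate by hand; an empty list and an empty dictionary mean the batch passed these two checks.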
Create a Pivot Table[2]
1. Click any single cell inside the data set.
2. On the Insert tab, click PivotTable.
A dialog box appears. Excel automatically selects the data for you. The default location for a new pivot table is New Worksheet.
3. Click OK.
4. Drag fields: the PivotTable field list appears on the right side of the new worksheet.
PDF Processing
Organize TIFFs into subfolders (CLI + Excel)
Storing TIFFs into subfolders supports easier file management as the number of files grows. This step is also a prerequisite for using ImageMagick commands to automate combining files into single PDFs.
PART I: Get Data
- Open a Windows Explorer window
- Go to the PROCESS folder
- Run dir.bat (tip: double click on file); this produces an output file, directory.txt
- Open the output file (tip: double click on file); it contains a list of filenames plus paths
PART II: Parse Data
Example of final parsed data:
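The parsing itself, done in Excel in this workflow, can also be sketched in Python. This is not the project's spreadsheet: it assumes directory.txt holds one full Windows path per line, and the column names and the sample path in the usage note are hypothetical.

```python
from pathlib import PureWindowsPath


def parse_listing(lines):
    """Split each TIFF path into folder, filename, prefix and suffix."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and anything that is not a TIFF (e.g. .bat files).
        if not line.lower().endswith((".tif", ".tiff")):
            continue
        path = PureWindowsPath(line)
        prefix, _, suffix = path.stem.partition("_")
        rows.append({
            "folder": str(path.parent),
            "filename": path.name,
            "prefix": prefix,   # object identifier
            "suffix": suffix,   # page number
        })
    return rows
```

Given a line such as `C:\scans\PROCESS\1945-03-12_001.tif`, the row would carry prefix `1945-03-12` and suffix `001`.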
PART III: Sort Data
1. List object identifiers
Get a list of object identifiers by using a pivot table to sort data by prefix number.
- Open the template subfolders.xls (stored in the top-level project folder)
- Go to the tab labelled “table” in subfolders.xls
- Right click on the pivot table
- Select “Refresh”
Example of summarized data (using Pivot Table):
2. Batch create subfolders
basic command: mkdir directoryname
- In the template subfolders.xls (stored in the top-level project folder)
- Go to the tab labelled “table”
- Go to the middle section of the worksheet (highlighted below)
- NOTE: Make sure formulas are populated for each row in the Pivot Table. You may need to copy formulas for new rows.
- Excel tip: to select a range of cells, select the first cell in the range, then hold the Shift key and press End plus the Down Arrow key
- Copy the commands
- Open mkdir.bat in Notepad (the file should be saved within the PROCESS folder)
- Replace the contents with the new data (see figure above; copy only the commands, not the headers). In Notepad, select Edit>Select All, then press Ctrl+V
- Save changes to the mkdir.bat file
- Run mkdir.bat (tip: in Windows Explorer, double click the .bat file)
- NOTE: To confirm the folders were created, open the PROCESS folder and count the subfolders shown. The total number of subfolders should match the number of rows in the pivot table.
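The command list that the spreadsheet formulas produce can be approximated in a few lines. This is a sketch under assumptions: the function names are hypothetical, one subfolder is created per distinct prefix, and the batch file is written with Windows line endings.

```python
def mkdir_commands(prefixes):
    """Return one quoted mkdir line per distinct prefix, in sorted order."""
    return [f'mkdir "{prefix}"' for prefix in sorted(set(prefixes))]


def write_bat(path, commands):
    """Overwrite a .bat file with the given command lines (CRLF endings)."""
    with open(path, "w", newline="\r\n") as fh:
        fh.write("\n".join(commands) + "\n")
```

The number of lines returned by `mkdir_commands` should equal the number of rows in the pivot table, which gives a second check on the subfolder count.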
3. Move files into their respective subfolders
basic command: move oldpath\file newpath\file
- In the template subfolders.xls (stored in the top-level project folder)
- Go to the tab labelled “filenames”
- Go to the right section of the worksheet labelled "MOVE" (highlighted below)
- NOTE: Make sure formulas are populated for each row in the worksheet that has a corresponding filename (left side of worksheet). You may need to copy formulas for new filenames.
- Excel tip: to select a range of cells, select the first cell in the range, then hold the Shift key and press End plus the Down Arrow key
- Copy the commands
- Open move.bat in Notepad
- Replace the contents with the new data (see figure above; copy only the commands, not the headers). In Notepad, select Edit>Select All, then press Ctrl+V
- Save changes to the move.bat file
- Run move.bat (tip: in Windows Explorer, double click the .bat file)
- NOTE: To confirm all files have been moved to their corresponding subfolders, open the PROCESS folder and check that no TIFFs remain outside a subfolder.
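A hedged sketch of the same step in Python follows. It assumes each file belongs in the subfolder named after the part of its filename before the first underscore, and that move.bat runs from inside the PROCESS folder so relative paths are enough; both function names are hypothetical.

```python
from pathlib import Path, PureWindowsPath

TIFF_SUFFIXES = {".tif", ".tiff"}


def move_commands(filenames):
    """Return one move line per file, targeting the subfolder named by its prefix."""
    commands = []
    for name in filenames:
        prefix = PureWindowsPath(name).stem.split("_", 1)[0]
        commands.append(f'move "{name}" "{prefix}\\{name}"')
    return commands


def loose_tiffs(folder):
    """List TIFFs still sitting directly in folder, i.e. not yet in a subfolder."""
    return sorted(p.name for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() in TIFF_SUFFIXES)
```

After the batch file has run, `loose_tiffs("PROCESS")` should return an empty list.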
Batch create PDF file (IM)
Automate creation of simple PDFs using ImageMagick commands
basic command: convert *.tif foldername.pdf
Note: In ImageMagick 7 and later, add magick before convert, e.g. magick convert *.tif foldername.pdf
- In the template subfolders.xls (stored in the top-level project folder)
- Go to the tab labelled “table”
- Go to the far right section of the worksheet (highlighted below)
- NOTE: Make sure formulas are populated for each row in the Pivot Table. You may need to copy formulas for new rows.
- Excel tip: to select a range of cells, select the first cell in the range, then hold the Shift key and press End plus the Down Arrow key
- Copy the commands
- Open createPDF.bat, found in the PDF folder, in Notepad
- The createPDF.bat file should always be stored in and executed from the PDF folder
- Replace the contents with the new data (see figure above; copy only the commands, not the headers). In Notepad, select Edit>Select All, then press Ctrl+V
- Save changes to the createPDF.bat file
- Run createPDF.bat (tip: in Windows Explorer, double click the .bat file)
- NOTE: Confirm all PDF files have been created by comparing the count of PDFs to the count of subfolders in the PROCESS folder
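The generated commands and the final count check can be sketched as below. The relative path `..\PROCESS` (PDF and PROCESS as sibling folders) is an assumption about the folder layout, the function names are hypothetical, and `tool` can be set to plain `convert` for older ImageMagick versions.

```python
from pathlib import PureWindowsPath


def pdf_commands(prefixes, process_dir=r"..\PROCESS", tool="magick convert"):
    """Return one ImageMagick line per subfolder, combining its TIFFs into a PDF."""
    return [f'{tool} "{process_dir}\\{prefix}\\*.tif" "{prefix}.pdf"'
            for prefix in sorted(set(prefixes))]


def missing_pdfs(prefixes, pdf_names):
    """Return prefixes that have a subfolder but no matching PDF yet."""
    made = {PureWindowsPath(name).stem
            for name in pdf_names if name.lower().endswith(".pdf")}
    return sorted(set(prefixes) - made)
```

An empty result from `missing_pdfs` corresponds to the PDF count matching the subfolder count.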
Batch OCR PDFs (Acrobat)
Batch Reduce PDF File Size (Acrobat)
Embed metadata (Bridge/XML template)
Batch embed general description, copyright and source metadata to all PDFs.
- Open Adobe Bridge
- Navigate to the PDF folder (tip: filter by PDF file type)
- Edit>Select All to select all PDF files
- Tools>Replace Metadata>select the Shepherd template
- Watch the status bar in the lower left corner to see when the operation is complete (no spinning wheel)
- Check a sampling of PDFs to confirm the metadata was embedded
Status bar
Note[1]
Append will add values from the template to fields that are empty. Existing information is not replaced.
Replace adds values from the template to empty fields AND replaces existing values in fields.
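The difference between the two modes, as this note describes them, can be modelled with plain dictionaries. This is only an illustration of the rule, not Bridge's implementation; the field names are hypothetical, and leaving fields that are absent from the template untouched is an assumption.

```python
def apply_template(existing, template, mode):
    """Merge template values into a metadata dict using 'append' or 'replace'."""
    if mode not in ("append", "replace"):
        raise ValueError("mode must be 'append' or 'replace'")
    result = dict(existing)  # leave the caller's dict unchanged
    for field, value in template.items():
        # Append fills only empty fields; Replace also overwrites filled ones.
        if mode == "replace" or not result.get(field):
            result[field] = value
    return result
```

With an existing title and an empty rights field, `append` keeps the title and fills in rights, while `replace` overwrites both with the template values.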
Example of Embedded Metadata for PDFs
File Management and tracking updates
Appendix
Adobe settings
One time setups for software preferences and actions
Acrobat: Confirm software preferences
Acrobat: Create Action: Batch OCR PDFs
- Tools>Action Wizard>Create new action
- Start with>folder on my computer>select PDF folder
- Steps>Recognize Text (using OCR); confirm the options are English language and Exact searchable image
- Save to>Same folder as start and check Overwrite existing files
Acrobat: Create Action: Batch Reduce PDF File Size
- Tools>Action Wizard>Create new action
- Start with>folder on my computer>select PDF folder
- Steps>Document Processing>Reduce File Size
- Save to>Same folder as start and check Overwrite existing files
Bridge: Set up XML metadata template
Other resources
Steps to batch move files or folders
https://digitalriceprojects.pbworks.com/w/page/61452473/Steps%20to%20batch%20move%20files
Embedded Image and PDF Metadata
https://digitalriceprojects.pbworks.com/w/page/50636422/Embedded%20Image%20Metadata
How to Write a Batch File
http://www.wikihow.com/Write-a-Batch-File
Comments (1)
Monica said
at 1:21 pm on Jul 22, 2014
tinyurl for this page: http://tinyurl.com/mazd6od