Notes for developing local standards for master images of large format scans (above 17 x 24 inches physical size).
MASTER
TIFF Format, uncompressed, flat
24 bit depth
400 ppi
General file size 800MB - no greater than 1.5 GB
ACCESS
Use JP2 format for Zoomable display
see specs at https://digitalriceprojects.pbworks.com/w/page/49937056/JPEG2000%20Profile
Embedding layers into TIFFs will greatly increase file size (more than double), so all final master TIFFs should be flat if processed using Photoshop or other imaging editing tool.
Since scanning at “real size”, use of 400ppi should be sufficient
Reasonable file size may be 800MB or less. Set maximum file size of 1.5GB.
the largest map scanned to date is around 1.2GB but this map is a composite (merged in photoshop) of 3 scans. Expectation is that most maps will be much less than 1GB
If file size is greater than 1.5 GB may need to resample using 5000-10000pixels at long side
Policy: No scans of Rice Campus buildings will be placed in the IR for security purposes per Lee Pecht.
Local projects should develop a method for tracking file sizes for large scale images to estimate digital preservation storage needs. This should be reported to DSS once a year for budgeting purposes.
Some overly large files can not use adobe (Bridge or Photoshop) to embed metadata. In such cases, use EXIF tool (see below for steps).
tips for merging files, see Image editing tips
Benefit: Greatly reduce storage needs and related costs in the event of significant growth of large scale scans. This practice of using JPEG2000 as masters is currently being used at the Library of Congress, British Library and Wellcome Library.
Known issues with using JP2 format as master
loss of color profile info
Need special software to view/create JP2 files
Therefore not recommended to use JP2 as master format at this time.
Can automate creation of unique metadata for individual files using *.bat file. Tip: Use excel to capture unique metadata and then copy/paste data in *.bat file and run from command line window.
basic command example [ref]:
exiftool -Title="This is the Title" -Author="Happy Man" -Subject="PDF Metadata" image.tif
This command embeds a title, author and subject (keyword field) into the TIFF file.
This command will create a backup of the original file if you do not use the; -overwrite_original switch, this means a duplicate will exist in the folder where the updated pdf is. From example above; a file named drawing.pdf_original will be created.
all tiffs should be flat, no layers
The field in exiftool indicating layers is ImageSourceData; see Using exiftool for the QC process wiki page for “how tos”
On going activity. Monitor any mention of large scale scanning specs in the community. If new practices/methodologies are discovered that recommend different guidelines, we can re-evaluate our local practices at that time.
http://www.library.unt.edu/digital-projects-unit/standards#large-format-posters-and
Large Format (Posters and Maps) Above 17 x 24 inches (A2)
Image Types |
Bit Depth |
Color Space |
Resolution (ppi) |
Scale |
File Format |
B&W Maps/Posters |
8-bit |
Grayscale |
5000 - 10000 pixels across the longest side |
100% (1:1) |
Tiff (uncompressed) |
Color Maps/Posters |
24-bit |
RGB |
5000 - 10000 pixels across the longest side |
100% (1:1) |
Tiff (uncompressed) |
Standards for Scanned U.S. Geological Survey Historical Topographic Quadrangle Collection
http://pubs.usgs.gov/tm/11b03/pdf/TM11-B3.pdf.
Highlights:
majority of the maps are scanned at 600 PPI.
[24bit] is a true-color image appropriate for high quality printing
The goal is to produce a GeoPDF with a file size less than 20 MB.
High quality viewing and printing also is a goal for the TIFF. Typically the TIFF maps will range from 500 MB to 1.4 GB, again depending primarily on the original map sheet size.
National Geospatial Program Standards and Specifications: http://nationalmap.gov/standards/index.html
http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf
Highlights:
400 ppi minimum for oversize documents, alternative minimum of 300 ppi (dependent on size of smallest character in document)
Resolution of 8,000 pixels across longest side
https://library.stanford.edu/research/digitization-services/labs/digital-production-group/capture-specs/large-format-scanning
Highlights:
Recommends compressed TIFF format. This may be an alternative to explore if storage sizes become non-sustainable
Recommends using JPEG2000 as access version
DCRM(C): Descriptive Cataloging of Rare Materials (Cartographic). Rare Books and Manuscripts Section of CRL (2016) http://rbms.info/files/dcrm/dcrmc/DCRMC.pdf
Recent publication. Need to explore any possible local applications from these new guidelines.