We recommend opening the CSV file in Excel and using filters and pivot tables to summarize or analyze the data.
Command
The exiftool -csv command provides an exhaustive list of data. The -r option recurses into subfolders, and the output is redirected to output.csv.
Run: c:\>exiftool -csv -r group > output.csv
Base File Info
This info is typically used for summarizing. It can also be helpful for batch processing (renaming, moving, or calculating storage needs).
Field | Description
SourceFile | Full directory path including file name
Directory | Subfolders. Usually equivalent to digital object ID
FileName | File name with extension
FileSize | Includes size notation: KB, MB, etc.
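As an alternative to an Excel pivot table, a per-folder file count could be sketched in Python. This is a minimal sketch, not part of the original workflow: the function name and the sample rows are made up for illustration, and it assumes the CSV has a Directory column as described above.

```python
import csv
import io
from collections import Counter

def count_files_per_directory(csv_file):
    """Count rows per Directory value, similar to a simple pivot table."""
    reader = csv.DictReader(csv_file)
    return Counter((row.get("Directory") or "") for row in reader)

# Made-up sample rows standing in for a real output.csv
sample = io.StringIO(
    "SourceFile,Directory,FileName,FileSize\n"
    "group/obj001/0001.tif,group/obj001,0001.tif,25 MB\n"
    "group/obj001/0002.tif,group/obj001,0002.tif,25 MB\n"
    "group/obj002/0001.tif,group/obj002,0001.tif,31 MB\n"
)

counts = count_files_per_directory(sample)
for directory, file_count in sorted(counts.items()):
    print(directory, file_count)
```

With a real export, the sample could be replaced by `open("output.csv", newline="", encoding="utf-8")`.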
TIPS:
- If using SourceFile data to run a batch rename job, switch the forward slashes to backslashes. Excel's Find and Replace (Replace All) does this in one pass.
- To calculate size, parse the numbers out of the text in the FileSize column using Excel's Text to Columns function.
- Filenames can be validated using the =LEN() function and Excel's Text to Columns function.
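The same three tips can be sketched in Python for anyone working outside Excel. The function names are hypothetical, the expected filename length is a project-specific assumption, and the unit multiplier (`base`) is also an assumption: whether ExifTool's size units are 1000-based or 1024-based depends on the version and settings in use, so check a known file first.

```python
import re

# Power of the multiplier for each unit label (labels compared in lower case)
UNIT_POWERS = {
    "byte": 0, "bytes": 0, "b": 0,
    "kb": 1, "kib": 1,
    "mb": 2, "mib": 2,
    "gb": 3, "gib": 3,
}

def parse_file_size(text, base=1000):
    """Turn a FileSize string such as '25 MB' into a number of bytes.

    base is an assumption: pass 1024 if your ExifTool output is binary-based.
    """
    match = re.fullmatch(r"\s*([\d.]+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized FileSize value: {text!r}")
    number, unit = match.groups()
    unit = unit.lower()
    if unit not in UNIT_POWERS:
        raise ValueError(f"unrecognized unit in FileSize value: {text!r}")
    return float(number) * base ** UNIT_POWERS[unit]

def to_windows_path(source_file):
    """Switch forward slashes to backslashes for a Windows batch job."""
    return source_file.replace("/", "\\")

def has_expected_length(file_name, expected_length):
    """Mirror the =LEN() check: does the filename have the expected length?"""
    return len(file_name) == expected_length

print(parse_file_size("25 MB"))
print(to_windows_path("group/obj001/0001.tif"))
print(has_expected_length("0001.tif", 8))
```

Summing `parse_file_size` over the FileSize column gives a rough storage estimate; because ExifTool rounds the displayed size, the total is approximate.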
Technical Data
This data is typically used to validate digitization specs for QC purposes or for preliminary assessment (e.g. when first reviewing a batch of files from a donor, scholar, or vendor).
This data may also be repurposed to fill out metadata about the creation process of the digital files (e.g. dc.date.digital, dc.digitization.specifications).
The data available will vary greatly depending on the device used to capture the image and on the file type (TIFF vs. PDF). Typically, digital cameras provide more technical data than flatbed scanners. Some data will not be available unless it was specifically enabled during the scanning process (e.g. GIS data, embedded color profile, etc.). This may require a specific setting to be "turned on" in the equipment setup.
Field | Description
ColorSpaceData | Confirm this is listed as RGB and not CMYK. Color spaces intended for printing are sometimes incompatible with downstream functions like the Media Filter job, which auto-creates thumbnails.
Compression | For master files this value should always be uncompressed (not LZW). Also, if the value is LZW, the JPEG2000 script will not be able to auto-create this derivative type.
CreateDate | May be used for dc.date.digital. Convert to YYYY format.
ICCProfileName | Confirm this matches project specs. Items scanned on flatbed scanners may not have this information.
ImageHeight |
ImageSize | Use to confirm all TIFFs within the same object are the same size.
ImageSourceData | This should be blank, which confirms the TIFFs are flat, with no layers.
ImageWidth |
MIMEType | All files should have a MIMEType. If there is none, that may indicate a corrupt file, so investigate.
Orientation |
XResolution | Confirm all files were scanned at the PPI given in the project specs.
YResolution |
PageNumber | If the file is a PDF, the page count will be supplied here. This information can be used to double-check that PDFs contain all the TIFFs associated with an object, and to populate the dc.format.extent field in the object metadata record.
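The checks in the table above could be sketched in Python roughly as follows. This is an illustrative sketch under several assumptions: the function names are hypothetical, all rows are assumed to be master TIFFs, the expected PPI is a project-specific value, and the literal values "Uncompressed" and "RGB" are assumed to match what your ExifTool version writes to the CSV. The sample rows are made up.

```python
import csv
import io

def matches_ppi(value, expected_ppi):
    """True if a resolution cell parses as a number equal to expected_ppi."""
    try:
        return float(value) == float(expected_ppi)
    except ValueError:
        return False

def year_from_create_date(value):
    """Reduce a CreateDate such as '2015:03:02 10:15:30' to 'YYYY'."""
    year = value.strip()[:4]
    return year if len(year) == 4 and year.isdigit() else ""

def qc_technical_fields(rows, expected_ppi):
    """Return (file or folder, problem) pairs for rows that fail a check."""
    problems = []
    sizes_by_directory = {}
    for row in rows:
        source = row.get("SourceFile") or ""
        if not (row.get("MIMEType") or "").strip():
            problems.append((source, "no MIMEType; file may be corrupt"))
        compression = (row.get("Compression") or "").strip()
        if compression and compression != "Uncompressed":  # assumed literal
            problems.append((source, f"Compression is {compression}"))
        color = (row.get("ColorSpaceData") or "").strip()
        if color and color != "RGB":  # assumed literal
            problems.append((source, f"ColorSpaceData is {color}"))
        if (row.get("ImageSourceData") or "").strip():
            problems.append((source, "ImageSourceData present; may have layers"))
        for field in ("XResolution", "YResolution"):
            value = (row.get(field) or "").strip()
            if value and not matches_ppi(value, expected_ppi):
                problems.append((source, f"{field} is {value}"))
        size = (row.get("ImageSize") or "").strip()
        if size:
            sizes_by_directory.setdefault(row.get("Directory") or "", set()).add(size)
    # Files within one object (folder) should share a single ImageSize
    for directory, sizes in sizes_by_directory.items():
        if len(sizes) > 1:
            problems.append((directory, f"mixed ImageSize values: {sorted(sizes)}"))
    return problems

# Made-up sample rows standing in for a real output.csv
sample = io.StringIO(
    "SourceFile,Directory,MIMEType,Compression,ColorSpaceData,ImageSize,"
    "ImageSourceData,XResolution,YResolution,CreateDate\n"
    "group/obj001/0001.tif,group/obj001,image/tiff,Uncompressed,RGB ,4000x6000,,400,400,2015:03:02 10:15:30\n"
    "group/obj001/0002.tif,group/obj001,image/tiff,LZW,RGB ,4000x5990,,300,300,2015:03:02 10:16:10\n"
)
problems = qc_technical_fields(csv.DictReader(sample), expected_ppi=400)
for item, problem in problems:
    print(item, "->", problem)
```

Blank cells are skipped rather than flagged (except for MIMEType), since some fields are legitimately absent depending on the capture device.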
Descriptive Data
Another important step is to confirm that some basic descriptive data is embedded within the file itself. Multiple fields may draw on the same data source, and the field labels may differ between software (e.g. Adobe Bridge, Photoshop, etc.). Just make sure all data prescribed for a particular project is populated for each image.
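A check that the prescribed fields are populated for each image might look like the sketch below. The function name is hypothetical, and the list of required fields is an assumption for illustration only: substitute the column names your project actually prescribes, as they appear in the CSV header.

```python
def missing_descriptive_fields(rows, required_fields):
    """Map each SourceFile to the required fields that are blank for it."""
    missing = {}
    for row in rows:
        blanks = [
            field for field in required_fields
            if not (row.get(field) or "").strip()
        ]
        if blanks:
            missing[row.get("SourceFile") or ""] = blanks
    return missing

# Hypothetical project requirements and made-up rows
required = ["Title", "Creator", "Source"]
rows = [
    {"SourceFile": "group/obj001/0001.tif", "Title": "Letter",
     "Creator": "Unknown", "Source": "Box 1, Folder 2"},
    {"SourceFile": "group/obj001/0002.tif", "Title": "Letter",
     "Creator": "Unknown", "Source": ""},
]

missing = missing_descriptive_fields(rows, required)
for source_file, fields in missing.items():
    print(source_file, "is missing:", ", ".join(fields))
```

In practice the rows would come from `csv.DictReader` over output.csv rather than a hand-built list.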
NB: ExifTool's output truncates the data in the Source field, so the full text is not shown. If you open the file in image software or a viewer (e.g. Adobe Bridge or Photoshop), the full text is visible.
Additional guides