Embedded Image and PDF Metadata



 

Purpose

Embedding metadata in photo images supports long-term access to and re-use of those images.

Images can easily become separated from the online repository and from the carefully crafted metadata that describes them. Embedding a textual description in the image file helps ensure that any disassociated file retains enough basic information for a user to identify the image. We consider the basic metadata a user needs to identify or re-use an image to be the title or caption, copyright, and contact (the source of the physical photograph). Fondren Library will continue to monitor technical developments in digital photography and evaluate how they may influence these guidelines.

 

 

Preferences

 

 

 

 

 

Method to embed boilerplate metadata to images or PDFs in Bridge

Open Bridge and navigate to the directory containing the files that need embedded metadata.

Open the metadata pane for IPTC metadata. (If it is not showing, open the Window menu and select Metadata Panel.)

Select all relevant files in your directory.

Enter the boilerplate metadata in the appropriate IPTC metadata fields, then click outside the field and check that Bridge is processing the changes.

 

Tools to batch import metadata to images

 

 

 

 

 

Map DC to IPTC-core

This section maps the Dublin Core metadata for each item to the equivalent IPTC-Core fields, which is how the metadata will be embedded in our master TIFF files (whether the TIFFs are of images or of text). Photo metadata will reside in the master TIFF files and then be carried over to derivative access files such as JPG, JP2, and PDF. The IPTC Title field is used to store the digital identifier (that is, the filename) of the image for internal administrative purposes.

 

More detailed information using the Adobe script: Photometadata import script

Using the QC'd metadata XLS file, create a new XLS file with the following fields, making the filename (including extension) the first column, then save it as CSV or TXT:

 

Dublin Core | IPTC-Core | Note
dc.identifier.digital | Title |
dc.title | Description |
dc.rights | Copyright notice |
dc.rights.uri | Rights Usage Terms | Copyright info URL (*). "Copyright info URL" is an IPTC-legacy field and is the field actually used in our batch processes. The "Rights Usage Terms" field is intended for the full text of rights instructions; we elect to reference an appropriate Creative Commons license instead.
na | Copyright status (*) | This is a pull-down menu in Bridge CS5. Values are "Copyrighted", "Public Domain", or "unassigned". For the Adobe Bridge script we use, enter "TRUE" for Copyrighted, "FALSE" for Public Domain, and "0" for unassigned.
na | Source | Woodson Research Center, Rice University, Houston, Texas
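The mapping in the table can be expressed as a small lookup when preparing the import file. The following is a minimal Python sketch, not the Photometadata import script itself. It assumes the QC'd spreadsheet has been exported with Dublin Core element names as column headers, and that the import file uses the IPTC field names from the table as headers. The function names, the `filename` column name, and the exact header spellings are assumptions for illustration; check them against the headers the import script actually expects.

```python
import csv
import io

# Dublin Core element -> IPTC-Core field, following the table above.
# dc.rights.uri is mapped to "Copyright info URL", the field the table's
# note says is used in the batch process.
DC_TO_IPTC = {
    "dc.identifier.digital": "Title",
    "dc.title": "Description",
    "dc.rights": "Copyright notice",
    "dc.rights.uri": "Copyright info URL",
}

# Fixed value with no Dublin Core equivalent ("na" in the table).
SOURCE = "Woodson Research Center, Rice University, Houston, Texas"


def build_import_rows(dc_rows):
    """Convert Dublin Core rows (dicts) into rows keyed by IPTC field name.

    Assumes each input row has a 'filename' key (a hypothetical column name)
    holding the file name including its extension.
    """
    out = []
    for row in dc_rows:
        new_row = {"filename": row["filename"]}  # filename is the first column
        for dc_name, iptc_name in DC_TO_IPTC.items():
            new_row[iptc_name] = row.get(dc_name, "")
        new_row["Source"] = SOURCE
        out.append(new_row)
    return out


def rows_to_csv(rows):
    """Serialize the converted rows as CSV text, filename column first."""
    fieldnames = ["filename", *DC_TO_IPTC.values(), "Source"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

The Copyright status column is left out of this sketch; it could be added as one more fixed or per-row column using the TRUE/FALSE/0 values described in the table.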

 

Other Notes:

 

(*) Copyright info URL and Copyright Status are not, strictly speaking, IPTC-Core fields. They come from Adobe's XMP Rights Management namespace (xmpRights), which Photoshop and Bridge expose. The data supplied in these fields supports the Copyright Notice field.

 

(**) A method for batch editing "Copyright info URL" is to use the exiftool command. Example:

exiftool -WebStatement="http://creativecommons.org/licenses/by/3.0/" -overwrite_original -ext tif -r foldername

 

Copyright status scenarios:

  1. If Copyright Status = Public Domain  THEN COPYRIGHT NOTICE = "This material is in the public domain and may be freely used, with attribution. This work is licensed under a Creative Commons Attribution 3.0 Unported License."  

  2. If Copyright Status = Unknown (orphan works) THEN COPYRIGHT NOTICE = "The copyright holder for this material is either unknown or unable to be found. This material is being made available by Rice University for non-profit educational use under the Fair Use Section of US Copyright Law. This work is licensed under a Creative Commons Attribution 3.0 Unported License."  

  3. If Copyright Status = Copyrighted THEN COPYRIGHT NOTICE = "Rights to this material belong to Rice University. This digital version is licensed under a Creative Commons Attribution 3.0 Unported license." 
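The three scenarios above amount to a lookup from copyright status to notice text. The following Python sketch shows one way to encode them so the notice is never retyped by hand. The function and variable names are hypothetical. Treating the "Unknown" scenario as the script's "unassigned" value ("0") is an assumption, since the guidelines do not state that the two are the same.

```python
# Notice text for each scenario, copied from the list above.
COPYRIGHT_NOTICES = {
    "Public Domain": (
        "This material is in the public domain and may be freely used, "
        "with attribution. This work is licensed under a Creative Commons "
        "Attribution 3.0 Unported License."
    ),
    "Unknown": (
        "The copyright holder for this material is either unknown or unable "
        "to be found. This material is being made available by Rice "
        "University for non-profit educational use under the Fair Use "
        "Section of US Copyright Law. This work is licensed under a Creative "
        "Commons Attribution 3.0 Unported License."
    ),
    "Copyrighted": (
        "Rights to this material belong to Rice University. This digital "
        "version is licensed under a Creative Commons Attribution 3.0 "
        "Unported license."
    ),
}

# Values entered in the Copyright status column for the Bridge script.
# Assumption: "Unknown" corresponds to "unassigned" ("0").
SCRIPT_STATUS_VALUES = {
    "Copyrighted": "TRUE",
    "Public Domain": "FALSE",
    "Unknown": "0",
}


def copyright_fields(status):
    """Return (notice, script_value) for one of the three scenarios.

    Raises ValueError for any other status so that typos are caught early.
    """
    if status not in COPYRIGHT_NOTICES:
        raise ValueError(f"Unrecognized copyright status: {status!r}")
    return COPYRIGHT_NOTICES[status], SCRIPT_STATUS_VALUES[status]
```

For example, `copyright_fields("Copyrighted")` returns the Rice University notice together with the script value "TRUE".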

 

Embedding PDF metadata for text items

 

Manually, using the GUI

1. Create PDF in Adobe Acrobat

2. Open the File menu and choose Properties

3. On the Description tab, click "Additional Metadata"

4. Enter metadata as follows:

Document title: dc.identifier
Description: dc.title
Copyright Status: (choose from pull-down menu)
Copyright Notice: dc.rights
Copyright Info URL: dc.rights.uri

5. Click OK and save as a regular PDF, not PDF/A. PDF/A has significant benefits, especially for born-digital text documents (embedded fonts, for example), but it prevents users from annotating the PDF and restricts other uses in ways that might inhibit their work. We mainly create PDFs from scanned page images, such as handwritten or typed documents.

 

Batch process using command line

Set one or more metadata values per file directly from the command line. Use a batch (*.bat) file to update multiple PDF files at a time.

 

exiftool -Title="wrc00325" -Description="This is my awesome title" -Subject="Dogs at play" output.pdf

 

If you do not use the -overwrite_original switch, the command creates a backup of the original file, so a duplicate will exist in the folder containing the updated PDF. In the example above, a file named output.pdf_original will be created. (1)
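One way to produce a batch file for many PDFs is to generate one exiftool line per file from a spreadsheet export. The Python sketch below is an illustration under stated assumptions, not the workflow behind the template spreadsheet. The column names (`filename`, `dc.identifier`, `dc.title`) and function names are hypothetical. It only builds the command text and does not run exiftool.

```python
import csv
import io


def quote(value):
    """Wrap a value in double quotes for use in a Windows batch file.

    Minimal handling only: % is doubled so cmd.exe does not expand it,
    and values containing a double quote are rejected rather than escaped.
    """
    if '"' in value:
        raise ValueError(f"Value contains a double quote: {value!r}")
    return '"' + value.replace("%", "%%") + '"'


def build_bat_lines(csv_text, overwrite=False):
    """Return one exiftool command line per row of the CSV text.

    Title receives the identifier and Description receives the title,
    matching the field mapping used for PDFs in this guide.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        parts = [
            "exiftool",
            "-Title=" + quote(row["dc.identifier"]),
            "-Description=" + quote(row["dc.title"]),
        ]
        if overwrite:
            # Without this switch exiftool keeps a *_original backup copy.
            parts.append("-overwrite_original")
        parts.append(quote(row["filename"]))
        lines.append(" ".join(parts))
    return lines
```

Writing the returned lines to a .bat file, one per line, gives a script that can be run in the folder holding the PDFs. Leaving `overwrite` at its default keeps the backup copies until the results have been checked.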

 

embed-PDFmetadata-template.xlsx

 

References

 

 “The IPTC Photo Metadata Standard - IPTC Core & Extension.” International Press Telecommunications Council IPTC, July 2010. <http://www.iptc.org/cms/site/index.html?channel=CH0099> [PDF downloadable; includes the IPTC Core 1.1 and the IPTC Extension 1.1 specifications.]

 

David Riecks. “The IPTC-NAA Standards.” Controlled Vocabulary http://www.controlledvocabulary.com/imagedatabases/iptc_naa.html

 

Jeffrey’s Exif Viewer [online Tool]  http://exif.regex.info/exif.cgi

 

“Using Jeffrey’s Exif Viewer to Expose Exif, IPTC, and XMP Photo Metadata.” http://www.controlledvocabulary.com/imagedatabases/exiftoolonline.html

 

Photometadata.org. “Guide to Photo Metadata Fields” http://www.photometadata.org/META-Resources-Field-Guide-to-Metadata.