Rice University Historical Images workflow

Last edited by Jenn Miller 10 years, 10 months ago





General Information

  • In support of Centennial Celebration efforts
  • Collection of historical Rice University photographs (people, places, and things) housed in the Woodson Research Center
  • Most digital images have already been created; they just need to be more fully described and uploaded to the IR at Rice University Historical Images and Key Documents (http://scholarship.rice.edu/handle/1911/61405)


Contact info: Amanda Focke, afocke@rice.edu


Planning notes

Process / Procedure Questions:

  • Will need to provide server permissions to folder for personnel - DONE
  • Will need to set up permissions to add items in the IR community (all collections) - Amanda to check re WRC staff, 2/2012
  • Need customized submission form - DONE
  • Draft Basic Process flow chart for this project - DONE
    • Drafting batch file creation and ingest workflows - DONE
  • Outline description needs, especially for Rights - DONE
    •  Jenn to review for any enhancements - DONE
  • Test workflow - DONE
    • working on batch processes - DONE


Planning Documentation

Digitization Plan Checklist


High level workflow and schedule

Historical Images Workflow - Schedule


Digitization Workflow 

Project Activities: Prepare Materials > Scan > QC images > Create access copy > Prepare Metadata > Place online


Material selection

1. Archivists select from pre-existing scanned images
Note: This project requires close attention to avoid entering digital duplicates.

  • Archivists must be very familiar with what is already in the IR. Before processing an item, search the IR by the words or terms likely to be used to describe it, and be confident that it is not already online.
  • Digital duplicates are images of the exact same source photograph or document. Images of the same topic (e.g. the same building) are not considered "digital duplicates" for this purpose.

2. Move the selected TIFF files to the HistoricalImages project folder on the server (masters/1-awaiting-metadata)


Scanning and image QC



  1. Image QC guidelines: follow general best practices (Quality control checks)
    1. Acceptable variances for retro images include:
      1. resolution: min. 300 ppi @ full color (24 bit) @ original size
      2. sRGB color is okay (e.g. an embedded ICC profile is not needed)
    2. For graphical images: embed the copyright statement (or any other descriptive or identifying metadata) in the original TIFF image. This can be done through a batch process. Examples:
      • Copyright : This image is in the public domain. Please contact the Woodson Research Center at woodson@rice.edu for questions regarding publication or access to high quality master files 
      • Description : Survey of Rice Institute 1st Engineering class 1917, 3/2/11, 9:47 AM,  8C, 8112x10929 (523+914), 117%, Repro 2.2 v2,  1/20 s, R78.9, G61.4, B94.0
        Rice University
  2. Assign a digital identifier number following the WRC file name convention
  3. Update Shared Shelf with the WRC#, status and basic metadata
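
The acceptable variances above can be expressed as a small check. This is only a sketch: the ScanInfo record and its field names are hypothetical, and it assumes the ppi and bit depth have already been read from the TIFF header by some other tool, with the ppi measured at original size.

```python
from dataclasses import dataclass

MIN_PPI = 300        # "min. 300 ppi" from the QC guidelines
MIN_BIT_DEPTH = 24   # "full color (24 bit)"


@dataclass
class ScanInfo:
    """Hypothetical record of values already read from a master TIFF."""
    filename: str
    ppi: int        # resolution at original size
    bit_depth: int  # total bits per pixel


def qc_problems(scan: ScanInfo) -> list[str]:
    """Return a list of QC problems; an empty list means the scan passes."""
    problems = []
    if scan.ppi < MIN_PPI:
        problems.append(f"resolution {scan.ppi} ppi is below {MIN_PPI} ppi")
    if scan.bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth {scan.bit_depth} is below {MIN_BIT_DEPTH}-bit color")
    return problems
```

A scan that returns an empty list meets the minimums; anything else would need to be reviewed before it moves on to metadata creation.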



Metadata creation

1. Masters will be located in subfolder "masters/1-awaiting-metadata"

     (note: you will likely want to make a quick jpg for upload into Shared Shelf as you write the metadata -

     this will later get overwritten by the batch conversion to jpg / metadata embedding step)

2. Create metadata in Shared Shelf and the master WRCcentraldigiobject list (internal xls document)

3. Update "Item Status" column in Shared Shelf to show:

     "1. New" if the item is in Shared Shelf but the basic metadata is not complete

     "2. Basic metadata entered" when the metadata is complete

4. Move master files to the subfolder "masters/2-ready-to-embed-metadata"

5. Export metadata from Shared Shelf and follow QC steps


Metadata into TIFF headers

1.) Use the Adobe Bridge Import script steps, which begin with creating a csv or txt file of the required metadata (see: Crosswalk for automated metadata into tiff files)

2.) Move master files:

     For images, move the master TIFF files to the subfolder "masters/3-awaitingJP2" or "masters/3-awaitingJPG" as appropriate

     For text files, move the master TIFF files to the subfolder "masters/3-awaiting-PDF"

3.) Change Shared Shelf "Item Status" to "3. Photometadata embedded"
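
Moving masters between the numbered stage folders could be scripted along the following lines. This is a minimal sketch, not part of the documented procedure: the function name move_master is hypothetical, and refusing to overwrite an existing file is an added safeguard suggested by the project's concern about digital duplicates.

```python
import shutil
from pathlib import Path


def move_master(master: Path, masters_root: Path, stage: str) -> Path:
    """Move one master file into masters_root/<stage>, never overwriting."""
    target_dir = masters_root / stage
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / master.name
    if target.exists():
        # A file with this name is already in the stage folder; stop so a
        # person can check whether it is a duplicate.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(master), str(target))
    return target
```

For example, move_master(path, Path("masters"), "3-awaiting-PDF") would place a text master in the PDF queue. The Shared Shelf "Item Status" would still need to be updated by hand.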


Create derivative files

  • Selected master files are to be batch converted on a regular basis (e.g. weekly)
  1. Types of derivative files
    1. PDFs (for text)
      1. Use files in Masters subfolder "3-awaiting-PDF", create PDF in Acrobat, with OCR, with rights metadata embedded
    2. For images larger than 5x7", create two files:
      1. JPEG2000 plus jpeg (simple) for downloadable version. Any master files requiring JP2 should be moved to the subfolder "3-awaitingJP2" with a COPY of them going to the top-level folder Z:\JP2Convert. (That COPY gets overwritten as a JP2 automatically overnight by a script.)
    3. For images 5x7" or smaller, create one file: a jpeg (simple) for downloadable version, 5" wide and 150dpi. (If desired, use IrfanView for batch conversion to simple jpg and then use the Adobe script (edited to show .jpg in the filename column) to embed metadata in them.)
  2. Save derivative file(s) to subfolder "derivatives/awaiting_upload"
  3. Update Shared Shelf "Item Status" to  "4-derivative-files-made"
  4. Move Masters  to "masters/4-awaiting-upload" subfolder 
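
The size rule for image derivatives can be sketched as a small function. Two assumptions are made here that the guidelines do not spell out: the physical size is taken as pixel dimensions divided by the scan ppi, and "larger than 5x7" is read as orientation-independent (short side over 5" or long side over 7"). The function and constant names are hypothetical.

```python
# A 5" wide jpeg at 150 dpi is 5 * 150 = 750 pixels wide.
SMALL_JPEG_WIDTH_PX = 5 * 150


def image_derivatives(width_px: int, height_px: int, ppi: int) -> list[str]:
    """Return the derivative file types to create for one image master."""
    # Physical size in inches, sorted so orientation does not matter.
    short_in, long_in = sorted((width_px / ppi, height_px / ppi))
    if short_in > 5 or long_in > 7:
        return ["jp2", "jpg"]  # zoomable JPEG2000 plus downloadable jpeg
    return ["jpg"]             # downloadable jpeg only
```

An image that comes back as ["jp2", "jpg"] belongs in "3-awaitingJP2"; one that comes back as ["jpg"] belongs in "3-awaitingJPG".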



Ingest of batch metadata & manual upload of derivatives

  1. Follow guidelines to add records to IR using metadata batch processing tool, which brings just the metadata into the IR
  2. Add bitstreams (jpg, jp2, pdf) manually to each item in IR. Describe each as such:
    1. *.jp2 = Zoom and pan version
    2. *.jpg = Download image
    3. *.pdf = Use brief description of contents (e.g. PDF of program)
  3. Upload derivative files and move the server copy of the derivatives to "derivatives/final". Once they are confirmed to be safely in the IR, they can be deleted from the server.
  4. Update Shared Shelf with "Item Status" of "5. Basic item uploaded to IR"
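
The bitstream descriptions listed above follow directly from the file extension, so they can be looked up rather than typed each time. A minimal sketch, with hypothetical function and parameter names; the PDF description still has to be supplied by the archivist.

```python
from pathlib import Path
from typing import Optional

FIXED_DESCRIPTIONS = {
    ".jp2": "Zoom and pan version",
    ".jpg": "Download image",
}


def bitstream_description(filename: str, pdf_contents: Optional[str] = None) -> str:
    """Return the description to enter for one bitstream in the IR."""
    suffix = Path(filename).suffix.lower()
    if suffix in FIXED_DESCRIPTIONS:
        return FIXED_DESCRIPTIONS[suffix]
    if suffix == ".pdf":
        if not pdf_contents:
            raise ValueError("a PDF needs a brief description of its contents")
        return pdf_contents
    raise ValueError(f"unexpected bitstream type: {filename}")
```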


Ingest of Master TIFFs

1. If Masters can be uploaded at the same time as the derivatives (confirm space allocation with programmers), upload the masters to the IR and move the server copy of the masters to "masters/5-uploaded". They will stay there until DSS staff confirm that they are safely backed up, at which point they are deleted from the project server. If Masters cannot be uploaded at the time of the item ingest, keep the files in "masters/4-awaiting-upload" until they can be ingested.


After the basic items are ingested

Once the basic items are in the IR, archivists include a summary of the monthly activity and Cataloging can begin to work on enhancements such as subject headings and name authority, as outlined in the schedule.

When Cataloging provides spreadsheets to WRC on enhancements for records:

1.) Check the system ID numbers and use the index file to update them if necessary (they should all begin with a 5 - this will likely only be necessary through 2/2013). This requires a VLOOKUP formula in a sheet where Column A = identifier.digital, Column B = blank (ready for the formula to fetch the current ID), and Column C = ID (as provided in the enhancements spreadsheet). In cell B2, insert a VLOOKUP formula: the first argument is A2, the second is the full data range of the index file (toggle to the index file and select all the data in it), the third is 2, and the fourth is FALSE. Copy the formula down the column to get the remaining values.

2.) Break out the heading "Rice University--History" into dc.subject.lcsh[en_US] and the remaining terms into dc.subject.lcsh[eng]. The language tags are internal to DSpace (not seen by the public) and are only meaningful for internal purposes, such as easily sorting the records by whether or not they have been enhanced with subject headings.
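
For anyone who prefers a script to the spreadsheet, the two steps above could be sketched as follows. This is an illustration under stated assumptions, not the documented procedure: the function names are hypothetical, the index file is assumed to have been read into (identifier.digital, current ID) pairs, and the enhancement rows into (identifier.digital, provided ID) pairs. The first function mirrors an exact-match VLOOKUP (fourth argument FALSE), returning None where the spreadsheet would show #N/A.

```python
RICE_HISTORY = "Rice University--History"


def fill_current_ids(index_rows, enhancement_rows):
    """Exact-match lookup of the current ID, like VLOOKUP(A2, index, 2, FALSE)."""
    lookup = {}
    for digital_id, current_id in index_rows:
        # VLOOKUP returns the first match, so keep the first ID seen.
        lookup.setdefault(digital_id, current_id)
    # Output columns: A = identifier.digital, B = current ID, C = provided ID.
    return [
        (digital_id, lookup.get(digital_id), provided_id)
        for digital_id, provided_id in enhancement_rows
    ]


def split_subject_headings(headings):
    """Separate "Rice University--History" from the remaining headings."""
    return {
        "dc.subject.lcsh[en_US]": [h for h in headings if h == RICE_HISTORY],
        "dc.subject.lcsh[eng]": [h for h in headings if h != RICE_HISTORY],
    }
```

Any row where column B comes back as None, or does not begin with a 5, would need to be checked by hand before the enhancements are loaded.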


Supplemental steps

  • Map to other collections on case by case basis - WRC
  • Update WRC's digi-object list with the DSpace handle - WRC




List of digitization guidelines for this project


