

batch process metadata

Page history: last edited by Amanda Focke 10 years, 6 months ago

These guidelines were designed specifically for the Rice University Historical Images collection.


Workflow steps: Batch process metadata, manually add files


Download file > Filter data > QC > Prepare data > Save to CSV > Import & review > Post > Add files


  1. Download Shared Shelf spreadsheet and open in Excel

  2. Filter: Determine which items are ready for ingest by applying a filter to all records and selecting only those with "Item Status" = "2. Basic Metadata Entered"

  3. Transfer filtered metadata to a new sheet

    • select filtered rows of data

    • Find & Select > Go To Special > select option: Visible cells only

    • copy selection

    • create new spreadsheet

    • paste

    • save new file (e.g. batch01Nov2011.xls)
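The filter and transfer steps above can also be sketched in Python for anyone working from a CSV export rather than in Excel. This is a minimal sketch, not part of the documented workflow: it assumes the Shared Shelf data has been exported to CSV with a column named "Item Status", and the sample rows (identifiers, titles, and the "1. New" status) are made up for illustration.

```python
import csv
import io

# Status value taken from the filter step; the column name is an assumption.
READY_STATUS = "2. Basic Metadata Entered"


def filter_ready(rows, status_column="Item Status"):
    """Return only the rows whose status marks them as ready for ingest."""
    return [
        row for row in rows
        if (row.get(status_column) or "").strip() == READY_STATUS
    ]


# Hypothetical sample data standing in for the downloaded spreadsheet.
sample = io.StringIO(
    "WRC#,Title,Item Status\n"
    "wrc00001,Example title A,2. Basic Metadata Entered\n"
    "wrc00002,Example title B,1. New\n"
    "wrc00003,Example title C,2. Basic Metadata Entered\n"
)

ready = filter_ready(csv.DictReader(sample))
ready_ids = [row["WRC#"] for row in ready]
print(ready_ids)
```

Only the rows with the matching status are kept, which mirrors copying the visible cells to a new sheet.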


  4. Perform quality control checks:

    • Check for duplicate items (WRC#)

      1. Run pivot table test: the count should be one per WRC# (confirms there are no duplicate records in the spreadsheet); see below for steps on creating pivot tables

      2. Search DSpace by WRC# (to confirm items have not already been ingested in any collection)

      3. Double-check the internal WRC digi-object spreadsheet, searching by keywords, to ensure the item does not also exist under a different WRC#


    • Run spell check

    • Review capitalization and punctuation (e.g. DCMI type, format.medium, no periods after titles, etc.)

    • No abbreviations

    • Multiple values separated by a double pipe (||)

    • Check that metadata is populated in required fields (e.g. rights, source)
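Two of the checks above (required fields populated, multiple values separated by ||) lend themselves to a small script. The following is a rough sketch only: the header names dc.rights and dc.source are assumptions standing in for the real element headers, and the sample rows are invented.

```python
# Assumed header names for the required fields; adjust to the real headers.
REQUIRED_FIELDS = ("dc.rights", "dc.source")


def find_missing(rows, required=REQUIRED_FIELDS):
    """Return (row_number, field) pairs for every empty required field."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for field in required:
            if not (row.get(field) or "").strip():
                problems.append((number, field))
    return problems


def split_values(cell):
    """Split a multi-value cell on the || separator and trim each value."""
    return [value.strip() for value in cell.split("||") if value.strip()]


# Hypothetical rows: the first one is missing its source.
rows = [
    {"dc.title": "Example A", "dc.rights": "Example rights statement", "dc.source": ""},
    {"dc.title": "Example B", "dc.rights": "Example rights statement", "dc.source": "Example source"},
]

missing = find_missing(rows)
subjects = split_values("People || Events")
print(missing)
print(subjects)
```

Spell check, capitalization, and abbreviation review still need a human eye; the script only flags the mechanical problems.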



Since data may be changed outside of Shared Shelf, be aware that once an item is ingested, DSpace becomes the primary source (i.e. Shared Shelf may not reflect all changes).

  5. Use this clean data for photo metadata embedding, then proceed with creating derivatives

  6. Prepare data for ingest:

    • delete unnecessary columns

    • update metadata headers (copy from spreadsheet)

    • insert columns for ID and collection

    • Assign collection # per sub-collection and visually confirm (see table below; copy to spreadsheet)

      • tip: sort by dc.subject and visually compare collection id to subject term

    • Add boilerplate metadata to all records

      • dc.publisher : Rice University

      • dc.subject.lcsh : Rice University -- History (broad term; narrower or more specific terms may be added later)

    • check that the total number of items is no more than 20, the maximum number of records that may be edited through the web user interface (XMLUI)
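The boilerplate and batch-size steps above could be sketched as follows. This is an illustrative sketch under assumptions: rows are represented as dictionaries keyed by metadata header, and the function name add_boilerplate is hypothetical. The boilerplate values and the limit of 20 come from the list above.

```python
# Boilerplate values and batch limit as given in the preparation steps.
BOILERPLATE = {
    "dc.publisher": "Rice University",
    "dc.subject.lcsh": "Rice University -- History",
}
MAX_BATCH_SIZE = 20


def add_boilerplate(rows):
    """Return copies of the rows with boilerplate metadata added.

    Raises ValueError if the batch exceeds the web interface limit.
    """
    if len(rows) > MAX_BATCH_SIZE:
        raise ValueError(
            f"Batch has {len(rows)} items; the maximum is {MAX_BATCH_SIZE}"
        )
    return [{**row, **BOILERPLATE} for row in rows]


# Hypothetical single-row batch.
prepared = add_boilerplate([{"dc.title": "Example title"}])
print(prepared[0]["dc.publisher"])
```

Checking the size before adding anything means an oversized batch is caught before any import is attempted.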

  7. Save Excel file > Open the Excel file in OpenOffice Calc > Save as CSV encoded with UTF-8

  8. Go to DSpace and run batch import, then visually review changes (e.g. collection name matches subject, diacritics, etc.). If OK, post changes.

  9. Attach image files per new record. Tip: Search by WRC# (Edit this item > edit bitstreams > add file)

  10. Update status in Shared Shelf
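The "Save as CSV encoded with UTF-8" step matters because diacritics can be corrupted by other encodings. As a hedged alternative to the OpenOffice Calc route, the sketch below writes rows to UTF-8 CSV bytes with Python's csv module. The function name to_csv_bytes and the sample title are hypothetical, and the exact column layout the import expects is not shown here.

```python
import csv
import io


def to_csv_bytes(rows, fieldnames):
    """Serialize rows to CSV text and encode the result as UTF-8 bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical row containing a diacritic to exercise the encoding.
data = to_csv_bytes([{"dc.title": "Café exterior"}], ["dc.title"])
print(data.decode("utf-8"))
```

Decoding the bytes back as UTF-8 and spot-checking a title with diacritics is a quick way to confirm the file will display correctly during the visual review.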



Note: For metadata work, copy lists directly from the Excel spreadsheet located in the fonlibstor project folder. The tables below are provided for easy reference only.


Element Headers



















Collection Ids




Key documents










World Tour: postcards


Boilerplate metadata

  • dc.publisher : Rice University

  • dc.subject.lcsh : Rice University -- History (broad term; narrower or more specific terms may be added later)




  • Copyright, access and use statement embedded in object files

  • Verify no duplicate items

  • Confirm metadata is complete and follows input rules (abbreviations, punctuation, etc)

  • Confirm items are ingested per sub-collection (e.g. archival sub folder : People, Places, Events, Sports)


Steps for using pivot table tool

Create pivot table

  • Select Cell A1

  • Go to INSERT menu

  • From the Tables section, press Pivot Table button

  • The “Create PivotTable” wizard will appear (it should automatically recognize the range of data)

  • Press Ok. This will create a new tab.

  • (optional) Label tab : Table


Define layout for pivot table

  • On the far right, Pivot table options will appear

  • From the “PivotTable Field List” (at top), drag the dc.identifier.digital field to the Row Labels area (at bottom left)

  • Drag the dc.identifier.digital field (from top) to the Values area (at bottom right)

  • Values should automatically set to count function

  • These actions will update the table on the far left, so that the pivot table now displays the number of occurrences (count) of each dc.identifier.digital element (or WRC#)



  • All calculated values should equal 1. Any value greater than 1 indicates a duplicate WRC#; remove the duplicates from the spreadsheet.
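The pivot table count above amounts to counting how often each identifier appears. For reference, the same check can be sketched in Python with collections.Counter. This is a substitute technique rather than the Excel procedure itself, and the identifiers in the example are invented.

```python
from collections import Counter


def find_duplicates(identifiers):
    """Return a dict of identifier -> count for identifiers seen more than once."""
    counts = Counter(identifier.strip() for identifier in identifiers)
    return {identifier: n for identifier, n in counts.items() if n > 1}


# Hypothetical dc.identifier.digital values; wrc00001 appears twice.
duplicates = find_duplicates(["wrc00001", "wrc00002", "wrc00001", "wrc00003"])
print(duplicates)
```

An empty result corresponds to a pivot table in which every count equals 1.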


