batch process metadata

Last edited by Amanda Focke 7 years, 5 months ago


These guidelines were designed specifically for the Rice University Historical Images collection.

 

Workflow steps: Batch process metadata, manually add files

 

Download file > Filter data > QC > Prepare data > Save to CSV > Import & review > Post > Add files

 

  1. Download Shared Shelf spreadsheet and open in Excel

  2. Filter: Determine which items are ready for ingest by applying a filter to all records to select only those with "Item Status" = "2. Basic Metadata Entered"

  3. Transfer the filtered metadata to a new sheet:

    • select filtered rows of data

    • Find & Select > Go To Special > select the option "Visible cells only"

    • copy selection

    • create new spreadsheet

    • paste

    • save new file (e.g. batch01Nov2011.xls)
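For anyone who prefers to script the filter-and-copy steps rather than do them by hand in Excel, the selection could be sketched roughly as follows. This is only an illustration: it assumes the Shared Shelf export has been saved as CSV, that the identifier column is named "WRC#", and the sample rows (including the status "1. In Progress") are hypothetical.

```python
import csv
import io

# Status value taken from the filter step above.
READY_STATUS = "2. Basic Metadata Entered"


def select_ready(rows, status_column="Item Status", ready=READY_STATUS):
    """Return only the rows whose status column equals the ready value."""
    return [
        row for row in rows
        if (row.get(status_column) or "").strip() == ready
    ]


# Hypothetical sample data standing in for the exported spreadsheet.
sample = io.StringIO(
    "WRC#,Item Status\n"
    "wrc00001,2. Basic Metadata Entered\n"
    "wrc00002,1. In Progress\n"
)
ready_rows = select_ready(csv.DictReader(sample))
print([row["WRC#"] for row in ready_rows])
```

The resulting rows correspond to the "visible cells only" selection that is pasted into the new batch file.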

       

  4. Perform quality control checks:

    • Check for duplicate items (WRC#)

      1. Run the pivot table test: the count should be one per WRC# (confirms there are no duplicate records in the spreadsheet). See below for steps on creating pivot tables

      2. Search DSpace by WRC# (to confirm the items have not already been ingested in any collection)

      3. Double-check the internal WRC spreadsheet of digi-objects, searching by keyword, to ensure the item does not also exist under a different WRC#

         

    • Run spell check

    • Review capitalization and punctuation (e.g. DCMI type, format.medium, no periods after titles)

    • No abbreviations

    • Multiple values must be separated by a double pipe: ||

    • Check that metadata is populated in required fields (e.g. rights, source)
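Some of the checks above are mechanical enough to automate before the visual review. The sketch below is a hedged illustration, not part of the original workflow: it assumes that "rights" and "source" correspond to the dc.rights and dc.source.collection columns, and the sample record is hypothetical. Spell check, capitalization, and abbreviations still need a human eye.

```python
# Assumed mapping of the required "rights, source" fields to column names.
REQUIRED_FIELDS = ("dc.rights", "dc.source.collection")


def qc_issues(row):
    """Return a list of human-readable problems found in one metadata row."""
    issues = []
    # Required fields must not be blank.
    for field in REQUIRED_FIELDS:
        if not (row.get(field) or "").strip():
            issues.append(f"missing {field}")
    # Titles should not end with a period.
    if (row.get("dc.title") or "").strip().endswith("."):
        issues.append("title ends with a period")
    # A "||" separator should always sit between two non-empty values.
    for field, value in row.items():
        if value and "||" in value:
            if any(not part.strip() for part in value.split("||")):
                issues.append(f"empty value around || in {field}")
    return issues


# Hypothetical record with three deliberate problems.
record = {
    "dc.title": "Campus view.",
    "dc.rights": "",
    "dc.source.collection": "Sample collection",
    "dc.subject": "Buildings||",
}
print(qc_issues(record))
```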

       

Note:

Because data may be changed outside of Shared Shelf, be aware that once an item is ingested, DSpace becomes the primary source (i.e. Shared Shelf may not reflect all changes).

  5. Use this clean data for photo metadata embedding, then proceed with creating derivatives

  6. Prepare data for ingest:

    • delete unnecessary columns

    • update metadata headers (copy from the spreadsheet)

    • insert columns for ID and collection

    • assign collection # per sub-collection and visually confirm (see the Collection Ids table below; copy to spreadsheet)

      • tip: sort by dc.subject and visually compare collection id to subject term

    • Add boilerplate metadata to all records

      • dc.publisher : Rice University

      • dc.subject.lcsh : Rice University -- History (broad term. Narrower or more specific terms may be added later)

    • check that the total number of items is no more than 20, the maximum number of records that may be edited through the web user interface (XMLUI)
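The preparation steps above can be sketched in code as well. The collection ids and boilerplate values are taken from the tables on this page. The "subcollection" column is a hypothetical helper used only to look up the collection id, and the "+" placed in the id column follows the DSpace batch metadata convention for new items as best understood here, so verify it against your DSpace version.

```python
# Collection ids copied from the "Collection Ids" table on this page.
COLLECTION_IDS = {
    "Events": "1911/61408",
    "Key documents": "1911/62306",
    "People": "1911/61406",
    "Places": "1911/61407",
    "Sports": "1911/61409",
    "Various": "1911/61410",
    "World Tour: postcards": "1911/70176",
}

# Boilerplate metadata added to every record.
BOILERPLATE = {
    "dc.publisher": "Rice University",
    "dc.subject.lcsh": "Rice University -- History",
}

MAX_BATCH_SIZE = 20  # limit for edits through the web interface (XMLUI)


def prepare_rows(rows, subcollection_column="subcollection"):
    """Add id, collection, and boilerplate columns to each row."""
    rows = list(rows)
    if len(rows) > MAX_BATCH_SIZE:
        raise ValueError(f"batch has {len(rows)} items; maximum is {MAX_BATCH_SIZE}")
    prepared = []
    for row in rows:
        # "+" in the id column is assumed to mark a new item.
        new_row = {"id": "+", "collection": COLLECTION_IDS[row[subcollection_column]]}
        # Copy the metadata, dropping the hypothetical helper column.
        new_row.update({k: v for k, v in row.items() if k != subcollection_column})
        new_row.update(BOILERPLATE)
        prepared.append(new_row)
    return prepared


batch = prepare_rows([{"dc.title": "Sample title", "subcollection": "People"}])
print(batch[0])
```

An unknown sub-collection name raises a KeyError, which is deliberate: it forces the mismatch to be resolved before import rather than silently assigning the wrong collection.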

  7. Save the Excel file > Open the Excel file in OpenOffice Calc > Save as CSV encoded with UTF-8

  8. Go to DSpace and run the batch import. Visually review the changes (e.g. collection name matches subject, diacritics display correctly); if they are OK, post the changes.

  9. Attach image files to each new record. Tip: Search by WRC# (Edit this item > edit bitstreams > add file)

  10. Update status in Shared Shelf
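The UTF-8 requirement in the save step exists so that diacritics survive the import. As an alternative to the OpenOffice Calc round trip, a CSV could be written with an explicit encoding; the sketch below assumes the rows are already prepared as dictionaries, and the file name and sample title are hypothetical.

```python
import csv
import os
import tempfile


def write_utf8_csv(path, rows, fieldnames):
    """Write rows to a CSV file encoded as UTF-8."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical row containing a diacritic, to confirm it survives.
rows = [{"id": "+", "dc.title": "Café scene"}]
path = os.path.join(tempfile.mkdtemp(), "batch.csv")
write_utf8_csv(path, rows, ["id", "dc.title"])

# Read the file back to check the round trip.
with open(path, newline="", encoding="utf-8") as handle:
    round_trip = list(csv.DictReader(handle))
print(round_trip[0]["dc.title"])
```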

 

Tables

Note: for metadata work, copy lists directly from the Excel spreadsheet located in the fonlibstor project folder. The tables below are provided for easy reference only.

 

Element Headers

dc.identifier.digital

dc.subject

dc.title

dc.date.issued

dc.date.original

dc.type.dcmi

dc.description

dc.description.abstract

dc.source.collection

dc.contributor.author

dc.contributor.photographer

dc.type.genre

dc.rights

dc.rights.uri

dc.date.digital

dc.digitization.specifications

 

 

Collection Ids

1911/61408 : Events
1911/62306 : Key documents
1911/61406 : People
1911/61407 : Places
1911/61409 : Sports
1911/61410 : Various
1911/70176 : World Tour: postcards

 

Boilerplate metadata

  • dc.publisher : Rice University

  • dc.subject.lcsh : Rice University -- History (broad term. narrower or more specific terms may be added later)

 

 

Checklist

  • Copyright, access and use statement embedded in object files

  • Verify no duplicate items

  • Confirm metadata is complete and follows input rules (abbreviations, punctuation, etc)

  • Confirm items are ingested per sub-collection (e.g. archival sub folder : People, Places, Events, Sports)

 

Steps for using pivot table tool

Create pivot table

  • Select Cell A1

  • Go to INSERT menu

  • From the Tables section, press Pivot Table button

  • The “Create PivotTable” wizard will appear (it should automatically recognize the range of data)

  • Press Ok. This will create a new tab.

  • (optional) Label tab : Table

 

Define layout for pivot table

  • On the far right, Pivot table options will appear

  • From the “PivotTable Field List” (at top), drag the dc.identifier.digital field to the Row labels area (at bottom left)

  • Drag the dc.identifier.digital field (from top) to the Values area (at bottom right)

  • Values should automatically be set to the count function

  • These actions will update the table on the far left, so that the pivot table now displays the number of occurrences (count) of each dc.identifier.digital value (the WRC#)

 

Test

  • All calculated values should equal 1. Any value greater than 1 indicates a duplicate WRC#; remove the duplicates from the spreadsheet
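The pivot table test is a count of occurrences per identifier, so the same check can be sketched in a few lines of code. This assumes the batch has been loaded as a list of dictionaries keyed by column header; the sample identifiers are hypothetical.

```python
from collections import Counter


def duplicate_identifiers(rows, column="dc.identifier.digital"):
    """Return identifiers that occur more than once, with their counts."""
    counts = Counter(row[column] for row in rows)
    return {identifier: n for identifier, n in counts.items() if n > 1}


# Hypothetical batch in which one WRC# appears twice.
rows = [
    {"dc.identifier.digital": "wrc00001"},
    {"dc.identifier.digital": "wrc00002"},
    {"dc.identifier.digital": "wrc00001"},
]
duplicates = duplicate_identifiers(rows)
print(duplicates)
```

An empty result corresponds to every pivot table count being 1.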

 

 
