Summary
-
Validate all required fields are populated
-
Validate standard syntax is used
-
General Typography
-
Validate digital files follow internal guidelines
Checklist
1) Required fields:
-
dc.date.issued = numeric values only, following the ISO 8601 standard (e.g. YYYY, YYYY-MM, or YYYY-MM-DD)
-
dc.date.digital = YYYY
-
dc.digitization.specifications = includes a statement about masters
-
dc.subject = term matches the DSpace collection (values: Events, Key Documents, People, Places, Sports)
-
dc.subject.lcsh = Rice University -- History (Boilerplate)
-
dc.publisher = Rice University (Boilerplate)
-
dc.rights = only one of the 3 or 4 common statements (ensure no license.txt was uploaded; use the curation task tool to check)
-
dc.rights.uri = link only
-
dc.source.collection = includes location of physical item in archive
-
dc.identifier.digital = wrc# (check there are no duplicate values)
-
dc.type.dcmi = DCMI terms are capitalized and use the singular form (Text, Image, etc.)
-
dc.type.genre = AAT terms, generally not capitalized and use plural form
Tip: an Excel pivot table or the COUNTA() formula can help with these checks.
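The required-field checks above could also be scripted. The following is a minimal sketch, assuming each item is a dict keyed by plain field names with one value per field (e.g. rows read from a metadata export with csv.DictReader). The function name check_required is hypothetical, and a real export may carry language suffixes or multiple values that this sketch does not handle.

```python
import re
from collections import Counter

# Required fields, copied from the checklist above.
REQUIRED_FIELDS = [
    "dc.date.issued", "dc.date.digital", "dc.digitization.specifications",
    "dc.subject", "dc.subject.lcsh", "dc.publisher", "dc.rights",
    "dc.rights.uri", "dc.source.collection", "dc.identifier.digital",
    "dc.type.dcmi", "dc.type.genre",
]
ALLOWED_SUBJECTS = {"Events", "Key Documents", "People", "Places", "Sports"}
# Accepts YYYY, YYYY-MM, or YYYY-MM-DD (month and day ranges only, no calendar check).
ISO_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")
ISO_MESSAGE = "not YYYY, YYYY-MM or YYYY-MM-DD"


def check_required(rows):
    """Return (row_number, field, message) tuples; rows must be a list of dicts."""
    problems = []
    for number, row in enumerate(rows, start=1):
        values = {f: (row.get(f) or "").strip() for f in REQUIRED_FIELDS}
        for field, value in values.items():
            if not value:
                problems.append((number, field, "empty"))
        issued = values["dc.date.issued"]
        if issued and not ISO_DATE.fullmatch(issued):
            problems.append((number, "dc.date.issued", ISO_MESSAGE))
        digital = values["dc.date.digital"]
        if digital and not re.fullmatch(r"\d{4}", digital):
            problems.append((number, "dc.date.digital", "not YYYY"))
        subject = values["dc.subject"]
        if subject and subject not in ALLOWED_SUBJECTS:
            problems.append((number, "dc.subject", "not a collection term"))
    # Duplicate identifiers are reported once per value, with no row number.
    ids = Counter((row.get("dc.identifier.digital") or "").strip() for row in rows)
    for value, count in ids.items():
        if value and count > 1:
            problems.append((None, "dc.identifier.digital", f"duplicate value {value}"))
    return problems
```

An empty result means the rows passed these particular checks only; boilerplate values and rights statements still need a visual review.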
2) Non-required fields:
-
dc.language.iso = 3-character code for text-based resources. Do not use for resources that are solely images.
-
dc.date.original = textual note or long version of the date (do not use abbreviations, e.g. “circa” instead of “ca”). If a numeric date is used (e.g. 11/5/2012), move it to dc.date.issued and convert it to the ISO standard.
-
Review any non-AP fields to ensure they follow the general IR application profile (not a very likely scenario, but just in case), e.g. if dc.relation is used, should it carry a qualifier such as dc.relation.isPartof? Also check for deprecated usage, e.g. dc.creator.author should be dc.contributor.author.
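The two non-required field rules above lend themselves to a quick automated pass. This is a sketch only, assuming one dict per item keyed by field name and multiple values joined with a double pipe (||); check_optional is a hypothetical name.

```python
import re

# A purely numeric date such as 11/5/2012 belongs in dc.date.issued instead.
NUMERIC_DATE = re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}")
# Matches the standalone abbreviation "ca" or "ca." but not the word "circa".
ABBREVIATED_CIRCA = re.compile(r"\bca\b", re.IGNORECASE)


def check_optional(row):
    """Return a list of messages for dc.language.iso and dc.date.original."""
    problems = []
    for code in filter(None, (row.get("dc.language.iso") or "").split("||")):
        if not re.fullmatch(r"[A-Za-z]{3}", code.strip()):
            problems.append(f"dc.language.iso: {code.strip()!r} is not a 3-character code")
    original = (row.get("dc.date.original") or "").strip()
    if NUMERIC_DATE.fullmatch(original):
        problems.append("dc.date.original: numeric date, move to dc.date.issued in ISO form")
    if ABBREVIATED_CIRCA.search(original):
        problems.append('dc.date.original: spell out "circa"')
    return problems
```

The sketch cannot tell whether a resource is image-only, so the rule about leaving dc.language.iso empty for images still needs a manual check.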
3) Confirm that dc.type.genre aligns logically with dc.type.dcmi, and investigate any values that do not align
Future possibility: also compare to bitstream format
Example:
dc.type.dcmi | dc.type.genre | bitstream format (a)
Image | photographs, etc. | *.jpg / *.jp2
Text | pamphlets, etc. | *.pdf
(a) May use the DSpace curation task tool (e.g. compare the count of bitstreams per format to dc.type)
➨ Important validation step for long term preservation activities
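The alignment check in item 3 can be sketched as a lookup. The mapping below contains only the two pairings given in the example (photographs with Image, pamphlets with Text) and the access formats listed there; anything beyond that is an assumption, and a real project would need to extend both tables. check_type_alignment is a hypothetical name.

```python
from pathlib import PurePath

# Only the pairings shown in the example; extend for a real project.
GENRE_TO_DCMI = {"photographs": "Image", "pamphlets": "Text"}
# Formats from the example; masters such as TIFFs are not listed here.
DCMI_TO_EXTENSIONS = {"Image": {".jpg", ".jp2"}, "Text": {".pdf"}}


def check_type_alignment(dcmi, genre, filenames=()):
    """Return messages where genre, DCMI type, and file extensions disagree."""
    problems = []
    expected = GENRE_TO_DCMI.get(genre)
    if expected is None:
        problems.append(f"genre {genre!r} not in mapping, review manually")
    elif expected != dcmi:
        problems.append(f"genre {genre!r} expects {expected!r}, found {dcmi!r}")
    allowed = DCMI_TO_EXTENSIONS.get(dcmi, set())
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if allowed and suffix not in allowed:
            problems.append(f"{name}: {suffix!r} is unusual for type {dcmi!r}")
    return problems
```

Unknown genres are reported rather than rejected, since the point of the step is to investigate values that do not align, not to block them.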
4) General typography checks
-
no periods at end of titles, names, subjects
-
expand uncommon abbreviations (check in description, source, relation fields)
-
spell-check textual fields (e.g. description, source, relation, subjects)
-
no URLs except for rights.uri, relation links to other DSpace items, or finding aids
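Two of the typography checks above (trailing periods and stray URLs) can be flagged automatically; abbreviations and spelling still need a human reader. The sketch below assumes one dict per item keyed by field name. The choice of dc.title, dc.contributor.author, and dc.subject as the title, name, and subject fields is an assumption, as is skipping every dc.relation field for the URL check.

```python
import re

URL = re.compile(r"https?://\S+", re.IGNORECASE)
# Assumed field prefixes for titles, names, and subjects.
NO_TRAILING_PERIOD = ("dc.title", "dc.contributor.author", "dc.subject")
# URLs are expected in rights.uri and tolerated in relation fields.
URL_ALLOWED_PREFIXES = ("dc.rights.uri", "dc.relation")


def check_typography(row):
    """Return (field, message) tuples for values that need a manual look."""
    problems = []
    for field, value in row.items():
        value = (value or "").strip()
        if field.startswith(NO_TRAILING_PERIOD) and value.endswith("."):
            problems.append((field, "ends with a period"))
        if URL.search(value) and not field.startswith(URL_ALLOWED_PREFIXES):
            problems.append((field, "contains a URL"))
    return problems
```

Flags are prompts for review, not errors: a name ending in an initial legitimately ends with a period, and a finding-aid link may be acceptable outside the relation fields.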
Digital files
-
Confirm masters have been ingested for all items.
-
For items with no master files, confirm this is documented in dc.digitization.specifications as a separate value (separated by a double pipe, ||)
-
Confirm folder location of digital files on the server matches workflow status (e.g. if item is online, files should be in “Final” folder)
-
Verify file types follow IR-supported formats; send a notice to DSS of any new formats for inclusion in the file format registry
-
Verify metadata has been embedded in all files (*.tif, *.jp2, *.jpg)
-
Confirm best practices in file naming have been applied (e.g. WRC-designated number; no spaces or special characters; if there are multiple files per object, confirm consistent suffix numbering)
-
Confirm no duplicate WRC numbers (these should be unique)
-
Verify masters and derivatives are scanned at project specs (ppi, uncompressed for TIFFs, color profiles, etc.) or at IR standards
-
Confirm the proper file type per content type was used (see item 3 under the Checklist section above)
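The file-naming checks in this list can be partly automated. The sketch below is illustrative only: the actual WRC numbering scheme is not given here, so the pattern ("wrc" plus digits, an optional underscore and sequence number, then an extension) is a hypothetical stand-in, as is the function name check_filenames.

```python
import re
from collections import defaultdict

# Hypothetical pattern: "wrc" + digits, optional "_NNN" suffix, then an extension.
# Spaces and special characters fail the match by construction.
NAME = re.compile(r"(?P<wrc>wrc\d+)(?:_(?P<seq>\d+))?\.(?P<ext>[a-z0-9]+)")


def check_filenames(filenames):
    """Return (name, message) tuples for names that break the assumed pattern."""
    problems = []
    suffix_widths = defaultdict(set)
    for name in filenames:
        match = NAME.fullmatch(name)
        if match is None:
            problems.append((name, "does not match naming pattern"))
            continue
        if match["seq"] is not None:
            # Record the digit count of each suffix, per WRC number.
            suffix_widths[match["wrc"]].add(len(match["seq"]))
    for wrc, widths in suffix_widths.items():
        if len(widths) > 1:
            problems.append((wrc, "inconsistent suffix numbering"))
    return problems
```

Adjust the pattern to the project's real naming convention before relying on the results.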
STEPS for digital files QC
A) Run the curation task: count masters
Use the count as a baseline to confirm that only new masters are being uploaded.
B) Run the exiftool -csv command on files
Exif data checklist
-
uncompressed
-
no odd color profiles
-
filenames
-
resolution >300 ppi (if lower, this may be due to source material such as grayscale images; this should be documented in dc.digitization.specifications)
-
mimetype
-
embedded descriptive metadata (if none in the TIFF, the JPEGs likely need to be recreated)
related guides: Using exiftool for the QC process
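Part of the Exif data checklist can be run against the CSV that exiftool produces (for example, exiftool -csv -r on a folder, redirected to a file). The sketch below covers compression, color profile, resolution, and embedded metadata. It assumes the export contains the columns SourceFile, Compression, ProfileDescription, and XResolution; the accepted profile names and the descriptive tags (Title, Description) are placeholders, not project standards.

```python
import csv

ACCEPTED_PROFILES = {"sRGB IEC61966-2.1", "Adobe RGB (1998)"}  # placeholder list
DESCRIPTIVE_TAGS = ("Title", "Description")  # placeholder tag names
MIN_PPI = 300


def check_exif_row(row):
    """Return a list of messages for one row of the exiftool CSV."""
    problems = []
    name = row.get("SourceFile", "")
    if name.lower().endswith((".tif", ".tiff")) and row.get("Compression") != "Uncompressed":
        problems.append("TIFF is compressed")
    profile = row.get("ProfileDescription", "")
    if profile and profile not in ACCEPTED_PROFILES:
        problems.append(f"unexpected color profile: {profile}")
    try:
        # A missing resolution counts as 0 and is flagged as well.
        if float(row.get("XResolution") or 0) < MIN_PPI:
            problems.append("resolution below 300 ppi")
    except ValueError:
        problems.append("resolution not numeric")
    if not any((row.get(tag) or "").strip() for tag in DESCRIPTIVE_TAGS):
        problems.append("no embedded descriptive metadata")
    return problems


def check_exif_csv(path):
    """Map each SourceFile in the CSV at path to its list of problems."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row["SourceFile"]: check_exif_row(row) for row in csv.DictReader(handle)}
```

The resolution check assumes XResolution is reported in pixels per inch; files that use a different resolution unit would need converting first.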
C) Export DSpace metadata
Metadata Checklist