The Digital Scholarship Services (DSS) team aims to support as many file formats as possible. Items stored in Rice University's Digital Scholarship Archive (RDSA), found at scholarship.rice.edu, will be preserved as is over time, using a combination of time-honored techniques for data management and best practices for digital preservation. As for specific formats, however, the proprietary nature of many file types makes it impossible to make guarantees. Put simply, our policy for file formats is this:
- Everything put in RDSA will be retrievable.
- We will recognize as many files' formats as possible.
- We will support as many known file formats as possible.
When a file is uploaded to RDSA, we assign it one of the following categories to note the level of support for its format:
- Supported: RDSA fully supports the format.
- Known: RDSA can recognize the format, but we cannot guarantee full support.
- Unsupported: RDSA cannot recognize a format; such formats are listed as "application/octet-stream", or Unknown.
By "support", we mean "make usable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need." For supported formats, we might choose to bulk-transform files from a current format version to a future one, for instance. But we can't predict which services will be necessary down the road, so we'll continually monitor formats and techniques to ensure we can accommodate needs as they arise.
In the meantime, we can choose to "support" a format if we can gather enough documentation to capture how the format works. In particular, we collect file specifications, descriptions, and code samples, and make those available in the Format Reference Collection below. Unfortunately, this means that proprietary formats for which these materials are not publicly available cannot be supported in scholarship.rice.edu. However, we will still preserve these files, and we will provide you with guidance on converting your files into formats we do support. It is also likely that for extremely popular but proprietary formats (such as Microsoft .doc, .xls, and .ppt), we will be able to help make files in those formats more useful in the future simply because their prevalence makes it likely tools will be available. Even so, we cannot guarantee this level of service without also having more information about the formats, so we will still list these formats as "known", not "supported."
What to do if your format isn't recognized
We understand that there are always more formats to consider, and we would appreciate your help in identifying formats you care about and studying whether we can support them. If we can't identify a format, Rice University's Digital Scholarship Archive will record it as "unknown", or "application/octet-stream", but we would like to keep the percentage of materials in supported formats in the digital archive as high as possible. If you have any questions or concerns, please email us at cds [at] rice [dot] edu.
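As a rough illustration of the percentage mentioned above, the share of files at each support level can be computed from a list of level labels. This is a minimal sketch only: the function name `level_percentages` and the sample deposit are hypothetical, and it does not describe how the archive itself reports these figures.

```python
from collections import Counter


def level_percentages(levels):
    """Return the percentage of files at each support level."""
    counts = Counter(levels)
    total = sum(counts.values())
    if total == 0:
        # No files, so there are no shares to report.
        return {}
    return {level: 100.0 * n / total for level, n in counts.items()}


# Hypothetical deposit: three supported files, one known, one unsupported.
shares = level_percentages(
    ["supported", "supported", "known", "supported", "unsupported"]
)
print(shares["supported"])  # 60.0
```

A depositor could use a tally like this to see which files are worth converting to a supported format before upload.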
DSpace Format Reference Collection
The table below describes how Rice University's Digital Scholarship Archive (RDSA) supports each listed file type. MIME type is the Multipurpose Internet Mail Extensions (MIME) type identifier. For more information on MIME, see the MIME RFCs or the MIME FAQ. Description is the name most people use for the format. Extensions are the typical file name extensions (the part after the dot; for example, the extension for "index.html" is "html"). These are not case-sensitive in scholarship.rice.edu, so either "sample.XML" or "sample.xml" will be recognized as XML. Level is scholarship.rice.edu's support level for each format:
- Supported: RDSA fully supports the format.
- Known: RDSA can recognize the format, but we cannot guarantee full support.
- Unsupported: We cannot recognize a format; these will be listed as "application/octet-stream", or Unknown.
| MIME type | Description | Extensions | Level |
|---|---|---|---|
| application/marc | MARC | marc, mrc | supported |
| application/mathematica | Mathematica | ma | known |
| application/msword | Microsoft Word | doc, docx | known |
| application/octet-stream | Unknown | (anything not listed) | unsupported |
| application/pdf | Adobe PDF | pdf | supported |
| application/postscript | PostScript | ps, eps, ai | supported |
| application/rdf+xml; charset=utf-8 | RDF XML | rdf | known |
| application/sgml | SGML | sgm, sgml | known |
| application/vnd.ms-excel | Microsoft Excel | xls, xlsx | known |
| application/vnd.ms-powerpoint | Microsoft PowerPoint | ppt, pptx | known |
| application/vnd.ms-project | Microsoft Project | mpp, mpx, mpd | known |
| application/vnd.visio | Microsoft Visio | vsd | known |
| application/wordperfect5.1 | WordPerfect | wpd | known |
| application/x-dvi | TeX dvi | dvi | known |
| application/x-filemaker | FMP3 | fm | known |
| application/x-latex | LaTeX | latex | known |
| application/x-photoshop | Photoshop | psd, pdd | known |
| application/x-tex | TeX | tex | known |
| audio/x-aiff | AIFF | aiff, aif, aifc | supported |
| audio/basic | audio/basic | au, snd | known |
| audio/x-mpeg | MPEG Audio | mpa, abs, mpeg | supported |
| audio/x-mp3 | MP3 Audio | mp3 | supported |
| audio/x-pn-realaudio | RealAudio | ra, ram | known |
| audio/x-wav | WAV | wav | supported |
| image/gif | GIF | gif | supported |
| image/jp2 | JPEG2000 | jp2 | supported |
| image/jpeg | JPEG | jpeg, jpg | supported |
| image/png | PNG | png | supported |
| image/tiff | TIFF | tiff, tif | supported |
| image/x-ms-bmp | BMP | bmp | known |
| image/x-photo-cd | Photo CD | pcd | known |
| text/html | HTML | html, htm | supported |
| text/plain | Text | txt | supported |
| text/richtext | Rich Text Format | rtf | supported |
| text/xml | XML | xml | supported |
| video/mpeg | MPEG | mpeg, mpg, mpe | supported |
| video/quicktime | Video QuickTime | mov, qt | known |
| text/csv | Comma-separated values | csv | supported |
| text/tab-separated-values | Tab-separated values | tab | supported |
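The lookup described above (match a file's extension without regard to case, and fall back to "application/octet-stream" when nothing matches) can be sketched in a few lines of Python. This is an illustration only, not the archive's actual code: the names `FORMAT_REGISTRY`, `UNKNOWN`, and `lookup_format` are assumptions made for the example, and the dictionary holds just a small excerpt of the table.

```python
from pathlib import PurePath

# Excerpt of the table above: extension -> (MIME type, level).
# The dictionary layout and all names here are illustrative assumptions.
FORMAT_REGISTRY = {
    "pdf": ("application/pdf", "supported"),
    "doc": ("application/msword", "known"),
    "docx": ("application/msword", "known"),
    "xml": ("text/xml", "supported"),
    "tif": ("image/tiff", "supported"),
    "tiff": ("image/tiff", "supported"),
    "mov": ("video/quicktime", "known"),
}

# Anything not listed falls back to the Unknown entry.
UNKNOWN = ("application/octet-stream", "unsupported")


def lookup_format(filename: str) -> tuple[str, str]:
    """Return (MIME type, level) for a file name, ignoring extension case."""
    # PurePath.suffix yields the last extension with its dot, e.g. ".XML".
    extension = PurePath(filename).suffix.lstrip(".").lower()
    return FORMAT_REGISTRY.get(extension, UNKNOWN)


print(lookup_format("sample.XML"))  # ('text/xml', 'supported')
print(lookup_format("notes.xyz"))   # ('application/octet-stream', 'unsupported')
```

Lowercasing the extension before the lookup is what makes "sample.XML" and "sample.xml" resolve to the same entry. A file with no extension at all also falls through to the Unknown entry.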
Reference: this format policy is adopted from the DSpace community documentation.