
Accessible PDF for Libstaff

Page history last edited by Monica 3 years, 1 month ago Saved with comment

PDF Accessibility Checklist and Techniques for library staff

 

 

INTRO

The purpose of these guidelines is to provide practical steps for creating accessible PDFs in compliance with the accessibility standards US Section 508 and WCAG Level A. The library is usually not the author of the content of PDFs; rather, it creates PDFs from scans of physical materials (e.g. books, articles, historical documents) or converts them from other formats (e.g. Microsoft Word, PowerPoint). These guidelines are intended to cover common use cases for library staff.

 

CHECKLIST

 

Recommended for Level A compliance

  • Perform OCR to provide actual text (see notes a, b)
  • Set the default language for entire PDF
  • Embed the title in the document metadata
  • Set the initial view to show the document title in the tab
  • Ensure the fonts used allow characters to be extracted as text
  • Ensure logical reading order and document structure, “tagged PDF” (see note f)
  • Apply text alternatives for images (see notes d, e)
  • Do not rely on color or sensory characteristics alone to convey meaning (see note g)
  • Do not use security that interferes with assistive technology (see note c)

 

 

NOTES

a) For born-digital content, make use of accessibility settings in the source software prior to generating the PDF, as this may provide better results and less re-editing (see Microsoft Word and InDesign).

 

b) Image-Only PDFs -- For handwritten archival documents (e.g. letters, journals, etc.), where OCR is not possible, only metadata, language and document display properties are required for accessibility compliance. However, you must flag pages in the image-only PDFs as null images. 

 

You can batch set the "decorative image" value, without manually setting it for each page, by using an Adobe Acrobat "action". Download action here: Set Alternative Text.sequ

Just run this action on the PDF to set all images to the null value, or "decorative image". If your document contains mixed content (some printed text and/or non-decorative images), we recommend running this action first and then manually performing OCR or adding alt text for those sections or pages.

 

To install action:

  1. Choose Tools > Action Wizard.
  2. In the Action Wizard menu (across the top), select the Manage Actions button.
  3. In the Manage Actions pop-up box, select the "Import" button (on the right, towards the bottom), navigate to the downloaded action to install it, and press Close.

 

Note: a transcription of the handwritten text should be included as a separate file from the PDF. See WRC Transcription Guidelines for best practices on creating transcriptions.

 

 

c) An example of security that interferes with assistive technology is the PDF/A standard. Currently this standard prevents exporting text, which prevents screen readers from “reading” the PDF and blocks other general usability features such as annotation and highlighting. Also remove any passwords.

 

d) Hiding decorative images or other non-content artifacts.

 

From W3C Techniques for WCAG 2.0, PDF4: Hiding decorative images with the Artifact tag in PDF documents: purely decorative images in PDF documents can be marked so that they can be ignored by Assistive Technology… [such] artifacts are generally graphics objects or other markings that are not part of the authored content. Examples of artifacts include page header or footer information, lines or other graphics separating sections of the page, or decorative images.

 

For scanned documents, it is not uncommon to have “noise” output during the OCR process. These artifacts, such as blank pages or sections and scans of paper edges, are not part of the actual content. These sorts of “images” should be ignored or set to background so they are not “read out” by screen readers.

 

Use a null (empty) alt text value for these cases (i.e. enter empty double quotes, alt=””).

 

e) Alternative text

 

To set alt text for graphics in a PDF file, in Acrobat go to Tools > Accessibility > Set Alternate Text.

 

There is no official limit on the length of the alt text field; however, best practices recommend limits of 120 to 250 characters.

 

From WebAIM, Web Accessibility Gone Wild: alternative text must convey the content and functionality of an image, and is rarely a literal description of the image (e.g., “photo of cat”). Rather than providing what the image looks like, alternative text should convey the content of the image and what it does.

 

BCcampus Open Education, Accessibility Toolkit – 2nd Edition, Images chapter provides guidelines for creating appropriate alternative text, which include:

  • For relatively simple images (e.g., photographs, illustrations), try to keep your text descriptions short. You should aim to create a brief alternative
     
  • Leave out unnecessary information. For example, you do not need to include information like “image of…” or “photo of…”; assistive technologies will automatically identify the material as an image, so including that detail in your alternative description is redundant.
  • Avoid redundancy of content in your alternative description. Don’t repeat information that already appears in text adjacent to the image.

 

It is not necessary to use alt text if the image is identified and described by surrounding text, either in a caption or nearby paragraphs. Sources: When to Add Alternative Text Descriptions for Image, Rich Text Editor Accessibility Guidelines and Decorative Images, WCAG Web Accessibility Tutorials.
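
The alt text guidelines above can be turned into a quick pre-check before text is pasted into Acrobat. The following is a minimal sketch, not part of any Acrobat workflow: the function name `check_alt_text`, the 250-character ceiling (the upper end of the range mentioned above), and the list of redundant openers are all illustrative assumptions.

```python
import re

# Assumed ceiling: the upper end of the 120-250 character range cited above.
MAX_ALT_LENGTH = 250

# Openers the guidelines call redundant, e.g. "image of..." or "photo of...".
REDUNDANT_OPENER = re.compile(
    r"^\s*(image|photo|picture|graphic)\s+of\b", re.IGNORECASE
)


def check_alt_text(alt_text, adjacent_text=""):
    """Return a list of warnings for a proposed alt text (empty list = no warnings)."""
    stripped = alt_text.strip()
    if not stripped:
        # An empty value is appropriate only for decorative images and artifacts.
        return ["empty alt text: acceptable only for decorative images"]
    warnings = []
    if len(stripped) > MAX_ALT_LENGTH:
        warnings.append(
            f"too long: {len(stripped)} characters (limit {MAX_ALT_LENGTH})"
        )
    if REDUNDANT_OPENER.match(stripped):
        warnings.append("redundant opener such as 'image of' or 'photo of'")
    if adjacent_text and stripped.lower() in adjacent_text.lower():
        warnings.append("repeats text that already appears next to the image")
    return warnings
```

For example, `check_alt_text("Photo of a cat")` flags the redundant opener. A check like this only catches mechanical problems; whether the text conveys the content and function of the image still needs human judgment.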

 

Additional resources:

 

Note: Any alt text will also be exported along with the visible text of the PDF document.

 

f) Document structure ("tags") and reading order

 

From Adobe Accessibility Overview: Document structure tags in a PDF define the reading order and identify headings, paragraphs, sections, tables and other page elements. The tags structure also allows for documents to be resized and reflowed for viewing at larger sizes and on mobile devices.

 

We recommend using Acrobat’s Autotag Document feature to ensure a logical document structure. Simple OCR will not produce a tagged PDF.

 

For born-digital items (e.g. if the PDF was created from MS Word or PowerPoint), make use of accessibility settings in the source software prior to generating the PDF, as this may provide better results and less re-editing (see Microsoft Word and InDesign).

 

We recommend manually reviewing the reading order when working with complex layouts (e.g. newspapers) or documents converted from other formats (e.g. PowerPoint, Word).

 

g) Color contrast cannot be edited in a PDF file directly; it is a characteristic of the original document and may not be modified for scanned materials. If authoring born-digital materials, please follow this recommendation: for normal text, WCAG 2.0 requires contrast ratios of 4.5:1 (AA) or 7:1 (AAA); for large text, WCAG 2.0 requires contrast ratios of 3:1 (AA) or 4.5:1 (AAA).
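
When authoring born-digital materials, the contrast ratio of a text and background color pair can be computed directly. The sketch below follows the relative luminance and contrast ratio definitions in WCAG 2.0; the function names and the `passes` helper are illustrative, and colors are assumed to be 8-bit sRGB tuples.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.0 definition."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes(foreground, background, level="AA", large_text=False):
    """Check a color pair against the thresholds quoted in note g."""
    thresholds = {
        ("AA", False): 4.5,
        ("AA", True): 3.0,
        ("AAA", False): 7.0,
        ("AAA", True): 4.5,
    }
    return contrast_ratio(foreground, background) >= thresholds[(level, large_text)]
```

For example, `passes((118, 118, 118), (255, 255, 255))` checks mid-gray text on a white background at level AA. This applies only to documents you author; scanned archival materials should be left as they are.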

 

Resources for assessing color and contrast:

 

 

Enhanced

The following criteria are considered optional for library-managed PDFs. These are either Level AA features, and thereby not required per Rice University’s Strategic Plan for Accessibility (p. 6) at this time, or features considered unduly difficult to process at large scale.

 

  • Creating bookmarks in PDF documents.

Note: An exception to this is larger documents, such as scanned theses where multiple physical volumes are combined into one file. Bookmarking may help users navigate extra-long texts, so apply at your discretion.

  • Specifying change of language for a passage or phrase within a larger document
  • Use of list tags for lists in PDF documents
  • Use of table elements for table markup
  • Manually marking content with heading tags
  • Provide alt text for the Link elements in PDF documents.

Note: Sometimes simple OCR will break links in the text; however, Adobe’s Autotag technology will automatically create clickable links, retaining the functionality of this feature. Though some sources recommend replacing URL syntax with natural language text, this conflicts with preservation best practices, which recommend keeping the full URL for digital provenance. The idea is that the original URL syntax might at least provide some information to a future user if the site suffers from link rot over time.

  • Use of Interactive forms is not a typical use case for Library-generated or managed PDF documents. It is recommended to use Web based forms for gathering data.

 

 

 

Techniques

 

The following processes use Adobe Acrobat Pro DC software for checking and applying accessibility standards. See online documentation Create and verify PDF accessibility (Acrobat Pro) for general steps.

 

Manual

Note: These instructions assume the PDFs do not contain forms and are scans of archival materials which were not originally created by the person creating the PDFs (and therefore it is not appropriate to edit the document to change fonts, styles or color contrasts). You will need to tag the PDF (that is, add tags for <headings>, <body>, <image>, etc.). These tags do not affect the visual document but rather provide underlying structure to the PDF so that the PDF is "readable" to an assistive technology such as a screen reader.

 

  • Create a PDF file in Acrobat and save it without security features such as password protection or the PDF/A format.
  • Tools - Enhance - Recognize text 
    • Tip: select primary language 
  • Set the Initial View settings for the Document Title to show, by going to File-Properties - Initial View tab

    • Window Options - Show = Document Title
    • Tip: Control + D opens the document properties window 
  • Set the Reading Language by going to File-Properties-Advanced tab
    • Reading options - Language - select the language
  • Embed descriptive metadata by going to File-Properties-Description tab - Additional Metadata
    • Enter Title (and Author if known) 
    • for Woodson collections also include: 
      • Description = identifier 
      • Copyright status = copyrighted or public domain
      • Copyright notice = copyright statement, such as from dc.rights 
      • Source = Rice University 
  • Run accessibility tools by going to Tools - Accessibility

    • Autotag document

      • If the PDF was created from MS Word or PowerPoint, skip this step. Instead, make use of accessibility settings in the source software prior to generating the PDF, as this may provide better results and less re-editing (see Microsoft Word and InDesign).

    • Set alternate text (Acrobat will detect figures in the document and display associated alternate text / a window where you may add or edit such text)

    • Tables are difficult to encode directly in PDFs. You may use advanced OCR tools such as OmniPage Pro to add headers to tables.

    • "Logical reading order - needs manual check" -- You may use a sampling technique for longer documents. For example, manually review the reading order for up to the first 10 pages, then review a sample of the pages after that.

    • "Color Contrast, needs manual check" -- Ignore. For archival documents, the primary goal is to provide a document that remains faithful to the original; do not edit the color contrast or fonts used in the generated PDF.
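
The sampling technique for the reading order check can be made repeatable with a small helper that lists which pages to review by hand. This is a sketch under stated assumptions: the function name `pages_to_review`, the 10 percent sample rate for later pages, and the fixed random seed are illustrative choices, not values from these guidelines.

```python
import random


def pages_to_review(page_count, full_review=10, sample_rate=0.1, seed=0):
    """Return sorted 1-based page numbers whose reading order should be checked by hand.

    The first `full_review` pages are always included; a random sample
    (assumed rate: 10 percent, at least one page) is drawn from the rest.
    """
    first = list(range(1, min(page_count, full_review) + 1))
    rest = list(range(full_review + 1, page_count + 1))
    if not rest:
        return first
    sample_size = max(1, round(len(rest) * sample_rate))
    # A fixed seed means a re-run selects the same pages for the same document.
    rng = random.Random(seed)
    return first + sorted(rng.sample(rest, sample_size))
```

For a 60-page document, `pages_to_review(60)` returns pages 1 through 10 plus five pages drawn from the remaining fifty. Adjust `sample_rate` upward for complex layouts such as newspapers.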

 

Automated

 

In Adobe Acrobat, Actions are used to automate repetitive tasks. These can be customized to suit your specialized needs. This is especially helpful when working with a large collection of similar documents.

 

1) Default Action

In Acrobat Pro, the “Make Accessible” action is preinstalled and walks you through the steps required to make a PDF accessible. For complete instructions on how to make documents accessible and repair the accessibility tag structure of a document, refer to the Adobe Acrobat Pro DC Accessibility Repair Workflow guidelines.

 

2) Customize Adobe Acrobat Action for batch processing PDFs

 

Example: Batch Make Accessible.sequ (download)

 

When to use

Best used for a group of homogeneous PDFs, such as Shepherd Performances Programs. For example, the Shepherd School of Music collection contains thousands of PDFs with a simple structure and layout, where the content is primarily textual and any images are typically decorative.

 

Notes before starting

As a prerequisite, first OCR your PDFs, optimize them, and embed metadata. You can import this action and further customize it as needed. Download the action Batch Make Accessible.sequ here. Then in Acrobat, choose Tools > Action Wizard, click Manage Actions in the secondary toolbar, and click Import in the Manage Actions dialog box.

 

Action explained

This action can be run on one or more files or pointed to a folder of PDFs. It will auto-process key accessibility steps without prompting the user, allowing for quick processing. When the action is complete, review the report to confirm there are no errors. Steps included are:

 

  • Select file or folder of files
  • Set Reading Language. Default is set to English
  • Set Open Options to display Title

  • Autotag Document (MUST occur before setting alt text)
  • Set Alternate Text for any images or artifacts to null value
  • Save to same filename

TIP: For large batches, Acrobat may freeze up. In such cases, change the security settings: go to Edit > Preferences > Security (Enhanced), uncheck Protected View, and enter the directory path under Privileged Locations.

 

You can batch confirm certain accessibility settings via the exifdata -CSV command. In the output, accessible PDFs should have the following metadata:

  • Language: en
  • TaggedPDF: Yes
  • Title: any value
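
Once the metadata has been exported to CSV, the three fields can be checked in bulk rather than by eye. The sketch below assumes a CSV with one row per PDF and columns named `SourceFile`, `Language`, `TaggedPDF` and `Title`; these column names, the function name `find_noncompliant`, and the sample file names are assumptions for illustration, so adjust them to match your actual export. Any non-empty language is accepted, since English is only the default.

```python
import csv
import io


def find_noncompliant(csv_text):
    """Map each file name to a list of problems found in its metadata row."""
    problems = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        issues = []
        # Missing columns come back as None, so fall back to an empty string.
        if not (row.get("Language") or "").strip():
            issues.append("Language missing")
        if (row.get("TaggedPDF") or "").strip().lower() != "yes":
            issues.append("TaggedPDF is not Yes")
        if not (row.get("Title") or "").strip():
            issues.append("Title missing")
        if issues:
            problems[row.get("SourceFile", "?")] = issues
    return problems


# Hypothetical export for two PDFs; the second has not been processed yet.
sample = """SourceFile,Language,TaggedPDF,Title
program_001.pdf,en,Yes,Shepherd School Recital
program_002.pdf,,No,
"""
print(find_noncompliant(sample))
```

Files that appear in the result can be re-run through the batch action. To check a real export, read the file contents with `open(path, newline="").read()` and pass the text to `find_noncompliant`.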

 

Tools for compliance testing

  • Adobe Acrobat Pro DC, Desktop software available to Fondren Staff.

An online validator which takes a public URL and tests accessibility. It may be useful as a double check for collections with PDFs in the institutional repository (to be tested in 2019). The free version is limited to 10 webpages. Enter the direct PDF URL, not the web page where the PDF is hosted.

Free online tool that performs accessibility checks and allows edits. Limited to 5 MB. Upload the PDF, make the recommended edits, and download the compliant PDF.

Desktop download. PDF Accessibility Checker (PAC) evaluates the accessibility of PDF files according to ISO-/DIN-Standard 14289-1 (PDF/UA) by using the Matterhorn Protocol. It checks 107 criteria that can be checked automatically. PAC is a free checking tool of the foundation «Access for all»: www.access-for-all.ch

 

Related Resources

 

 

 

 

 

 
