7 Common PDF Remediation Mistakes You Must Avoid

PDF documents are a popular choice among authors who prioritize accessibility. They are portable, and they offer a consistent visual experience for readers, regardless of the platform from which the file is accessed. Furthermore, PDFs have robust support in the world of digital accessibility compared to other file types. A PDF also has the advantage that its accessibility is not tied to its appearance: the tags that assistive technology reads are separate from the visual layout.

However, a PDF that is not created properly can be inaccessible. As authors remediate files, that is, make them accessible, there are some common mistakes that even the newest remediators can learn to spot and quickly correct.

Reading Order

In a PDF file, the reading order is the most critical factor in how the content is presented to the reader.

What if the conclusion of the document is read before the introduction? What if a story’s plot twist is uncovered before meeting the characters?

Content can change dramatically based on the order in which it is shared, and the reading order is determined by the order of the tags in the Tags tree. It is this order that assistive technologies (AT), such as screen readers, navigate through to read the content.

The Tags tree is hidden behind the scenes, and it is up to remediators to verify it. Sometimes, a file comes through logically and in the correct order straight from generation, but that is increasingly unlikely, and the order typically requires manual correction.

To check for accuracy, a remediator should navigate, tag-by-tag, through the tree and ensure that the order in which the tags are assembled is the intended order in which the content should be read.
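
The tag-by-tag walk described above can be sketched in a few lines of Python. This is only an illustration, not how any particular remediation tool works: the Tag class, the reading_order function, and the sample content are all hypothetical stand-ins for a real Tags tree.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical stand-in for one node in a PDF Tags tree."""
    kind: str                      # tag type, e.g. "Document", "H1", "P"
    text: str = ""                 # content attached directly to this tag
    children: list = field(default_factory=list)


def reading_order(tag):
    """Return tagged content in the order the tree presents it (depth-first)."""
    lines = []
    if tag.text:
        lines.append(f"{tag.kind}: {tag.text}")
    for child in tag.children:
        lines.extend(reading_order(child))
    return lines


# Sample tree (made up): the conclusion is tagged before the introduction.
doc = Tag("Document", children=[
    Tag("H1", "Annual Report"),
    Tag("P", "Conclusion: revenue grew."),
    Tag("P", "Introduction: this report covers the year."),
])

for line in reading_order(doc):
    print(line)
```

Reading the printed lines from top to bottom is what an AT user would hear, so the misplaced conclusion stands out immediately. The fix is to reorder the tags, not the visual layout.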

Remediation difficulty can vary based on the level of correction needed, but overall, verifying reading order is a critical step in establishing a compliant and truly accessible document.

Correct Heading Usage

Another common cause of inaccessible documents is incorrect, inconsistent, or nonexistent heading usage. Headings help readers navigate the document and stay oriented, and they give readers a clear outline of the document's structure.

Without headings, or perhaps even more confusingly, with incorrect headings, a reader may wonder where in a document they are, have trouble finding the specific content they are interested in, and be unable to make sense of the flow and organization of the document.

A basic and helpful tip for assigning proper headings is to think of an outline of the document. The document title should be tagged as a Heading 1 (H1 in PDF). The subsections of the document should be tagged as Heading 2s (H2), and the subsections of the H2s should be tagged as Heading 3s (H3). The same section and sub-section organization should be used throughout the entire document.

Another way of thinking of this is that headings are a leveling system, not a numbering system. For example, you might have several H2 tags before you have your first H3 tag.
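
The leveling idea can be checked mechanically. The sketch below assumes a common convention that the source does not state outright: a document starts at H1 and a heading never jumps more than one level deeper than the heading before it. The function name find_skipped_levels and the sample outline are hypothetical.

```python
def find_skipped_levels(levels):
    """Return (index, previous_level, level) wherever a heading jumps
    more than one level deeper than the heading before it."""
    problems = []
    previous = 0  # assumption: the document should start at H1
    for index, level in enumerate(levels):
        if level > previous + 1:
            problems.append((index, previous, level))
        previous = level
    return problems


# Heading levels in document order: H1, H2, H2, H3, H2, H4
outline = [1, 2, 2, 3, 2, 4]
print(find_skipped_levels(outline))  # [(5, 2, 4)]
```

Several H2 tags in a row pass the check, as does stepping back up from an H3 to an H2. Only the jump from H2 straight to H4 is flagged.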

Using headings properly is a crucial step in creating accessible documents.

Tagging the Correct Content

Our role as remediators is to tag the content on the page so that a reader using assistive technology can access the same content as a sighted reader.

The tags must also accurately reflect the document content. In addition, some content can be left untagged and, therefore, removed from what assistive technology reads.

Examples of this are page numbers, decorative page borders, or any running header or footer that repeats on every page. Someone reading a 300-page novel does not need assistive technology to read out the page number at the bottom of every page. It is unnecessary and can even be disruptive, depending on when the numbers are read.

As a result, remediators must take appropriate steps to untag content like this.

Untagging this unnecessary content, also known as artifacting, is an important skill to master to give AT users the best possible experience with a PDF.
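
As a rough aid for finding content to artifact, the sketch below flags text that repeats on most pages or that looks like a bare page number. It is a heuristic under stated assumptions, not a standard algorithm: it assumes the text of each page has already been extracted into a list of lines, and the function name artifact_candidates, the min_share threshold, and the sample pages are all hypothetical.

```python
import re
from collections import Counter


def artifact_candidates(pages, min_share=0.8):
    """Suggest lines that look like running headers, footers, or page numbers.

    pages: one list of text lines per page (assumed to be extracted already).
    min_share: fraction of pages a line must appear on to count as repeating.
    """
    counts = Counter()
    for lines in pages:
        counts.update(set(lines))  # count each distinct line once per page
    threshold = min_share * len(pages)
    repeated = {line for line, n in counts.items() if n >= threshold}
    page_numbers = {
        line for lines in pages for line in lines
        if re.fullmatch(r"(Page\s+)?\d+", line.strip())
    }
    return repeated | page_numbers


pages = [
    ["Annual Report", "Introduction", "1"],
    ["Annual Report", "Findings", "2"],
    ["Annual Report", "Conclusion", "3"],
]
print(sorted(artifact_candidates(pages)))
```

The result is only a list of candidates. A remediator still decides which items are truly decorative or repetitive before artifacting them.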

Alternative Text

Alternative text (Alt text) is a textual description of an image that is shared with AT users when the image is read. Having Alt text and, even more importantly, having accurate Alt text for all tagged graphics is a requirement under all accessibility standards, and missing or inaccurate Alt text is a common shortcoming for newer remediators.

If a graphic is contextually important, but assistive technology cannot share that information, a user is missing out on valuable content, and the file is therefore inaccessible.

A related issue is that some Alt text is simply wrong. Some authoring tools auto-populate the Alternate text field, and the generated text is typically incorrect and can cause other issues.

To avoid this common remediation mistake, verify that every Figure tag is given accurate and detailed Alt text.

Of course, you may wonder, “How detailed?” A helpful rule is to make the Alt text as concise as possible while remaining as thorough as possible so as not to drop any needed information.
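
A simple first pass over Figure tags can catch missing Alt text and text that looks auto-populated. The sketch below is an assumption-heavy illustration: the figures dictionary stands in for whatever a real tool would report, and the SUSPICIOUS pattern (file names and labels such as "Picture 1") is a hypothetical guess at what auto-generated text looks like. It cannot judge whether a description is accurate; only a person can do that.

```python
import re

# Hypothetical patterns for auto-populated Alt text; extend for your own tools.
SUSPICIOUS = re.compile(
    r"^(image|picture|figure|graphic)?\s*\d*$|\.(png|jpe?g|gif|tiff?|bmp)$",
    re.IGNORECASE,
)


def review_alt_text(figures):
    """Map each figure id to a finding when its Alt text is missing or suspect."""
    findings = {}
    for figure_id, alt in figures.items():
        text = (alt or "").strip()
        if not text:
            findings[figure_id] = "missing"
        elif SUSPICIOUS.search(text):
            findings[figure_id] = "looks auto-generated"
    return findings


figures = {
    "Figure 1": "Bar chart comparing quarterly sales by region",
    "Figure 2": "Picture 1",
    "Figure 3": None,
    "Figure 4": "logo.PNG",
}
print(review_alt_text(figures))
```

Figures that pass this check still need a human read-through to confirm that the Alt text matches the image and its context.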

Interactive Elements

Putting interactive content into PDF files is extremely common. Links and forms are two of the most common examples, and, if not handled properly, they can be confusing and frustrating for readers using assistive technology.

In short, AT uses descriptions such as Alt text, tooltips, or an annotation’s Contents to tell the reader exactly what happens if they select a specific link or how best to fill in a form field.

It might be an external link that could open a webpage. It might be an internal link, which could bring them to another point in that same document. It could trigger something outside of the PDF itself, such as opening a pre-addressed email to someone.

The worst-case scenario is that the link happens to be broken, and selecting it doesn’t do what it was expected to do.

In a fillable form, this descriptive text can tell a reader what a question is asking or how to answer it.

Having these links and forms in PDFs is a great, efficient way to let readers perform certain tasks. Still, if these interactive elements do not meet the appropriate accessibility requirements, the entire document is noncompliant.

Table Usage & Formatting

Tables can be an incredible way of displaying information while showing relationships between specific data sets. Scientific reports, for example, rely on data tables to articulate data-supported concepts. However, if tables are not used properly, they can leave readers confused and lost in a document.

One of the most common table-related issues is authors using tables for presentation purposes. The main offender is an author who does not know how to design columns in their document, so to work around this shortcoming, they put the content into a hidden table with two or more columns. Visually, this looks acceptable, but the tagging of the document will expose Table tags, leaving a reader using AT confused as to what data the table is meant to convey.

In short, authors should refrain from using a table structure to format content.

We refer to these improper uses as presentation tables, and they require involved manual remediation to fix. Tables can, of course, also have other issues, such as cells spanning incorrect heights and widths, but to ensure a document is accessible, be sure to use tables only to convey a data-driven relationship.

If the use of a data table is correct, meaning the data does convey a relationship and a table is an appropriate way of sharing it, other specifications must be met. For example, cells that categorize and relate to entire columns or rows must be appropriately marked as header cells.
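
The header-cell requirement lends itself to a quick automated check. In the sketch below, a table is modeled as a list of rows, and each cell as a pair of tag type and text, where TH marks a header cell and TD a data cell. That model, the function name table_issues, and the sample data are hypothetical; the differing-row-length check is only a prompt to inspect spans, since spanned cells can legitimately change the cell count of a row.

```python
def table_issues(rows):
    """Return a list of likely problems for a table given as rows of (kind, text)."""
    issues = []
    kinds = [kind for row in rows for kind, _ in row]
    if "TH" not in kinds:
        issues.append("no header cells (TH) found")
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        issues.append("rows have different cell counts; check spans")
    return issues


untagged_headers = [
    [("TD", "Region"), ("TD", "Sales")],
    [("TD", "North"), ("TD", "120")],
]
fixed = [
    [("TH", "Region"), ("TH", "Sales")],
    [("TD", "North"), ("TD", "120")],
]
print(table_issues(untagged_headers))  # ['no header cells (TH) found']
print(table_issues(fixed))             # []
```

A check like this only confirms that header cells exist. Whether the right cells are marked as headers still has to be verified against the data.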

This is one of the many potential issues that must be addressed when dealing with properly identified and structured tables.

Metadata

Accurate metadata is critical to a PDF’s ability to be accessed and gives valuable context as a reader navigates to and through it. A PDF’s title, for example, tells the reader what file they are opening, so they are not surprised to find that the file they have begun reading is not what they expected.

Additionally, there is a setting within a document’s properties that ensures that the document’s title, rather than the file name, is read by assistive technology.

This is increasingly important, as some file names are highly coded sets of numbers and letters rather than concise names that reflect the document’s content.

A document’s language is also set in the metadata. If a document is written in one language but set to read in another, assistive technology may read it with the wrong pronunciation rules, and the document will be inaccessible.
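
The three settings discussed so far (title, title display, and language) can be summarized as a small checklist. The sketch below works on a plain dictionary rather than a real PDF, so it assumes the values have already been read out of the file by some other means. The key names mirror the PDF entries Title, DisplayDocTitle, and Lang; the function name metadata_issues and the sample values are hypothetical.

```python
def metadata_issues(props):
    """Return a list of problems found in a dictionary of document properties."""
    issues = []
    if not str(props.get("Title") or "").strip():
        issues.append("Title is empty")
    if props.get("DisplayDocTitle") is not True:
        issues.append("DisplayDocTitle is not enabled")
    if not str(props.get("Lang") or "").strip():
        issues.append("Lang is not set")
    return issues


props = {"Title": "", "DisplayDocTitle": False, "Lang": "en-US"}
print(metadata_issues(props))
# ['Title is empty', 'DisplayDocTitle is not enabled']
```

This confirms only that the fields are filled in. Whether the title is meaningful and the language matches the text still needs a human check.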

Author, subject, keywords, and an indication that the file has been tested for accessibility are metadata properties that offer essential information to readers and should not be ignored.

While not all of those fields are required under each accessibility standard, all offer helpful information to audiences and should be properly verified in the context of accessibility.