
Ashish Tiwari

Can Artificial Intelligence Make PDFs Accessible?

A PDF document is considered accessible if everyone, including people with disabilities, can use it. Several accessibility standards, such as WCAG and PDF/UA, define criteria to test against and help ensure PDF accessibility.

Most PDFs published on websites are not accessible. This is not necessarily due to the negligence of web designers and authors; often, people don’t even realize that the PDFs they upload on their website are not accessible or standards-compliant. 

However, accessibility laws in many jurisdictions require public-facing websites, including documents such as PDFs, to be accessible to everyone, so a website owner who is served with an accessibility lawsuit may look for quick fixes.

In a time-sensitive situation involving possible legal action, organizations may look to artificial intelligence to speed up the process. Do AI tools for PDF accessibility work? Can AI make PDFs accessible?

Optical Character Recognition

There are many reasons that PDFs fail accessibility checks. One common cause is a PDF made up of scanned images: any text in those images is stored as pixels, so assistive technology cannot read it.

Optical Character Recognition, or OCR, is a technology that recognizes handwritten or printed text in scanned images and converts it into machine-encoded text that can be copied, cut, edited, and worked on digitally.

For example, if you run OCR on the image of a handwritten note which says, “Hello, Allyant!”, it will read the handwritten text and produce the words, “Hello, Allyant!” in textual form in a document.

If your PDF is an image or a series of images, running OCR is the first step in fixing it.
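Before running OCR, it helps to know which pages actually need it. The sketch below assumes that some earlier extraction step has already reported, for each page, the text it could extract and the number of images it found. The `Page` class, the `needs_ocr` function, and the 20-character threshold are hypothetical, chosen for illustration only; they do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical per-page summary produced by a PDF extraction step."""
    number: int
    extracted_text: str
    image_count: int


def needs_ocr(page: Page, min_chars: int = 20) -> bool:
    """Guess whether a page is a scanned image with no usable text layer.

    Heuristic (an assumption, not a standard): the page contains at least
    one image and almost no extractable text.
    """
    visible_chars = len(page.extracted_text.strip())
    return page.image_count > 0 and visible_chars < min_chars


def pages_needing_ocr(pages: List[Page]) -> List[int]:
    """Return the numbers of the pages that should be sent through OCR."""
    return [p.number for p in pages if needs_ocr(p)]


if __name__ == "__main__":
    sample = [
        Page(1, "This page already has a real, selectable text layer.", 1),
        Page(2, "", 1),     # scanned page: image only
        Page(3, "   ", 0),  # blank page: nothing to recognize
    ]
    print(pages_needing_ocr(sample))
```

A heuristic like this only decides where OCR should run; the recognized text still needs to be checked for accuracy afterwards.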

Alternative text (alt-text)

AI tools can generate alternative text for images in a document, but the result is often inaccurate. Relying entirely on AI to create accurate alt-text is not a good strategy; always have a human double-check the auto-generated text or write it from scratch.
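One way to organize that human check is a review queue in which every machine-generated description still reaches a reviewer, with the most suspicious ones listed first. The following is a minimal sketch under stated assumptions: the `AltTextCandidate` record, the warning patterns, and the 150-character limit are all hypothetical and are not taken from any accessibility standard or product.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class AltTextCandidate:
    """Hypothetical record pairing an image with machine-generated alt text."""
    image_id: str
    alt_text: str
    flags: List[str] = field(default_factory=list)


# Illustrative warning signs only; this list is not exhaustive.
FILENAME_PATTERN = re.compile(r"\.(png|jpe?g|gif|tiff?)$", re.IGNORECASE)
REDUNDANT_PREFIX = re.compile(r"^(image|picture|photo) of\b", re.IGNORECASE)


def flag_for_review(candidate: AltTextCandidate,
                    max_length: int = 150) -> AltTextCandidate:
    """Attach warning flags to a candidate; no candidate is ever auto-approved."""
    text = candidate.alt_text.strip()
    flags = []
    if not text:
        flags.append("empty")
    if FILENAME_PATTERN.search(text):
        flags.append("looks like a file name")
    if REDUNDANT_PREFIX.match(text):
        flags.append("redundant prefix")
    if len(text) > max_length:  # assumed limit, for illustration
        flags.append("too long")
    candidate.flags = flags
    return candidate


def build_review_queue(candidates: List[AltTextCandidate]) -> List[AltTextCandidate]:
    """Return all candidates, most heavily flagged first, for human review."""
    flagged = [flag_for_review(c) for c in candidates]
    return sorted(flagged, key=lambda c: -len(c.flags))


if __name__ == "__main__":
    queue = build_review_queue([
        AltTextCandidate("fig-1", "A bar chart comparing sales by region"),
        AltTextCandidate("fig-2", "IMG_0042.JPG"),
        AltTextCandidate("fig-3", ""),
    ])
    for item in queue:
        print(item.image_id, item.flags)
```

The flags only set the order of review. An unflagged description can still be wrong, because no pattern check can tell whether the text actually matches the image.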

Auto-tagging

A tagged PDF contains tags, which add behind-the-scenes structural markup to the document. Each tag identifies a content type, such as a heading, paragraph, or figure, and stores related attributes. Tags also arrange the document into a hierarchy, which makes it easier to read with assistive technology such as a screen reader.
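To make the idea of a tag hierarchy concrete, here is a small sketch that models tags as a tree and walks it in the order a screen reader might announce the content. Type names such as H1, P, Figure, and Sect mirror standard PDF structure types, but the `Tag` class and the `reading_order` function are simplified, hypothetical stand-ins and not how any PDF library represents a structure tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Tag:
    """Simplified stand-in for a node in a tagged PDF's structure tree."""
    type: str                  # content type, e.g. "H1", "P", "Figure"
    text: str = ""             # text content, if any
    alt: Optional[str] = None  # alternative text attribute, used by figures
    children: List["Tag"] = field(default_factory=list)


def reading_order(tag: Tag) -> Iterator[str]:
    """Walk the tree depth-first, yielding one line per announced item."""
    if tag.type == "Figure":
        yield f"Figure: {tag.alt or 'no description'}"
    elif tag.text:
        yield f"{tag.type}: {tag.text}"
    for child in tag.children:
        yield from reading_order(child)


# A made-up document used only to demonstrate the traversal.
document = Tag("Document", children=[
    Tag("H1", "Quarterly Report"),
    Tag("P", "Sales grew in every region."),
    Tag("Figure", alt="Bar chart of sales by region"),
    Tag("Sect", children=[
        Tag("H2", "Outlook"),
        Tag("P", "Growth is expected to continue."),
    ]),
])

if __name__ == "__main__":
    for line in reading_order(document):
        print(line)
```

The traversal follows the tree rather than the visual position of content on the page, which is why an incorrect tag order can make an otherwise readable page confusing to a screen reader user.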


Some AI tools claim they can auto-tag a PDF by scanning the document and assigning tags automatically. 

That seems like a nifty alternative to the long, drawn-out process of manually tagging a PDF document, but it’s far from perfect. If you have ever used Adobe Acrobat DC to remediate a PDF document, you may have seen the quality of the “remediated” output PDF file first-hand.

For example, if the source document is a Word document that was not formatted with accessibility in mind, its PDF version will have the same limitations, and a remediated PDF from an auto-tagging tool may still fail accessibility checks.
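Two problems that commonly survive auto-tagging, skipped heading levels and figures without alternative text, can be caught with simple checks. The sketch below assumes the tags have already been extracted into a flat list of (type, alt text) pairs in reading order; that input format and the `find_tagging_issues` function are hypothetical, and the two checks cover only a small part of what accessibility standards require.

```python
import re
from typing import List, Optional, Tuple

# Each entry is (tag type, alt text or None), listed in reading order.
TagEntry = Tuple[str, Optional[str]]

HEADING = re.compile(r"^H([1-6])$")


def find_tagging_issues(tags: List[TagEntry]) -> List[str]:
    """Report skipped heading levels and figures that lack alternative text."""
    issues = []
    previous_level = 0
    for index, (tag_type, alt) in enumerate(tags):
        match = HEADING.match(tag_type)
        if match:
            level = int(match.group(1))
            # A heading may go at most one level deeper than the previous one.
            if level > previous_level + 1:
                issues.append(f"entry {index}: H{level} skips a heading level")
            previous_level = level
        elif tag_type == "Figure" and not (alt or "").strip():
            issues.append(f"entry {index}: Figure has no alternative text")
    return issues


if __name__ == "__main__":
    auto_tagged = [
        ("H1", None),
        ("P", None),
        ("H3", None),       # jumps from H1 straight to H3
        ("Figure", None),   # no alternative text
        ("Figure", "Company logo"),
    ]
    for issue in find_tagging_issues(auto_tagged):
        print(issue)
```

An empty result means only that these two checks passed. It does not show that the alternative text is accurate or that the document conforms to a standard.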

However, what if an AI tool goes further and tries to understand the PDF document it’s supposed to remediate based on what it has learned through sample accessible PDFs?

How CommonLook AI leverages the power of AI to remediate PDFs

With machine learning, an AI tool can learn by processing structured data. However, machine-learning tools often produce less-than-stellar results when given complex PDF documents.

This is why CommonLook AI goes beyond traditional machine learning: it employs deep learning, a branch of machine learning built on multi-layered artificial neural networks, to remediate a PDF document. Whereas traditional machine learning typically relies on structured data, deep learning can learn tagging rules from examples: sample accessible PDF documents “teach” the tool how to tag PDF documents properly. And as the solution produces remediated documents, its algorithms are continually improved so that it understands PDFs better and produces documents that assistive technology can read easily.

This is particularly useful for organizations that have millions of PDFs in their database and need a robust solution to remediate huge volumes of PDFs.

However, the question remains: with all its power and its ability to remediate huge volumes of PDFs very quickly, can an AI-powered tool like CommonLook AI produce PDFs that fully conform to accessibility standards?

Can AI-based tools produce accessible PDFs without any human intervention?

AI alone cannot produce truly accessible and compliant documents, at least not yet. As mentioned earlier, AI-powered tools can bring you closer to creating accessible PDFs and do that at an incredible speed. Still, they cannot generate PDFs that are fully compliant with accessibility standards.

For instance, an AI-based tool can scan an image and provide alt-text, but a remediation expert will have to review it manually to confirm that it is accurate. Similarly, if a PDF contains an image of a bar chart, the AI-generated alt-text for that image will certainly require human review.

Even CommonLook AI, the most pathbreaking AI-powered tool in PDF accessibility, can only bring you so close to achieving true PDF accessibility. To achieve 100% compliance with accessibility standards, our professional remediation team works along with CommonLook AI to ensure that the final product is accessible to everyone.

In a nutshell, AI-based tools can fix many common accessibility problems with PDF documents. Still, they cannot make them truly accessible and compliant with accessibility standards, at least not yet.