Accessible PDFs: Document Accessibility per PDF/UA

12 min read
PDF/UA, Accessible PDFs, Document Accessibility, Tagged PDF, BFSG

PDFs are among the most widespread document formats on the web and, at the same time, one of its biggest accessibility problems. An estimated 95% of all PDFs on the internet are not accessible (WebAIM, 2025; European Disability Forum, 2024). Such files typically lack tags for document structure, alt text for images, a defined reading order and correct table associations. For screen reader users, they are effectively unreadable. The BFSG and the PDF/UA standard (ISO 14289) define clear requirements for how PDFs must be made accessible.

[Figure: Accessible PDFs: Document Accessibility per PDF/UA. Panels: PDF Structure (Tag Tree); PDF/UA Requirements; Creation Paths for Accessible PDFs (Word with Styles, InDesign with Tags, HTML to PDF, Retroactive Tagging in Acrobat Pro); Testing Tools (PAC 2024, Acrobat Pro Accessibility Check); Standards (PDF/UA-1, ISO 14289-1; PDF/UA-2, ISO 14289-2, 2024).]

Why PDFs Are a Particular Challenge

PDFs were originally developed as a print-oriented format. Page structure is based on absolute positioning: every text element has a fixed X/Y position on the page. For a screen reader that must read content linearly, this position information is useless. Without additional structural information in the form of tags, the screen reader cannot determine which text is a heading, where a paragraph begins and ends, in what order multi-column layouts should be read and which texts belong to a table.

The problem is worse with complex layouts: without tags, multi-column documents, sidebars, footnotes and marginalia are read in arbitrary order. A two-column document may be read line by line, alternating between the left and right columns, which makes the content incomprehensible. Only correct tags and a defined reading order make the document accessible to assistive technologies.
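The interleaving effect can be illustrated with a small sketch. The `TextLine` type, the coordinates and the `split_x` threshold are hypothetical and chosen for illustration only; a fixed split point is an assumption that holds only for a clean two-column page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextLine:
    text: str
    x: float  # left edge of the line on the page
    y: float  # distance from the top of the page (illustrative convention)


def naive_order(lines: list[TextLine]) -> list[str]:
    """Top to bottom, then left to right: roughly what happens without tags."""
    return [line.text for line in sorted(lines, key=lambda l: (l.y, l.x))]


def column_order(lines: list[TextLine], split_x: float) -> list[str]:
    """Read the whole left column first, then the whole right column."""
    left = sorted((l for l in lines if l.x < split_x), key=lambda l: l.y)
    right = sorted((l for l in lines if l.x >= split_x), key=lambda l: l.y)
    return [line.text for line in left + right]


# A two-column page with two lines per column.
PAGE = [
    TextLine("Left 1", 72, 100), TextLine("Right 1", 320, 100),
    TextLine("Left 2", 72, 114), TextLine("Right 2", 320, 114),
]

if __name__ == "__main__":
    print(naive_order(PAGE))
    print(column_order(PAGE, split_x=300))
```

The first call alternates between the columns, the second keeps each column together. In a real PDF, this ordering is not computed from coordinates but stored explicitly in the tag tree.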

Additionally, many PDFs exist only as scanned images, such as contracts, invoices or old documents. A scanned PDF contains no text, only an image of the page. Without OCR (Optical Character Recognition) processing, the text content is completely invisible to screen readers. The combination of missing text and missing structure makes such documents one of the most serious barriers in document accessibility.

The PDF/UA Standard: Understanding Requirements

The PDF/UA standard (Universal Accessibility, ISO 14289) defines technical requirements for accessible PDF documents. The first version PDF/UA-1 was published in 2012, the current version PDF/UA-2 in 2024. The standard requires that every content element in the document is tagged, all images have alt text, the reading order is correctly defined, the document language is set and forms have accessible labels.

PDF/UA-1 is based on PDF 1.7 and defines minimum requirements for tag structure, use of standard tags (H1-H6, P, Table, Figure, List) and designation of artifacts for decorative elements. PDF/UA-2 extends the standard to PDF 2.0 and introduces improved annotation types, MathML support and namespace processing.

For BFSG compliance, PDF/UA-1 is the practical benchmark for PDF documents. EN 301 549 states its requirements for non-web documents in terms of WCAG success criteria, and a PDF/UA-1-conformant tag structure is the established technical basis for meeting them in PDF. Testing is performed with specialized tools that automatically validate the tag tree, reading order, alt text and table structure against the standard.

Tagged PDFs: The Foundation of Accessibility

Tags in a PDF function much like HTML elements: they define the semantic structure of the document. The tag tree assigns a role to each content element: H1 for the main heading, P for paragraphs, Table for tables, Figure for images, L and LI for lists and list items, Link for hyperlinks. Elements that carry no content, such as page numbers, running headers and footers or decorative lines, are marked as artifacts and ignored by screen readers.

The tag tree simultaneously defines the reading order. Screen readers read tags in the order of the tag tree, not in visual order on the page. For a two-column layout, the tag tree must ensure the entire left column is read first, then the right. For complex layouts with sidebars, it must define when the sidebar appears in the reading flow.
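How the tag tree determines reading order can be sketched as a depth-first walk. The `Tag` class and the sample document are hypothetical, and modeling artifacts as tree nodes is a simplification made for this illustration; it is not how any particular PDF library represents them.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Tag:
    role: str   # e.g. "Document", "Sect", "H1", "P", "Figure", "Artifact"
    text: str = ""
    children: list[Tag] = field(default_factory=list)


def reading_order(node: Tag) -> Iterator[tuple[str, str]]:
    """Yield (role, text) in tree order, skipping artifacts entirely."""
    if node.role == "Artifact":
        return
    if node.text:
        yield node.role, node.text
    for child in node.children:
        yield from reading_order(child)


# Two-column page: both columns sit in one section, left column first.
DOC = Tag("Document", children=[
    Tag("H1", "Annual Report"),
    Tag("Sect", children=[
        Tag("P", "Left column text"),
        Tag("P", "Right column text"),
    ]),
    Tag("Artifact", "Page 3"),
    Tag("P", "Sidebar note"),
])

if __name__ == "__main__":
    for role, text in reading_order(DOC):
        print(f"{role}: {text}")
```

The position of the sidebar paragraph in the tree, not its position on the page, decides when it is read.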

Tag creation ideally occurs in the source document. Word documents that consistently use styles produce a basic tag tree on PDF export. InDesign offers extended tagging functions for complex layouts. Already exported PDFs can be tagged retroactively in Acrobat Pro, though this is more labor-intensive than creation in the source format.

Alt Text, Tables and Forms in PDFs

Alt text for images in PDFs follows the same principles as on the web: informative images receive descriptive alt text, decorative images are marked as artifacts. In Acrobat Pro, alt text is assigned via the tag panel by selecting the Figure tag and entering the alternative text in the tag properties. Complex images like charts can additionally receive a longer description text.
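A rough pre-check for alt text quality might look like the following sketch. The heuristics (file-name pattern, list of placeholder words) are assumptions made for this example and are not requirements taken from PDF/UA; whether a description is actually meaningful still needs human review.

```python
import re
from typing import Optional

# Assumed heuristic: alt text ending in an image extension is a file name.
FILENAME_PATTERN = re.compile(r"\.(png|jpe?g|gif|tiff?|bmp|svg)$", re.IGNORECASE)
# Assumed list of placeholder words that carry no information.
PLACEHOLDERS = {"image", "picture", "graphic", "figure"}


def alt_text_problem(alt: Optional[str]) -> Optional[str]:
    """Return a short problem description, or None if nothing was flagged."""
    cleaned = (alt or "").strip()
    if not cleaned:
        return "missing alt text"
    if FILENAME_PATTERN.search(cleaned):
        return "alt text looks like a file name"
    if cleaned.lower() in PLACEHOLDERS:
        return "alt text is a generic placeholder"
    return None


if __name__ == "__main__":
    for sample in [None, "IMG_0042.JPG", "Image", "Bar chart of sales by quarter"]:
        print(repr(sample), "->", alt_text_problem(sample))
```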

Tables in PDFs require special care. Every table must be structured as a Table tag with TR (Table Row), TH (Table Header) and TD (Table Data) tags. Header association defines which cells are headers and which data cells they are associated with. Without this association, a screen reader cannot meaningfully read the table: the user hears only a sequence of cell values without context.
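The effect of header association can be sketched with a simple grid model. The `Cell` type and the `announce` function are hypothetical, and the lookup (nearest TH above and to the left) assumes a plain table without merged cells; it only illustrates what a user hears with and without TH tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    role: str  # "TH" for header cells, "TD" for data cells
    text: str


def announce(table: list[list[Cell]], row: int, col: int) -> str:
    """Describe one cell together with its column and row headers."""
    headers = []
    # Column header: nearest TH above in the same column.
    for r in range(row - 1, -1, -1):
        if table[r][col].role == "TH":
            headers.append(table[r][col].text)
            break
    # Row header: nearest TH to the left in the same row.
    for c in range(col - 1, -1, -1):
        if table[row][c].role == "TH":
            headers.append(table[row][c].text)
            break
    return ", ".join(headers + [table[row][col].text])


TAGGED = [
    [Cell("TH", "Product"), Cell("TH", "Price"), Cell("TH", "Stock")],
    [Cell("TH", "Widget"), Cell("TD", "9.99 EUR"), Cell("TD", "12")],
]
# Same content, but every cell is tagged as plain data.
UNTAGGED = [[Cell("TD", cell.text) for cell in row] for row in TAGGED]

if __name__ == "__main__":
    print(announce(TAGGED, 1, 1))
    print(announce(UNTAGGED, 1, 1))
```

With headers, the price cell is announced in context; without them, only the bare value remains.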

PDF forms must contain accessible form fields with labels (tooltips), sensible default values and a defined tab order. Every form field needs a description that the screen reader reads aloud. Required fields are marked as such, and validation rules can carry hint text. Our training programs teach the practical skills for creating accessible PDF forms.
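The form requirements above can be expressed as a simple check. The `FormField` record and its attribute names are hypothetical stand-ins for whatever a PDF library exposes; the sketch covers only label and tab order, not validation hints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormField:
    name: str
    tooltip: Optional[str]    # label read by the screen reader
    tab_index: Optional[int]  # position in the tab order, None if undefined
    required: bool = False


def form_issues(fields: list[FormField]) -> list[str]:
    """List fields without a label or without a place in the tab order."""
    issues = []
    for f in fields:
        if not (f.tooltip or "").strip():
            issues.append(f"{f.name}: missing tooltip label")
        if f.tab_index is None:
            issues.append(f"{f.name}: no position in tab order")
    indices = [f.tab_index for f in fields if f.tab_index is not None]
    if len(indices) != len(set(indices)):
        issues.append("duplicate tab order positions")
    return issues


FIELDS = [
    FormField("email", "Email address", 1, required=True),
    FormField("phone", None, None),
    FormField("zip", "Postal code", 1),
]

if __name__ == "__main__":
    for issue in form_issues(FIELDS):
        print(issue)
```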

Tag Structure

Complete tag tree with H1-H6, P, Table, Figure, List. Decorative elements marked as artifacts.

Alt Text

Every informative image with descriptive alt text. Complex graphics with extended description.

Table Headers

TH tags with correct scope. Header association for multi-dimensional tables. Empty cells marked.

Reading Order

Tag tree defines the logical reading order. Multi-column layouts correctly sequenced.

Forms

All fields with tooltip labels. Tab order defined. Required fields and validation.

Bookmarks

Navigation bookmarks for documents over 20 pages. Mirror the heading hierarchy.
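Bookmarks that mirror the heading hierarchy can be derived from the heading tags. The following is a minimal sketch; the `Bookmark` class and the input format, a flat list of (level, title) pairs, are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Bookmark:
    title: str
    level: int
    children: list[Bookmark] = field(default_factory=list)


def build_bookmarks(headings: list[tuple[int, str]]) -> list[Bookmark]:
    """Nest headings (level, title) into a bookmark tree."""
    roots: list[Bookmark] = []
    stack: list[Bookmark] = []  # path from the root to the current heading
    for level, title in headings:
        node = Bookmark(title, level)
        # Climb up until the top of the stack is a shallower heading.
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


HEADINGS = [(1, "Manual"), (2, "Setup"), (3, "Unpacking"), (2, "Operation")]

if __name__ == "__main__":
    def show(nodes: list[Bookmark], depth: int = 0) -> None:
        for node in nodes:
            print("  " * depth + node.title)
            show(node.children, depth + 1)

    show(build_bookmarks(HEADINGS))
```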

Creating PDFs from Word and InDesign

Microsoft Word is the most common source tool for PDF creation. When Word documents consistently use styles (Heading 1, Heading 2, Body Text), the PDF export via Save as PDF produces a basic tag tree. Images with alt text in Word automatically receive Figure tags with alt text in the PDF. Lists are exported as List tags. The quality of the PDF export depends directly on the quality of the Word formatting.
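Whether a Word file uses heading styles at all can be checked before export. The sketch below reads `word/document.xml` from the .docx archive using only the Python standard library. Matching style IDs that start with "heading" is an assumption: it fits English-language Word, while localized installations may use different IDs. `demo_docx` builds a deliberately tiny, incomplete archive for demonstration only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_style_ids(docx_bytes: bytes) -> list[str]:
    """Return the style ID of every paragraph ('' if none is set)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    styles = []
    for para in root.iter(f"{{{W}}}p"):
        style = para.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        styles.append(style.get(f"{{{W}}}val", "") if style is not None else "")
    return styles


def uses_heading_styles(docx_bytes: bytes) -> bool:
    """True if at least one paragraph has a style ID starting with 'heading'."""
    return any(s.lower().startswith("heading") for s in paragraph_style_ids(docx_bytes))


def demo_docx() -> bytes:
    """Build a minimal in-memory archive: one heading, one unstyled paragraph."""
    xml = (
        f'<w:document xmlns:w="{W}"><w:body>'
        '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr></w:p>'
        '<w:p/>'
        '</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()


if __name__ == "__main__":
    print(paragraph_style_ids(demo_docx()))
    print(uses_heading_styles(demo_docx()))
```

A document in which every paragraph returns an empty style ID is likely formatted manually and will export without a useful heading structure.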

Adobe InDesign offers extended capabilities for accessible PDFs. Via the tag panel, tags can be manually assigned and reading order defined. InDesign supports assigning alt text to images, defining table headers and creating accessible forms. For complex layouts with multiple columns, sidebars and embedded graphics, InDesign is the tool of choice.

Regardless of the creation path, every PDF must be tested and if necessary post-processed after export. The PDF Accessibility Checker (PAC 2024) is the most comprehensive automated tool and tests against PDF/UA-1 and WCAG 2.2. PAC identifies missing tags, incorrect tag assignments, missing alt text, reading order problems and missing document properties like language and title.
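The document-level properties mentioned above can be sketched as a simple check. Plain dictionaries stand in for the PDF document catalog here; the key names are the catalog entries defined by the PDF specification, but the function itself is a hypothetical illustration and no substitute for a full checker such as PAC.

```python
from typing import Any, Mapping


def document_property_issues(catalog: Mapping[str, Any]) -> list[str]:
    """Flag missing document-level accessibility properties."""
    issues = []
    if "/StructTreeRoot" not in catalog:
        issues.append("no structure tree (document is untagged)")
    if not catalog.get("/MarkInfo", {}).get("/Marked", False):
        issues.append("/MarkInfo /Marked is not true")
    if not str(catalog.get("/Lang", "")).strip():
        issues.append("no document language (/Lang)")
    if not catalog.get("/ViewerPreferences", {}).get("/DisplayDocTitle", False):
        issues.append("document title is not displayed (/DisplayDocTitle)")
    return issues


# Hypothetical catalog of a correctly prepared document.
TAGGED_CATALOG = {
    "/StructTreeRoot": {},
    "/MarkInfo": {"/Marked": True},
    "/Lang": "de-DE",
    "/ViewerPreferences": {"/DisplayDocTitle": True},
}

if __name__ == "__main__":
    print(document_property_issues(TAGGED_CATALOG))
    print(document_property_issues({}))
```

Passing these checks says nothing about tag quality or reading order; it only confirms that the basic prerequisites are present.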

BFSG Requirements: Which PDFs Are Affected

With the Barrierefreiheitsstärkungsgesetz (BFSG, German Accessibility Strengthening Act), in effect since June 28, 2025, digital documents in e-commerce are also subject to accessibility requirements. This particularly affects product data sheets, terms and conditions, withdrawal instructions, invoices, delivery notes, manuals, assembly instructions and any other PDF that is part of the digital ordering process. Companies that do not provide these documents in an accessible format risk warnings and fines.

Prioritization should be risk-based: documents directly connected to the purchasing process (terms and conditions, withdrawal instructions, invoices) have the highest priority. Marketing materials and historical documents can be addressed subsequently. A structured implementation plan helps fulfill the requirements systematically and economically.

Language and Reading Order: Often Overlooked Requirements

A frequently overlooked aspect of accessible PDFs is correct language tagging. Screen readers use the language stored in the document to select the appropriate pronunciation engine. If the language specification is missing or wrong, the screen reader reads German text with English pronunciation, and the result is incomprehensible. WCAG 2.2 (Success Criterion 3.1.1) requires that the default language of the document be programmatically determinable.

For multilingual documents, the requirement goes further: every passage written in a language different from the document language must be individually tagged (Success Criterion 3.1.2). A German PDF containing an English quote must mark that quote as English so the screen reader automatically switches to the English voice. In practice, this frequently affects technical terms, foreign-language product names and legal citations.
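How a screen reader resolves the language of each passage can be sketched as inheritance down the tag tree: an element without its own language uses that of its nearest ancestor. The `Element` class and the sample content are hypothetical and serve only to illustrate the principle.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Element:
    role: str
    text: str = ""
    lang: Optional[str] = None  # explicit language of this element, if any
    children: list[Element] = field(default_factory=list)


def spoken_languages(node: Element, inherited: str = "") -> Iterator[tuple[str, str]]:
    """Yield (language, text); elements without a language inherit one."""
    lang = node.lang or inherited
    if node.text:
        yield lang, node.text
    for child in node.children:
        yield from spoken_languages(child, lang)


# German document with one English phrase marked separately.
DOC = Element("Document", lang="de", children=[
    Element("P", "Die Software wird geliefert"),
    Element("Span", "as is", lang="en"),
    Element("P", "ohne Garantie."),
])

if __name__ == "__main__":
    for lang, text in spoken_languages(DOC):
        print(f"[{lang}] {text}")
```

If the `lang` value on the English span were missing, the phrase would inherit "de" and be read with German pronunciation.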

Reading order is another quality criterion that is invisible in the visual presentation but crucial for assistive technologies. PDFs converted from multi-column layouts often have an incorrect reading order: the screen reader jumps between columns instead of reading them one after the other. The correct reading order must be explicitly defined in the tag tree, which is particularly relevant for brochures, flyers and catalogs frequently used in e-commerce.

Making Existing PDFs Retroactively Accessible

Many companies have a large inventory of PDFs that are not accessible. Retroactive accessibility requires tagging in Acrobat Pro: the entire content is reviewed, each element receives the appropriate tag, reading order is corrected, alt text is added and artifacts are marked. For simple documents, this process takes 15 to 30 minutes per page; for complex layouts, considerably longer.
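For planning purposes, the per-page figure above translates into a rough effort range. The function below is a back-of-the-envelope sketch that applies only to simple documents; the function name and the 40-page example are illustrative.

```python
def remediation_hours(
    pages: int, minutes_per_page: tuple[int, int] = (15, 30)
) -> tuple[float, float]:
    """Return the (low, high) effort estimate in hours for simple documents."""
    low, high = minutes_per_page
    return pages * low / 60, pages * high / 60


if __name__ == "__main__":
    low, high = remediation_hours(40)
    print(f"40 pages: {low:.1f} to {high:.1f} hours")
```

A 40-page document therefore needs roughly 10 to 20 hours of tagging work, which is why re-exporting from a corrected source file is often cheaper.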

For scanned PDFs, additional OCR processing is necessary. Acrobat Pro and specialized OCR software recognize text in the scan and create a searchable text layer. OCR quality depends on scan resolution and print quality. After OCR, the document must be tagged and reading order defined. For large document inventories, prioritization is recommended: first make the most-downloaded and legally relevant documents accessible.

An alternative to retroactive PDF editing is recreation from the source document. If the Word or InDesign original is available, it is often more efficient to apply styles and tags to the source document and re-export than to retroactively tag the PDF. Long-term, we recommend establishing an accessible document workflow that covers all steps from creation through testing to publication.

Testing and Validating PDF Accessibility

Testing PDF accessibility combines automated tests with manual validation. The automated part is covered by a checker such as PAC 2024, which validates the machine-checkable requirements of PDF/UA-1.

Manual testing includes reading the document with a screen reader to check the actual user experience. Is content read in the correct order? Are all images described? Can tables be meaningfully navigated? Are form fields correctly labeled? These tests reveal problems that automated tools cannot detect.

For companies with many documents, we recommend a systematic process: create an inventory of all PDFs, prioritize by relevance and access frequency, address the most important documents first and simultaneously establish an accessible creation process for new documents. A professional WCAG audit also includes testing of provided PDF documents.

Validation of accessible PDFs occurs on three levels. First, automated testing with tools that analyze the tag tree, reading order and technical PDF/UA requirements. Second, manual testing with a screen reader to evaluate the actual user experience: a technically valid PDF can still be difficult to understand if the text structure is not logical. Third, content review: are alt texts meaningful? Are tables sensibly labeled? Are forms operable? Only when all three levels pass can a PDF be considered accessible.

For organizations with large PDF inventories, a prioritized approach is recommended: first, the most frequently accessed documents are identified, typically product catalogs, price lists, terms and conditions and order forms. These are remediated first because they affect the most users. Less frequently accessed documents follow in a second wave. New documents are created accessibly from the start by adapting the creation processes accordingly. This pragmatic approach enables gradual BFSG compliance without overwhelming ongoing operations.
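The prioritization can be sketched as a simple sort over the inventory. The `PdfRecord` fields, the download figures and the rule "purchase-process documents first, then by download count" are assumptions for illustration; a real inventory would use whatever access statistics are available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfRecord:
    name: str
    monthly_downloads: int
    in_purchase_flow: bool  # e.g. terms and conditions, invoices, order forms


def remediation_waves(
    inventory: list[PdfRecord], first_wave_size: int
) -> tuple[list[str], list[str]]:
    """Split the inventory into a first wave and the remaining documents."""
    ranked = sorted(
        inventory,
        key=lambda d: (not d.in_purchase_flow, -d.monthly_downloads),
    )
    names = [d.name for d in ranked]
    return names[:first_wave_size], names[first_wave_size:]


INVENTORY = [
    PdfRecord("catalog.pdf", 900, False),
    PdfRecord("terms.pdf", 300, True),
    PdfRecord("invoice-template.pdf", 50, True),
    PdfRecord("brochure-2019.pdf", 10, False),
]

if __name__ == "__main__":
    first, later = remediation_waves(INVENTORY, first_wave_size=2)
    print("First wave:", first)
    print("Second wave:", later)
```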

This article is based on data from: ISO 14289-1 PDF/UA Standard (2012), ISO 14289-2 PDF/UA-2 (2024), European Disability Forum PDF Accessibility Report (2024), W3C WCAG 2.2 Recommendation (2023), PAC 2024 Documentation.
