IDEA: Towards More Accessible Research Papers
On April 17, 2023, arXiv.org, a widely used preprint server, held the 2023 arXiv Accessibility Forum. The purpose of the forum was to educate attendees on the access needs of people with disabilities, discuss the work needed to meet those needs, and plan who is responsible for doing that work, all within the context of academia. Because arXiv is a preprint server, the focus was primarily on research papers, which create particular barriers for blind researchers. The forum was attended by NSD postdoc Tyler Hague, who provided a report to the Division at a recent staff meeting.
The most readily apparent barrier is access to the figures and graphics within a paper. The solution is to include alt-text (short-form, embedded into the image) and/or extended descriptions (longer-form, typically in a different location). Both are additional text that describes what is seen in the image and what underlying message is interpreted from it. This text contrasts with image captions, which tell the reader what the image is without telling them what it looks like. An added benefit discussed at the forum is that describing a figure often forces authors to consider more deeply the intended takeaway of the figure, as well as why it is useful for the publication.
Google Docs, for example, has this functionality.
The other primary barrier discussed was software and file formats. The most common tool a blind person uses to interface with a paper is a “screen reader”. At its core, this software takes the document as input and audibly speaks the text that it reads. When a screen reader encounters an image, it looks for alt-text to speak instead. PDF documents often create difficulties for these tools, which rely on specific PDF tags to indicate where the text is located.
Until very recently, LaTeX omitted these tags from the final document entirely. LaTeX still has no alt-text implementation, although one is in development (per the developers present at the forum). Even so, tagging alone does not solve the problem, as many PDF readers discard the tags when reading in a document.
While there is a push to improve the landscape of PDF documents, there was also discussion of moving towards HTML-based papers. HTML is the gold standard for accessible documents, as it is necessarily tagged and supports all major accessibility designs. Many journals have begun creating HTML versions of papers, and arXiv plans to merge its own implementation, ar5iv.org, with its main repository soon. The HTML version of a paper can be reached simply by replacing the X with a 5 in the URL (that is, changing arxiv.org to ar5iv.org). Right now, anyone interested in helping this effort can take the concrete step of checking the HTML versions of their papers for conversion errors and reporting them to the developers using the issue tracker linked at the bottom of the page.
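For anyone who wants to check several papers at once, the URL substitution described above can be sketched in a few lines of Python. This is only an illustration, not an official arXiv tool: the function name `to_ar5iv` is made up for this example, the paper identifier in the usage note is a placeholder, and the sketch assumes that swapping the host name is all that is needed.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Return the ar5iv.org address for an arxiv.org paper URL.

    Hypothetical helper: it assumes the HTML version lives at the same
    path on ar5iv.org, i.e. only the "x" in the host becomes a "5".
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arxiv.org URL: {url!r}")
    # Replace only the host; scheme, path, query and fragment are unchanged.
    return urlunsplit(
        (parts.scheme, "ar5iv.org", parts.path, parts.query, parts.fragment)
    )
```

For instance, `to_ar5iv("https://arxiv.org/abs/2301.00001")` (a placeholder identifier) gives `https://ar5iv.org/abs/2301.00001`. The resulting page still has to be read by a person to spot conversion errors.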
For researchers who attended the forum, the primary expectation was to practice, teach, and normalize the inclusion of alt-text and extended descriptions to ensure that research is accessible. To paraphrase an organizer of the conference: once we are aware of the needs of those in our community, to choose not to meet them is to choose to exclude them.
Recent DEI topics @ NSD Staff Meetings
To recognize their efforts in the area(s) of Inclusion, Diversity, Equity, and Accountability, the following people received a Luminary Card: Clara Barker (Oxford University), Teresa Calarco, Morgan Morse, Liz Stuart