A framework for improving the accessibility of research papers on arXiv.org

Abstract:  The research content hosted by arXiv is not fully accessible to everyone due to disabilities and other barriers. This matters because a significant proportion of people have reading and visual disabilities, it is important to our community that arXiv is as open as possible, and if science is to advance, we need wide and diverse participation. In addition, we have mandates to become accessible, and accessible content benefits everyone. In this paper, we will describe the accessibility problems with research, review current mitigations (and explain why they aren’t sufficient), and share the results of our user research with scientists and accessibility experts. Finally, we will present arXiv’s proposed next step towards more open science: offering HTML alongside existing PDF and TeX formats. An accessible HTML version of this paper is also available at https://info.arxiv.org/about/accessibility_research_report.html 

Access is not the same as accessibility: A framework for making research papers truly open – arXiv.org blog

“arXiv has pioneered open access for more than 30 years by removing financial, institutional, and geographic barriers to research. No paywalls or fees, no login required for reading. This approach – which gives researchers maximum control over the release of their results and broad visibility – transformed the research process and launched the open access movement.

However, access is not the same as accessibility, which is the practice of ensuring access regardless of disability. The vast majority of research papers posted to any journal or platform do not meet basic accessibility standards.

In 2022, arXiv completed intensive user research with over 40 people to determine the extent of the problem, evaluate current mitigation efforts, and consider solutions. This work, informed by arXiv staff, accessibility experts, and arXiv readers and authors who use assistive technology, is posted on arXiv in PDF and HTML formats (arXivID: 2212.07286).

In extensive interviews, our research participants shared that finding research, reading it, preparing documents, and submitting work are all steps in the research process where people encounter barriers. In particular, interpreting math equations, figures, and charts is problematic.

Flexible content can help address these issues. Offering well-formatted HTML, alongside PDF and TeX source, will lead to critical accessibility gains. arXiv’s collaboration with ar5iv, which currently renders HTML for approximately 70% of arXiv papers, is a first step in this process. Next, we expect to reduce the error rate and add a link to HTML on arXiv abstract pages….”

Open Inaccessibility

“When a PDF is downloaded, who can read it?

At the start of the year I discussed the social model of disability and inaccessibility in relation to open scholarship, but since then I have not done much more in a practical sense. Here’s the best explanation of the social model of disability I have seen…

Content inaccessibility came back on my radar again when I read a recent study about content accessibility improvements for arXiv. This paper calls content accessibility “the next frontier of open science.” As we see a simultaneous increase in user-generated content platforms for publishing, where there is less control over what and how things get published, I would agree and argue that accessibility will become a bigger topic quickly.

Some of my main takeaways and juxtapositions from this paper include:

There is clear content inaccessibility: only 30% of people using assistive technologies rate all research as accessible (vs. 59% of people not using assistive technologies).
HTML is preferred for accessibility, but non-disabled people prefer PDFs.
Biggest improvement areas for accessibility are (1) PDF formatting, (2) images (alt texts), (3) math accessibility (e.g., MathML for screenreaders), (4) making data in figures parseable by screen readers.
People who don’t use assistive technologies don’t know what is required of them to make accessible documents.
PDF is often preferred because it is easy/easier to save to reference managers….”

ar5iv – Articles from arXiv.org as responsive HTML5 web documents

Converted from TeX with LaTeXML.
Sources up to the end of 2021. Not a live preview service.
For articles with multiple revisions, only the initial v1 is made available.
Goal: incremental improvement until worthy of native arXiv adoption.

Sample: A Simple Proof of the Quadratic Formula (1910.06709)

View any arXiv article on ar5iv by changing the "X" in the URL's domain to a "5":

https://arxiv.org/abs/1910.06709
https://ar5iv.org/abs/1910.06709
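
The URL swap shown above can be sketched in a few lines of Python. This is a minimal illustration, not part of ar5iv itself: the function name `to_ar5iv` is hypothetical, and the sketch assumes the only change needed is replacing the host `arxiv.org` with `ar5iv.org`. Because ar5iv covers sources only up to the end of 2021 and only the initial v1, a rewritten URL is not guaranteed to resolve to a converted article.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Rewrite an arxiv.org article URL to its ar5iv.org counterpart.

    Hypothetical helper: it only swaps the host name and does not
    check whether ar5iv actually has a converted copy of the article.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arxiv.org URL: {url!r}")
    # Keep the scheme, path, query, and fragment; change only the host.
    return urlunsplit(
        (parts.scheme, "ar5iv.org", parts.path, parts.query, parts.fragment)
    )


print(to_ar5iv("https://arxiv.org/abs/1910.06709"))
# https://ar5iv.org/abs/1910.06709
```

Parsing the URL instead of doing a plain string replacement avoids accidentally rewriting an "x" elsewhere in the address, such as in the path or query string.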

Project MUSE offers nearly 300 “HTML5” open access books on re-designed platform | JHU Press

“Nearly 300 open access (OA) books are now available from Project MUSE, the highly-acclaimed online collection of humanities and social science scholarship, on a newly designed platform that represents a major step forward in OA publishing in these fields.  The books will be delivered in a highly-discoverable and adaptable format using user-friendly HTML5, rather than static PDFs, and will include titles from Johns Hopkins University Press, Cornell University Press, Duke University Press, University of Hawai’i Press, University of Michigan Press, Syracuse University Press, The MIT Press, and Temple University Press….

The initiative was made possible by a two-year grant of nearly $1 million from the Andrew W. Mellon Foundation, which concluded this summer.  Funds were used to develop an open source workflow for transforming epub files into HTML5 and, most importantly for users, launch a scholar-informed redesign of the Project MUSE interface that emphasizes simplicity, accessibility, and personalization. The redesigned platform includes robust support for discovery and linking, along with preservation with trusted third parties, assuring wide dissemination of OA book content on MUSE. The platform’s enhanced analytics services will help publishers understand the impact of making books freely available….”