Welcome to the Single Source Publishing Community | The Single Source Publishing Community (SSPC) is a network of stakeholders from the Open Science community who are interested in Single Source Publishing (SSP) for scholarly purposes, developing open-source software and engaging in advocacy.

The PDF is not enough: why science needs open formats – University Library

“In the project period from 2019 to 2021, the project bundled many years of experience in modern publishing at the Hamburg University of Technology (TUHH) and the Hamburg State and University Library (SUB) as part of the Hamburg Open Science (HOS) initiative. The goal: the development of a socio-technical system for single source publishing, i.e. for generating different output formats from one source format. It was based on open-source solutions such as GitLab and Open Journal Systems (OJS) to enable an open alternative to commercial and proprietary publishing offers for the publication of scientific results….

Former team members of the project have founded the Single Source Publishing Community (SSPC). It focuses on scientific writing and publishing with open tools and formats and is a meeting point for researchers, lecturers, publishers and developers. Under the motto “Collaborate more, compete less”, the active members of the community exchange ideas in their monthly meetings on current developments in their projects and discuss strategies for cultural change in the field of scientific publication….

Numerous open-source tools favor the desired sovereignty: software projects such as Open Journal Systems, Vivliostyle, Paged.js, Swapfire, FidusWriter, HedgeDoc, Quarto and, last but not least, pandoc are combined in different ways in the community projects to create alternative open systems.

Many projects use the Markdown format as a source to generate versions complementary to PDF in the form of HTML, JATS/XML and EPUB. The latter offer the advantage that they retain the semantic labeling of the information they contain and thus open up a wide range of possible applications in automated text mining processes. At the same time, the usability and reach of published scientific findings increase….”
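As a rough illustration of the single-source idea described above, the following Python sketch builds pandoc command lines that turn one Markdown file into HTML, JATS/XML and EPUB. It is only a sketch under stated assumptions: the file name `article.md`, the output directory and the helper names `build_commands` and `convert` are hypothetical, and the projects mentioned in the excerpt may combine their tools quite differently.

```python
from pathlib import Path
import shutil
import subprocess

# Output formats named in the excerpt, mapped from pandoc writer name
# to the file extension used for the result (the extensions are our choice).
TARGETS = {
    "html": ".html",
    "jats": ".xml",
    "epub": ".epub",
}


def build_commands(source: str, out_dir: str = "build") -> list[list[str]]:
    """Return one pandoc command line per target format (hypothetical helper)."""
    src = Path(source)
    commands = []
    for writer, ext in TARGETS.items():
        out = Path(out_dir) / (src.stem + ext)
        commands.append(
            ["pandoc", str(src), "--standalone", "--to", writer, "--output", str(out)]
        )
    return commands


def convert(source: str, out_dir: str = "build") -> None:
    """Run the commands; assumes a pandoc executable is installed and on PATH."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(source, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `convert("article.md")` would write the three outputs into `build/`, provided pandoc is available. Keeping command construction separate from execution makes the mapping from one source to several formats easy to inspect without running anything.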

New Leaves: Riffling the History of Digital Pagination

Abstract:  This article presents a new history of digital pagination. Virtual pagination works very differently from its print correlate. Despite this, encapsulated and paginated formats have gained a solid digital foothold. Nonetheless, many commentators have argued that we must overcome such a reliance on and continuity with print in the digital space. This article charts a fresh history of the development of digital pagination through a revisionist interrogation of three interrelated phenomena: 1. That digital pages do not behave as do their physical correlates but instead mimic earlier historical forms of print that fused pagination, scrolling, and the tablet form. 2. That the development of PDF was almost abandoned by Adobe’s board of directors, who could see no audience for it. 3. That there are other more robust lineages of constraint for digital pages from cinema and television. Drawing on new correspondence with the creators of the PDF format I argue from these historical tracings that nothing was sure about the development of textual pagination in the digital space. Further, the digital page almost never came to the prominence and dominance now presumed in discussions of digital reading.

Harmon | ETDplus Toolkit [Tool Review] | Journal of Librarianship and Scholarly Communication

Abstract:  Electronic theses and dissertations (ETDs) have traditionally taken the form of PDFs, and ETD programs and their submission and curation procedures have been built around this format. However, graduate students are increasingly creating non-PDF files during their research, and in some cases these files are just as important as, or more important than, the PDFs that must be submitted to satisfy degree requirements. As a result, both graduate students and ETD administrators need training and resources to support the handling of a wide variety of complex digital objects. The Educopia Institute’s ETDplus Toolkit provides a highly usable set of modules to address this need, openly licensed to allow for reuse and adaptation to a variety of potential use cases.


ResearchHub | Open Science Community

“ResearchHub’s mission is to accelerate the pace of scientific research. Our goal is to make a modern mobile and web application where people can collaborate on scientific research in a more efficient way, similar to what GitHub has done for software engineering.

Researchers are able to upload articles (preprint or postprint) in PDF form, summarize the findings of the work in an attached wiki, and discuss the findings in a completely open and accessible forum dedicated solely to the relevant article.

Within ResearchHub, papers are grouped in “Hubs” by area of research. Individual Hubs will essentially act as live journals within focused areas, with highly upvoted posts (i.e., the paper and its associated summary and discussion) moving to the top of each Hub.

To help bring this nascent community together and incentivize contribution to the platform, a new ERC20 token, ResearchCoin (RSC), has been created. Users receive RSC for uploading new content to the platform, as well as for summarizing and discussing research. Rewards for contributions are proportionate to how valuable the community perceives the actions to be, as measured by upvotes.”


If It’s Open, Is It Accessible? – Association of Research Libraries

“The library and open access (OA) publishing communities have made great strides in making more new scholarship openly available. But have we included readers with vision challenges in our OA plans? Only an estimated 7% of all printed works are available in accessible format, and that statistic might not significantly differ for digital scholarship worldwide….

Libraries need to consider accessibility of the document format, as well as accessibility of the tools and platforms they typically use for OA journal and monograph publishing, storage, and access. According to a blog post by the UX designer for the Directory of Open Access Journals last year, testing of a platform’s web interface can be done easily through free tools such as Lighthouse and Accessibility Insights for Web, both available as web browser extensions, which test accessibility against the World Wide Web Consortium (W3C) Web Content Accessibility Guidelines (WCAG) 2.1 AA.

Earlier this year, the Open Journal Systems (OJS) team at the Public Knowledge Project noted the strides that their Accessibility Interest Group team has made to improve the accessibility of OJS 3.3. Next up, they will be working on a guide to help journal editors create more accessible content within OJS.


This leads to the question of the format of open content. Adobe’s Portable Document Format (PDF), ubiquitous and a de facto standard for digital publishing, is typically not the best format for accessibility. Certainly, PDFs can be made WCAG-compliant, but one must make careful efforts to do so….”

Why most academic journals are following outdated publishing practices

“In his Medium article “Scholarly publishing is stuck in 1999,” Springer Nature product manager Stephen Cornelius reproaches the outdated publishing practices many academic journals are using to produce online content. He notes that, despite decades of technological advancement, “research publishing seems stuck with those that were employed when it first went online.” Cornelius points to many areas of digital journal publishing that have been designed to mirror print publishing, such as journals formatting online articles as print-based PDFs, despite there being better ways to produce and present content online….

PDFs are rife with limitations as compared to HTML because, unlike HTML, PDFs:

Cannot support embedded multi-media research files such as videos
Have a poor layout for online reading, generally using columns that require readers to scroll up and down to read content on the same page
Are nearly impossible to read on mobile devices because PDFs are a static page (whereas HTML can be made to have a responsive design)
Do not easily allow for clickable references within the text
Are overwhelmingly not search-optimized for online browsers…

A recent article in The Atlantic titled “The Scientific Paper Is Obsolete” explores the limitations of PDFs and the need for journals, particularly in STEM fields, to adopt internet-based publishing formats in order to support more dynamic presentations of research as well as to make it easier for readers to find articles online….”


PDF Data Extractor (PDE) – A Free Web Application and R Package Allowing the Extraction of Tables from Portable Document Format (PDF) Files and High-Throughput Keyword Searches of Full-Text Articles | bioRxiv

Abstract:  The PDF Data Extractor (PDE) R package is designed to perform comprehensive literature reviews for scientists at any stage in a user-friendly way. The PDE_analyzer_i() function permits the user to filter and search thousands of scientific articles using a simple user interface, requiring no bioinformatics skills. In the additional PDE_reader_i() interface, the user can then quickly browse the sentences with detected keywords, open the full-text article, when required, and convert tables conveniently from PDF files to Excel sheets (pdf2table). Specific features of the literature analysis include the adaptability of analysis parameters and the detection of abbreviations of search words in articles. In this article, we demonstrate and exemplify how the PDE package allows the user-friendly, efficient, and automated extraction of meta-data from full-text articles, which can aid in summarizing the existing literature on any topic of interest. As such, we recommend the use of the PDE package as the first step in conducting an extensive review of the scientific literature. The PDE package is available from the Comprehensive R Archive Network at https://CRAN.R-project.org/package=PDE.


HighWire at 25: Richard Sever (bioRxiv) looks back – Highwire Press

“10 years later I ended up working at Cold Spring Harbor myself, and continuing my relationship with HighWire from a new perspective. The arXiv preprint server for physics had launched in 1991, and my colleague John Inglis and I had often talked about whether we could do something similar for biology. I remember saying we could put together some of HighWire’s existing components, adapt them in certain ways and build something that would function as a really effective preprint server—and that’s what we did, launching bioRxiv in 2013. It was great then to be able to take that experiment to HighWire meetings to report back on. Initially there was quite a bit of skepticism from the community, who thought there were cultural barriers that meant preprints wouldn’t work well for biology, but 7 years and almost 100,000 papers later it’s still there, and still being served very well by HighWire.

When we launched bioRxiv we made it very explicit that we would not take clinical work, or anything involving patients. But the exponential growth of submissions to bioRxiv demonstrated that there was a demand and a desire for this amongst the biomedical community, and people were beginning to suggest that a similar model be trialed for medicine. A tipping point for me was an OpEd in the New York Times (Don’t Delay News of Medical Breakthroughs, 2015) by Eric Topol (Scripps Research) and Harlan Krumholz (Yale University), who would go on to become a co-founder of medRxiv….”

DAISY Publishes White Paper on the Benefits of EPUB 3 – The DAISY Consortium

“The DAISY Consortium has published a white paper encouraging the use of Born Accessible EPUB 3 files for corporate, government and university publications and documents. This important piece of work recognizes the work of the publishing industry, which has embraced EPUB 3 as its format of choice for ebooks and digital publishing, and focuses on how this same approach should be used for all types of digital content, both online and offline….”

New business models for the open research agenda | Research Information

“The rise of preprints and the move towards universal open access are potential threats to traditional business models in scholarly publishing, writes Phil Gooch.

Publishers have started responding to the latter with transformative agreements[1], but if authors can simply upload their research to a preprint server for immediate dissemination, comment and review, why submit to a traditional journal at all? Some journals are addressing this by offering authors frictionless submission direct from the preprint server. This tackles two problems at once: easing authors’ frustrations with existing journal submission systems[2], and providing a more direct route from the raw preprint to the richly linked, multiformat version of record that readers demand and accessibility standards require….

Dissemination of early-stage research as mobile-unfriendly PDF is arguably a technological step backwards. If preprints are here to stay, the reading experience needs to be improved. A number of vendors have developed native XML or LaTeX authoring environments which enable dissemination in richer formats….”

The Push to Replace Journal Supplements with Repositories | The Scientist Magazine®

“But it’s not just broken hyperlinks that frustrate scientists. As papers get more data-intensive and complex, supplementary files often become many times longer than the manuscript itself—in some extreme cases, ballooning to more than 100 pages. Because these files are typically published as PDFs, they can be a pain to navigate, so even if they are available, the information within them can get overlooked. “Most supplementary materials are just one big block and not very useful,” Cooper says.

Another issue is that these files are home to most of a study’s published data, and “you can’t extract data from PDFs except using complex software—and it’s a slow process that has errors,” Murray-Rust tells The Scientist. “This data is often deposited as a token of depositing data, rather than people actually wanting to reuse it.”…

Depositing material that would end up in supplementary files in places other than the journal is becoming an increasingly common practice. Some academics opt to post this information on their own websites, but many others are turning to online repositories offered by universities, research institutions, and companies. …

There are advantages these repositories provide over journal articles, according to Holt. For one, repositories offer the ability to better store and interact with large amounts of openly accessible data than journals typically do. In addition, repositories’ files are labelled with a digital object identifier (DOI), meaning researchers can easily link to them from a published article and make sure to get credit for their work….”