Study Shows Ensuring Reproducibility in Research Is Needed – IEEE Spectrum

“About 60 percent of IEEE conferences, magazines, and journals have no practices in place to ensure reproducibility of the research they publish. That’s according to a study by an ad hoc committee formed by the IEEE Computer Society to investigate the matter and suggest remedies.

Reproducibility—the ability to repeat a line of research and obtain consistent results—can help confirm the validity of scientific discoveries, IEEE Fellow Manish Parashar points out. He is chair of the society’s Committee on Open Science and Reproducibility….

The goal of the ad hoc committee’s study was to ensure that research results IEEE publishes are reproducible and that readers can look at the results and “be confident that they understand the processes used to create those results and they can reproduce them in their labs,” Parashar says….

Here are three key recommendations from the report:

1. Researchers should include specific, detailed information about the products they used in their experiments. When naming the software program, for example, authors should include the version and all necessary computer code that was written. In addition, journals should make submitting the information easier by adding a step to the submission process. The survey found that 22 percent of the society’s journals, magazines, and conferences already have infrastructure in place for submitting such information.

2. All researchers should include a clear, specific, and complete description of how the reported results were reached. That includes input data, computational steps, and the conditions under which the experiments and analysis were performed.

3. Journals and magazines, as well as scientific societies requesting submissions for their conferences, should develop and disclose policies for achieving reproducibility. Guidelines should include such information as how papers will be evaluated for reproducibility and the criteria that code and data must meet….”
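As a minimal sketch of what the first two recommendations ask authors to capture, the snippet below records the interpreter version, the platform, and a SHA-256 fingerprint of each input file. The function name and record layout are illustrative assumptions, not taken from the report.

```python
# Illustrative sketch: collect the environment details a reproducibility
# record might disclose (software versions, inputs used). Names and the
# JSON layout are invented for this example.
import hashlib
import json
import platform
import sys


def environment_record(input_files):
    """Return a dict with the interpreter version, the platform string,
    and a SHA-256 digest of each input file."""
    record = {
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "inputs": {},
    }
    for path in input_files:
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(8192), b""):
                digest.update(chunk)
        record["inputs"][path] = digest.hexdigest()
    return record


if __name__ == "__main__":
    # Writing this record alongside the results lets readers check that
    # they are rerunning the same code on the same data.
    print(json.dumps(environment_record([]), indent=2))
```

Archiving such a record with the published results gives reviewers a concrete artifact to check against when they rerun the analysis.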

Association for Computing Machinery (ACM) Open Access Agreement | Scholarly Publishing – MIT Libraries

“The MIT Libraries has negotiated an innovative open access agreement with the Association for Computing Machinery (ACM) that allows MIT authors to make ACM articles freely available at no cost to them.

Under the agreement, MIT corresponding authors can make all articles and conference proceedings in the ACM Digital Library open access immediately at no cost to the author. Instead of authors paying individual article charges, MIT is paying ACM a single bulk fee to cover both article publication costs and subscription access. Authors who elect open access may select a Creative Commons license for article sharing and reuse.

The pilot agreement runs from January 2020 through December 31, 2022, and applies to manuscripts submitted and articles published during that period….”

Research Data Management Challenges in Citizen Science Projects and Recommendations for Library Support Services. A Scoping Review and Case Study

Abstract:  Citizen science (CS) projects are part of a new era of data aggregation and harmonisation that facilitates interconnections between different datasets. Increasing the value and reuse of CS data has received growing attention with the appearance of the FAIR principles and systematic research data management (RDM) practices, which are often promoted by university libraries. However, RDM initiatives in CS appear diversified, and it is unclear whether CS projects have special needs in terms of RDM. Therefore, the aim of this article is, firstly, to identify RDM challenges for CS projects and, secondly, to discuss how university libraries may support any such challenges.

A scoping review and a case study of Danish CS projects were performed to identify RDM challenges. Forty-eight articles were selected for data extraction, and four academic project leaders were interviewed about RDM practices in their CS projects.

Challenges and recommendations identified in the review and case study are often not specific for CS. However, finding CS data, engaging specific populations, attributing volunteers and handling sensitive data including health data are some of the challenges requiring special attention by CS project managers. Scientific requirements or national practices do not always encompass the nature of CS projects.

Based on the identified challenges, it is recommended that university libraries focus their services on 1) identifying legal and ethical issues that the project managers should be aware of in their projects, 2) elaborating these issues in a Terms of Participation that also specifies data handling and sharing to the citizen scientist, and 3) motivating the project manager to adopt good data handling practices. Adhering to the FAIR principles and good RDM practices in CS projects will continuously secure contextualisation and data quality. High data quality increases the value and reuse of the data and, therefore, the empowerment of the citizen scientists.

ACM Joins Initiative for Open Abstracts

“ACM, the Association for Computing Machinery, has joined the Initiative for Open Abstracts (I4OA), a collaboration between publishers, infrastructure organizations, librarians, and researchers to promote the open availability of abstracts.

By joining I4OA, ACM commits to making abstracts of articles published by ACM available in an open and machine-readable way. Abstracts will be submitted to Crossref, initially for journal articles published by ACM and, at a later stage, for conference papers as well. Bringing abstracts together in a common format in a global, cross-disciplinary database offers important opportunities for text mining, natural language processing, and artificial intelligence….”
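Once deposited, these abstracts become machine-actionable through the Crossref REST API (https://api.crossref.org/works/{DOI}), where a work record may carry an `abstract` field as JATS-flavored XML. The sketch below strips the JATS markup from such a field to recover plain text for mining; the sample payload is invented for illustration, and only the field layout follows the Crossref format.

```python
# Illustrative sketch: turn a Crossref work record's JATS-flavored
# abstract into plain text. The sample record below is fabricated;
# only the "message"/"abstract" field layout mirrors Crossref's API.
import json
import re


def plain_abstract(work_message):
    """Return the abstract of a Crossref work record as plain text,
    or None when no abstract was deposited."""
    raw = work_message.get("abstract")
    if raw is None:
        return None
    # Strip JATS tags such as <jats:p> and collapse whitespace.
    text = re.sub(r"<[^>]+>", " ", raw)
    return " ".join(text.split())


sample = json.loads("""
{"message": {"DOI": "10.1145/0000000",
 "abstract": "<jats:p>An example abstract for text mining.</jats:p>"}}
""")
print(plain_abstract(sample["message"]))
```

A text-mining pipeline would apply the same extraction across thousands of records fetched from the API rather than a single embedded sample.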

Free Open-Access Quantum Computer Now Operational

“A new Department of Energy open-access quantum computing testbed is ready for the public. Scientists from Indiana University recently became the first team to begin using Sandia National Laboratories’ Quantum Scientific Computing Open User Testbed, or QSCOUT.

Quantum computers are poised to become major technological drivers over the coming decades. But to get there, scientists need to experiment with quantum machines that relatively few universities or companies have. Now, scientists can use Sandia’s QSCOUT for research that might not be possible at their home institutions, without the cost or restrictions of using a commercial testbed….”

Computational social science: Obstacles and opportunities | Science

“An alternative has been to use proprietary data collected for market research (e.g., Comscore, Nielsen), with methods that are sometimes opaque and a pricing structure that is prohibitive to most researchers.

We believe that this approach is no longer acceptable as the mainstay of CSS, as pragmatic as it might seem in light of the apparent abundance of such data and limited resources available to a research community in its infancy. We have two broad concerns about data availability and access.

First, many companies have been steadily cutting back data that can be pulled from their platforms (5). This is sometimes for good reasons—regulatory mandates (e.g., the European Union General Data Protection Regulation), corporate scandal (Cambridge Analytica and Facebook)—however, a side effect is often to shut down avenues of potentially valuable research. The susceptibility of data availability to arbitrary and unpredictable changes by private actors, whose cooperation with scientists is strictly voluntary, renders this system intrinsically unreliable and potentially biased in the science it produces.

Second, data generated by consumer products and platforms are imperfectly suited for research purposes (6). Users of online platforms and services may be unrepresentative of the general population, and their behavior may be biased in unknown ways. Because the platforms were never designed to answer research questions, the data of greatest relevance may not have been collected (e.g., researchers interested in information diffusion count retweets because that is what is recorded), or may be collected in a way that is confounded by other elements of the system (e.g., inferences about user preferences are confounded by the influence of the company’s ranking and recommendation algorithms). The design, features, data recording, and data access strategy of platforms may change at any time because platform owners are not incentivized to maintain instrumentation consistency for the benefit of research.

For these reasons, research derived from such “found” data is inevitably subject to concerns about its internal and external validity, and platform-based data, in particular, may suffer from rapid depreciation as those platforms change (7). Moreover, the raw data are often unavailable to the research community owing to privacy and intellectual property concerns, or may become unavailable in the future, thereby impeding the reproducibility and replication of results….

Despite the limitations noted above, data collected by private companies are too important, too expensive to collect by any other means, and too pervasive to remain inaccessible to the public and unavailable for publicly funded research (8). Rather than eschewing collaboration with industry, the research community should develop enforceable guidelines around research ethics, transparency, researcher autonomy, and replicability. We anticipate that many approaches will emerge in coming years that will be incentive compatible for involved stakeholders….

Privacy-preserving, shared data infrastructures, designed to support scientific research on societally important challenges, could collect scientifically motivated digital traces from diverse populations in their natural environments, as well as enroll massive panels of individuals to participate in designed experiments in large-scale virtual labs. These infrastructures could be driven by citizen contributions of their data and/or their time to support the public good, or in exchange for explicit compensation. These infrastructures should use state-of-the-art security, with an escalation checklist of security measures depending on the sensitivity of the data. These efforts need to occur at both the university and cross-university levels. Finally, these infrastructures should capture and document the metadata that describe the data collection process and incorporate sound ethical principles for data collection and use….”

Dear Colleague Letter

“We would like to inform you about an upcoming major transition for the Journal of Field Robotics.

After 15 years of service, John Wiley and Sons, the publisher, has decided not to renew the contracts of the Editor in Chief (Sanjiv Singh) and the Managing Editor (Sanae Minick), and hence our term will expire at the end of 2020.

This comes after two years of discussions between new Wiley representatives and the Editorial Board that failed to converge on a common set of principles and procedures by which the journal should operate. The Editorial Board has unanimously decided to resign….

While this moment calls for creativity and collaboration with the scholarly community to find new models, Wiley is intent on making broad changes to the way that the Journal of Field Robotics is operated, guided mostly by an economic calculation to increase revenue and decrease costs. To do this, they have unilaterally decided to change the terms of the contract that had been constant since the JFR was started in 2005. Wiley has confronted a similar case (European Law Journal) with a similar outcome: the entire editorial board resigned in January 2020….”

Los Alamos National Laboratory Jobs – Digital Library Infrastructure Engineer (Software Developer 2/3) in Los Alamos, New Mexico, United States

“The Research Library ( https://www.lanl.gov/library/ ) seeks a Digital Library Infrastructure Engineer to help imagine, create, and sustain its digital library infrastructure. We support the Laboratory’s paramount mission to solve national security challenges through scientific excellence by delivering essential knowledge services. The durable value of these services depends on a foundation of effective and efficient software infrastructure, for which this role is instrumental.

This Software Engineer will work on a variety of projects supporting management, curation, discovery, dissemination, and preservation of institutional scientific content. Current initiatives involve upgrading specialized content discovery platforms, re-engineering data pipelines, modernizing core repository services, and adapting Agile software development and DevOps practices to our local context….”

Publishing computational research – a review of infrastructures for reproducible and transparent scholarly communication | Research Integrity and Peer Review | Full Text

Abstract:  Background

The trend toward open science increases the pressure on authors to provide access to the source code and data they used to compute the results reported in their scientific papers. Since sharing materials reproducibly is challenging, several projects have developed solutions to support the release of executable analyses alongside articles.

Methods

We reviewed 11 applications that can assist researchers in adhering to reproducibility principles. The applications were found through a literature search and interactions with the reproducible research community. An application was included in our analysis if it (i) was actively maintained at the time the data for this paper were collected, (ii) supported the publication of executable code and data, and (iii) was connected to the scholarly publication process. By investigating the software documentation and published articles, we compared the applications across 19 criteria, such as deployment options and features that support authors in creating, and readers in studying, executable papers.

Results

Of the 11 applications, eight allow publishers to self-host the system for free, whereas three provide paid services. Authors can submit an executable analysis using Jupyter Notebooks or R Markdown documents (10 of the applications support these formats). All approaches provide features to assist readers in studying the materials, e.g., one-click reproducible results or tools for manipulating the analysis parameters. Six applications allow materials to be modified after publication.

Conclusions

The applications predominantly support authors in publishing reproducible research through literate programming. For readers, most applications provide user interfaces to inspect and manipulate the computational analysis. The next step is to investigate the gaps identified in this review, such as the costs publishers should expect when hosting an application, the handling of sensitive data, and the impact on the review process.

ACM Signs New Open Access Agreements with Four Leading Universities | MIT Libraries News

“ACM, the Association for Computing Machinery, entered into transformative open access agreements with several of its largest institutional customers, including the University of California (UC), Carnegie Mellon University (CMU), Massachusetts Institute of Technology (MIT), and Iowa State University (ISU). The agreements, which run for three-year terms beginning January 1, 2020, cover both access to and open access publication in ACM’s journals, proceedings and magazines for these universities, and represent the first transformative open access agreements for ACM….”

About ACM’s Decision to Sign Letters Regarding OSTP’s Proposal to Mandate Zero Embargo of Research Articles

“There have been some strong reactions to ACM’s decision to sign on to letters to the White House Office of Science and Technology Policy (OSTP) as a response to a new directive that OSTP is preparing to issue. That directive would eliminate the current 12-month embargo period for opening U.S. federally funded research publications.

ACM both supports and enables open access models and has worked to support a long and growing list of open access initiatives (see https://www.acm.org/publications/openaccess), doing so in a responsible and sustainable way. For the past decade, all ACM authors have had the right to post accepted versions of their articles in pre-print servers, personal websites, funder websites, and institutional repositories with a zero embargo. More recently, for example, ACM has introduced the OpenTOC service that enables free full-text downloads from links on conference websites immediately upon publication.

It is important to understand why ACM opted to sign the letters opposed to the OSTP zero embargo directive. A long dialogue between OSTP and scholarly publishers led to broad agreement on the current policy (from 2013) of a 12-month embargo for digital libraries. However, due process was not followed for the proposed change to zero embargo. The new directive fails to take into account the significant progress that has been made by ACM and other societies with respect to open access publication since 2013 and there was no dialogue with stakeholders prior to proposing the change.”

Free Machine Learning Repository Increases Accessibility in Genome Research | Technology Networks

“Although the importance of machine learning methods in genome research has grown steadily in recent years, researchers have often had to resort to using obsolete software. Scientists in clinical research often did not have access to the most recent models. This will change with the new free open-access repository: Kipoi.

Kipoi enables an easy exchange of machine learning models in the field of genome research. The repository was created by Julien Gagneur, Assistant Professor of Computational Biology at the TUM, in collaboration with researchers from the University of Cambridge, Stanford University, the European Bioinformatics Institute (EMBL-EBI) and the European Molecular Biology Laboratory (EMBL)….”

arXiv Update – January 2019 – arXiv public wiki – Dashboard

“In 2018, the repository received 140,616 new submissions, a 14% increase from 2017. The subject distribution is evolving as Computer Science represented about 26% of overall submissions, and Math 24%. There were about 228 million downloads from all over the world. arXiv is truly a global resource, with almost 90% of supporting funds coming from sources other than Cornell and 70% of institutional use coming from countries other than the U.S….”