Assessment, Usability, and Sociocultural Impacts of DataONE | International Journal of Digital Curation

Abstract:  DataONE, funded from 2009-2019 by the U.S. National Science Foundation, is an early example of a large-scale project that built both a cyberinfrastructure and culture of data discovery, sharing, and reuse. DataONE used a Working Group model, where a diverse group of participants collaborated on targeted research and development activities to achieve broader project goals. This article summarizes the work carried out by two of DataONE’s working groups: Usability & Assessment (2009-2019) and Sociocultural Issues (2009-2014). The activities of these working groups provide a unique longitudinal look at how scientists, librarians, and other key stakeholders engaged in convergence research to identify and analyze practices around research data management through the development of boundary objects, an iterative assessment program, and reflection. Members of the working groups disseminated their findings widely in papers, presentations, and datasets, reaching international audiences through publications in 25 different journals and presentations to over 5,000 people at interdisciplinary venues. The working groups helped inform the DataONE cyberinfrastructure and influenced the evolving data management landscape. By studying working groups over time, the paper also presents lessons learned about the working group model for global large-scale projects that bring together participants from multiple disciplines and communities in convergence research.

 

COVID-19 Global Research Registry for Public Health and Social Sciences

“Sharing your information will help:

Highlight novel public health and social science research initiated in response to COVID-19
Expand opportunities for research collaboration and reduce duplication of effort
Identify unmet research needs
Create possibilities to share and publish research instruments, data collection and ethics protocols, and data
Set a comprehensive social science research agenda….”

Center for Advancement and Synthesis of Open Environmental Data and Sciences (nsf21549) | NSF – National Science Foundation

“NSF seeks to establish a Center fueled by open and freely available biological and other environmental data to catalyze novel scientific questions in environmental biology through the use of data-intensive approaches, team science and research networks, and training in the accession, management, analysis, visualization, and synthesis of large data sets. The Center will provide vision for speeding discovery through the increased use of large, publicly accessible datasets to address biological research questions through collaborations with scientists in other related disciplines. The Center will be an exemplar in open science and team science, fostering development of generalizable cyberinfrastructure solutions and community-driven standards for software, data, and metadata that support open and team science, and role-modeling best practices. Open biological and other environmental data are produced by NSF investments in research and infrastructure such as the National Ecological Observatory Network (NEON), the Ocean Observatories Initiative (OOI), the Long-Term Ecological Research (LTER) network, National Center for Atmospheric Research (NCAR), Critical Zone Observatories (CZOs), Integrated Digitized Biocollections (iDigBio), and the Global Biodiversity Information Facility (GBIF), as well as by many other public and private initiatives in the U.S. and worldwide. These efforts afford opportunities for collaborative investigation into, and predictive understanding of life on Earth to a far greater degree than ever before. The Center will help develop the teams, concepts, resources, and expertise to enable inclusive, effective, and coordinated efforts to answer the broad scientific questions for which these open data were designed, as well as key questions that emerge at interfaces between biology, informatics, and a breadth of environmental sciences. 
It will engage scientists diverse in their demography, disciplinary expertise, and geography, and in the institutions that they represent in collaborative, cross-disciplinary, and synthetic studies. It is expected that this new Center will build on decades of experience from NSF’s prior investments in other synthesis centers, while providing visionary leadership and advancement for data-intensive team science in a highly connected and increasingly virtual world. It will serve as an incubator for team-based, data-driven, and open research that includes cyberinfrastructure, tools, services, and application development and innovative and inclusive training programs. The Center is also expected to spur collaborative interactions among the facilities and initiatives that produce open biological and other environmental data, and cyberinfrastructure efforts that support the curation and use of those data, such as Biological and Chemical Oceanography Data Management Office (BCO-DMO), CyVerse, Environmental Data Initiative (EDI), DataOne, EarthCube, and Cyberinfrastructure (CI) Centers for Excellence, to address compelling research questions and to enable training and data product and tool development. The new Center will further enable data-driven discovery through immersive education and training experiences to provide the advanced skills needed to maximize the scientific potential of large volumes of available open data.”

OSF | Center for Open Science – NSF 21-511 AccelNet-Implementation-Community of Open Science Grassroots Networks (COSGN).pdf

“Overview. The Community of Open Scholarship Grassroots Networks (COSGN), includes 107 grassroots networks representing virtually every region of the world and every research discipline. These networks communicate and coordinate on topics of common interest. We propose, using an NSF 21-515 Implementation grant, to formalize governance and coordination of the networks to maximize impact and establish standard practices for sustainability. In the project period, we will increase the capacity of COSGN to advance the research and community goals of the participating networks individually and collectively, and establish governance, succession planning, shared resources and communication pathways to ensure an active community sustained network of networks. By the end of the project period, we will have established a self-sustaining network of networks that leverages disciplinary and regional diversity, actively collaborates across networks for grassroots organizing, and shares resources for maximum impact on culture change for open scholarship.”

Harnessing the Data Revolution (HDR): Institutes for Data-Intensive Research in Science and Engineering (nsf21519) | NSF – National Science Foundation

“In 2016, the National Science Foundation (NSF) unveiled a set of “Big Ideas,” 10 bold, long-term research and process ideas that identify areas for future investment at the frontiers of science and engineering (see https://www.nsf.gov/news/special_reports/big_ideas/index.jsp). The Big Ideas represent unique opportunities to position our Nation at the cutting edge of global science and engineering by bringing together diverse disciplinary perspectives to support convergent research. When responding to this solicitation, even though proposals must be submitted through the Office of Advanced Cyberinfrastructure (OAC) within the Directorate for Computer and Information Science and Engineering (CISE), once received the proposals will be managed by a cross-disciplinary team of NSF Program Directors.

NSF’s Harnessing the Data Revolution (HDR) Big Idea is a national-scale activity to enable new modes of data-driven discovery that will allow fundamental questions to be asked and answered at the frontiers of science and engineering.

This solicitation will establish a group of HDR Institutes for data-intensive research in science and engineering that can accelerate discovery and innovation in a broad array of research domains. The HDR Institutes will lead innovation by harnessing diverse data sources and developing and applying new methodologies, technologies, and infrastructure for data management and analysis. The HDR Institutes will support convergence between science and engineering research communities as well as expertise in data science foundations, systems, applications, and cyberinfrastructure. In addition, the HDR Institutes will enable breakthroughs in science and engineering through collaborative, co-designed programs to formulate innovative data-intensive approaches to address critical national challenges….”

New Report Provides Recommendations for Effective Data Practices Based on National Science Foundation Research Enterprise Convening – Association of Research Libraries

“Today a group of research library and higher education leadership associations released Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support. In this new report, experts from library, research, and scientific communities provide key recommendations for effective data practices to support a more open research ecosystem. In December 2019, an invitational conference was convened by the Association of Research Libraries (ARL), the California Digital Library (CDL), the Association of American Universities (AAU), and the Association of Public and Land-grant Universities (APLU). The conference was sponsored by the US National Science Foundation (NSF).

The conference focused on designing guidelines for (1) using persistent identifiers (PIDs) for data sets, and (2) creating machine-readable data management plans (DMPs), two data practices that were recommended by NSF. Professor Joel Cutcher-Gershenfeld, of Heller School for Social Policy and Management at Brandeis University, designed and facilitated the convening with the project team….”

SciENcv and ORCID to Streamline NIH and NSF Grant Applications – LYRASIS NOW

“SciENcv is a tool managed by the National Institutes of Health (NIH) that allows researchers to create a biographical sketch (biosketch) to submit with their grant proposals for funding from NIH, and it can now also be used when seeking funding from the National Science Foundation (NSF).

As of October 5, 2020, the National Science Foundation (NSF) will require researchers to submit a biosketch that meets specific format requirements as part of their grant proposal. Researchers are encouraged to use SciENcv to create biosketches, as SciENcv offers an NSF-approved tool that is integrated with ORCID. Researchers can connect their ORCID iD with their SciENcv profile in order to transfer data from their ORCID record into SciENcv by clicking a button, rather than having to manually retype all of their information….”

Journal statistics, coping strategy with upcoming scholarly journal publishing environment including Plan-S, and appreciation for reviewers and volunteers

“It is anticipated that the enactment of immediate open access publication without embargo period for articles will soon be supported by the US federal funding agencies including National Science Foundation and National Institute of Health [3,4]. It may be an extension of the public access policy by the above 2 funding institutes, which mandates free access after 1-year embargo period if the articles are supported by these funding agencies. It is a fortifying policy for open access publication. It may be a good chance for the journal to receive research results that had received US federal funding, because it is the diamond or platinum open access one without embargo period nor article processing charge. However, the situation in Europe is not favorable, where “all scholarly publications on the results from research funded by public or private grants provided by national, regional and international research councils and funding bodies, must be published in open access journals without embargo from 2021” according to Plan-S [5]. There are basic, mandatory, and recommended requirements to be eligible to receive the manuscripts supported by European funding bodies. Out of them, one basic requirement of “copyright owned by authors or institutes” cannot be fulfilled by the journal, because this journal is owned by the public institute publisher and all publishing cost is supported by the publisher. This year, 20% of the published articles were from Europe, although most of those articles were not supported by research grants. The JEEHP should be prepared for the situation in which manuscripts funded by European funding agencies cannot be accepted. However, at present, there seems to be no way to overcome this obstacle, and this may apply to other public or non-profit organization journals as well. I just anticipate a change in the principle of Plan-S on the ownership of copyright. 
There is no problem in publishing the journal as open access without embargo nor article processing charge although the copyright is owned by the publisher in Korea. Furthermore, the open access policy which may be enacted by the Korean Government in near future should be followed-up and discussed to evade the situation in Europe like Plan-S principle of copyright ownership….”

NSF releases JASON report on research security | NSF – National Science Foundation

“As part of its ongoing effort to keep international research collaboration both open and secure, the National Science Foundation (NSF) today released a report by the independent science advisory group JASON titled “Fundamental Research Security.”

NSF commissioned the report to enhance the agency’s understanding of the threats to basic research posed by foreign governments that have taken actions that violate the principles of scientific ethics and research integrity. With the official receipt of the report, NSF will now begin the process of analyzing its findings and recommendations….

“We expect that a reinvigorated commitment to U.S. standards of research integrity and the tradition of open science by all stakeholders will drive continued preeminence of the U.S. in science, engineering, and technology by attracting and retaining the world’s best talent,” the report says.”

Professors Receive NSF Grant to Develop Training for Recognizing Predatory Publishing | Texas Tech Today | TTU

“With more open-access journals making research articles free for people to view, some journals are charging authors publication fees to help cover costs. While some journals that do this are still peer-reviewed and credible, others are not and will publish lower quality work strictly for profit. The difference can be hard to tell, even to the most seasoned author….”

The National Science Foundation Awards scite Competitive R&D Grant to Build Tool to Identify and…

“scite, Inc. has been awarded a National Science Foundation (NSF) Small Business Innovation Research (SBIR) grant for $224,559 to conduct research and development (R&D) work on developing a deep learning platform that can evaluate the reliability of scientific claims by citation analysis….”

Extending U.S. Biodiversity Collections to Promote Research and Education

“Our national heritage of approximately one billion biodiversity specimens, once digitized, can be linked to emerging digital data sources to form an information-rich network for exploring earth’s biota across taxonomic, temporal and spatial scales. A workshop held 30 October – 1 November 2018 at Oak Spring Garden in Upperville, VA under the leadership of the Biodiversity Collections Network (BCoN) developed a strategy for the next decade to maximize the value of our collections resource for research and education. In their deliberations, participants drew heavily on recent literature as well as surveys, and meetings and workshops held over the past year with the primary stakeholder community of collections professionals, researchers, and educators.

Arising from these deliberations is a vision to focus future biodiversity infrastructure and digital resources on building a network of extended specimen data that encompasses the depth and breadth of biodiversity specimens and data held in U.S. collections institutions. The extended specimen network (ESN) includes the physical voucher specimen curated and housed in a collection and its associated genetic, phenotypic and environmental data (both physical and digital). These core data types, selected because they are key to answering driving research questions, include physical preparations such as tissue samples and their derivative products such as gene sequences or metagenomes, digitized media and annotations, and taxon- or locality-specific data such as occurrence observations, phylogenies and species distributions. Existing voucher specimens will be extended both manually and through new automated methods, and data will be linked through unique identifiers, taxon name and location across collections, across disciplines and to outside sources of data. As we continue our documentation of earth’s biota, new collections will be enhanced from the outset, i.e., accessioned with a full suite of data. We envision the ESN proposed here will be the gold standard for the structured cloud of integrated data associated with all vouchered specimens. These permanent specimen vouchers, in which genotypes and phenotypes link to a particular environment in time and space, comprise an irreplaceable resource for the millennia….”

BCoN Report: Extending U.S. Biodiversity Collections to Promote Research and Education

“The Biodiversity Collections Network has released its new report, Extending U.S. Biodiversity Collections to Promote Research and Education. You are invited to download and share the summary brochure and to review the longer report that provides additional detail about this vision for the future. …”

Report urges massive digitization of museum collections | Science | AAAS

“The United States should launch an effort to create an all-encompassing database of the millions of stuffed, dried, and otherwise preserved plants, animals, and fossils in museums and other collections, a U.S. National Science Foundation (NSF)–sponsored white paper released today urges. The report, titled Extending U.S. Biodiversity Collections to Promote Research and Education, also calls for new approaches to cataloging digitized specimens and linking them to a range of other data about each organism and where it was collected. If the plan is carried out, “There will be [a] huge potential impact for the research community to do new types of research,” says NSF biology Program Director Reed Beaman in Alexandria, Virginia.

The effort could take decades and cost as much as half a billion dollars, however, and some researchers are worried the white paper will not win over policymakers. “I just wish that the report focused more on the potential benefits for noncollections communities,” says James Hanken, director of the Harvard Museum of Comparative Zoology in Cambridge, Massachusetts.

For the past 8 years, NSF has sponsored the $100 million, 10-year Advancing Digitization of Biodiversity Collections program, which has paid for nearly 62 million plant and animal specimens to be digitally photographed from multiple angles for specific research studies. New technology has greatly sped up the process. Already, researchers studying natural history and how species are related are reaping the benefits of easy access to a wealth of information previously locked in museums….”