“The Internet of Samples (iSamples) is a multi-disciplinary and multi-institutional project funded by the National Science Foundation to design, develop, and promote service infrastructure to uniquely, consistently, and conveniently identify material samples, record metadata about them, and persistently link them to other samples and derived digital content, including images, data, and publications….”
This paper aims to propose a set of quantitative statistical indicators for measuring the scientific relevance of research groups and researchers, based on high-impact open-access digital production repositories.
An action research (AR) methodology is proposed in which research and practice are coupled: research informs practice and practice, in turn, informs research in a cooperative way. AR is divided into five phases, beginning with the definition of the problem scenario and an analysis of the state of the art, and ending with testing and publication of the results.
The proposed indicators were used to characterise group and individual output at a major public university in south-eastern Mexico whose campuses host a large number of high-impact research groups. The indicators proved very useful in generating information that confirmed specific assumptions about the university's scientific production.
The data used here were retrieved from Scopus and the open-access national repository of Mexico; other data sources could also be used to calculate these indicators.
The system used to implement the proposed indicators is independent of any particular technological tool and is based on standards for metadata description and exchange, facilitating the integration of new elements for evaluation.
Many organisations evaluate researchers according to specific criteria, one of which is the prestige of the journals in which they publish. Although the guidelines differ between evaluation bodies, relevance is measured from a set of adaptable elements, some weighted more heavily than others, including the prestige of the journal, the degree of collaboration with other researchers, and individual production. Each country has its own organisations responsible for evaluation, applying various criteria based on the impact of publications, and the proposed indicators can be used by such entities to evaluate researchers and research groups.
The proposed indicators assess output based on the importance of publication types and the degree of collaboration, but they can be adapted to other, similar scenarios.
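To make the weighting idea concrete, here is a minimal sketch, in Python, of the kind of composite indicator described above. The publication-type weights, the prestige scale, and the collaboration proxy are illustrative assumptions, not the paper's actual formula.

```python
# Minimal sketch of a weighted relevance indicator of the kind the paper
# proposes. Weights, field names, and the scoring scale are assumptions.

PUBLICATION_WEIGHTS = {"article": 1.0, "book": 0.9, "chapter": 0.6, "conference": 0.4}

def relevance_score(publications):
    """Score a researcher from a list of publication records.

    Each record is a dict with a 'type', a 'journal_prestige' value in [0, 1]
    (e.g., a normalized quartile rank), and an 'n_authors' count used as a
    rough proxy for degree of collaboration.
    """
    score = 0.0
    for pub in publications:
        type_w = PUBLICATION_WEIGHTS.get(pub["type"], 0.2)
        prestige = pub.get("journal_prestige", 0.0)
        # Reward collaboration mildly; cap it so large consortia don't dominate.
        collab = min(pub.get("n_authors", 1), 10) / 10
        score += type_w * (0.6 * prestige + 0.4 * collab)
    return score

print(relevance_score([
    {"type": "article", "journal_prestige": 0.9, "n_authors": 3},
    {"type": "chapter", "journal_prestige": 0.5, "n_authors": 2},
]))
```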
Abstract: Preprints are an increasingly important component of the scholarly record and preprint platforms have correspondingly grown in number. Academic communities value preprints for the opportunity to share early findings with peers and receive immediate feedback on not-yet-reviewed works. With the COVID pandemic, a broader audience is turning to preprints, as political leaders, journalists, and the public seek new information about the virus. Complications arise, however, when the unvetted nature of these works is not clearly signaled alongside discussions of their findings. In late 2020, Rick Anderson captured these concerns, highlighting cases where discredited preprints remained available to read, presenting a potential for misinformation. Anderson posited that preprint platform providers, not just editors, should ensure adequate preprint vetting and be willing to retract them. With the availability of two new open-source preprint platforms–PKP’s Open Preprint Systems (OPS) and Birkbeck’s Janeway preprint server–library publishers now have familiar, robust infrastructure for entering this space and are a logical home for such services, especially given a strong commitment to a specific research community. But what additional responsibilities must we accept–if any–as publishers of this genre? Should we establish terms for vetting of submissions? Without adequate domain knowledge, how would we enforce, or even audit, such terms? How do we indicate that a specific preprint’s findings have not yet been formally accepted? What about obligations regarding debunked publications? What are the responsibilities of platform providers, publishers, and editors? Should library publishers, as a community of practice, expand on the proposed best practices related to preprint metadata to ensure we are responsible actors in providing access to early research? Panelists will explore these questions during the session’s first half, and invite attendee participation for the second. Registered attendees will receive an advance survey regarding current/planned preprint publishing, in order to identify additional discussion topics.
“Debates are taking place globally as to what role platforms should take with regards to the content they offer or host. In the EU, a first step to regulate this environment was taken with the 2019 EU Digital Single Market Directive (“DSM”). This new piece of legislation and specifically Article 17, presents an opportunity for platforms and the publishing industry to work together to enable the appropriate use of protected content and to ensure an improved experience for platform users.
Article 17 confirms that platforms are liable for the content they host unless they receive authorization from the publisher. This could be, for instance, through a license or by ensuring the unavailability of protected works. Meanwhile, publishers are obliged to make available “the necessary and relevant information”.
In 2019 STM assembled a working group to develop the necessary technical processes to implement these new obligations, in order to enable platforms to swiftly identify the content and respective policies to make sharing decisions in real-time using technology. These technical processes are described in the Article Sharing Framework, which enables the simple and seamless sharing of content in manners that are consistent with publisher policies….”
“In the European Union, an initiative to address this issue was finalized in a 2019 change to the EU Copyright Directive, the Directive on Copyright in the Digital Single Market (official text here). That new law took effect in June 2019 and must be transposed into national law by EU Member States by June 2021. SCNs, some of which qualify as Online Content Sharing Service Providers (OCSSPs, as they are referred to in the Directive), fall within the scope of the new rules and are thus required to follow certain steps and obligations if they want to preserve the possibility of avoiding liability for copyright infringement under the Directive. In particular, OCSSPs have to make “best efforts to ensure the unavailability” of protected works for which rightsholders have provided “relevant and necessary information”. In other words, for platforms to meet their obligation, publishers themselves have an obligation to provide information regarding the rights and permissions of content sharing in a form that can feasibly be leveraged at scale by SCNs….
In order to address this challenge, a team under the STM Association's STEC Committee developed the Article Sharing Framework. The Framework gives scholarly publishers a mechanism to provide SCNs, in machine-actionable form, with information about the identity of an article's PDF and the respective publisher's sharing policies. This enables SCNs to use the information to determine, in an automated way and in real time, whether the publisher's content may be shared….”
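As a rough illustration of the real-time decision the Framework enables, the sketch below models a sharing check an SCN might perform: match an uploaded PDF to an article identifier, look up the publisher's declared policy, and decide whether sharing is allowed. The policy fields and lookup-by-DOI mechanics are hypothetical; the actual Framework defines its own machine-actionable format.

```python
# Hypothetical sketch of an SCN's real-time sharing check under an
# Article-Sharing-Framework-style policy feed. The schema is an assumption.

from dataclasses import dataclass

@dataclass
class SharingPolicy:
    doi: str
    version: str          # e.g. "accepted" or "published"
    public_sharing: bool  # may the file be posted publicly?
    group_sharing: bool   # may it be shared in private research groups?

POLICIES = {
    "10.1234/example.5678": SharingPolicy(
        doi="10.1234/example.5678", version="published",
        public_sharing=False, group_sharing=True),
}

def may_share(doi: str, context: str) -> bool:
    """Return True if the publisher's policy permits sharing in this context."""
    policy = POLICIES.get(doi)
    if policy is None:
        # No "relevant and necessary information" from the publisher: block.
        return False
    return policy.public_sharing if context == "public" else policy.group_sharing

print(may_share("10.1234/example.5678", "group"))   # True
print(may_share("10.1234/example.5678", "public"))  # False
```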
Results of scientific experiments and research work, whether conducted by individuals or organizations, are published and shared with the scientific community in different types of scientific publications, such as books, chapters, journals, articles, reference works and reference-work entries. One aspect of these documents is their content; the other is their metadata. Metadata of scientific documents can be used to increase mutual cooperation, to find people with common interests and research work, and to find scientific documents in matching domains. The major obstacle to realising these benefits is that the metadata of scientific publications is available only in unstructured (or semi-structured) formats, so it cannot be used to answer smart queries that support computation and different types of analysis over scientific publications data. Moreover, acquisition and smart processing of publications data is a complicated as well as time- and resource-consuming task.
To address this problem we developed a generic framework named the Linked Open Publications Data Framework (LOPDF). The LOPDF framework can be used to crawl, process, extract and produce machine-understandable data (i.e., Linked Open Data) about scientific publications from different publisher-specific sources such as portals, XML exports and websites. In this paper we present the architecture, process and algorithm that we developed to process textual publications data and to produce semantically enriched data as RDF datasets (i.e., open data).
The resulting datasets can be queried smartly using the SPARQL protocol. We also present a quantitative as well as qualitative analysis of our resulting datasets, which can ultimately be used to compute the research behavior of organizations in a rapidly growing knowledge society. Finally, we discuss the potential uses of producing and processing such open publications data and how the results of smart queries over the resulting datasets can be used to compute impact and perform different types of analysis on scientific publications data.
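For a concrete sense of such a “smart query”, the sketch below builds a toy RDF graph of publication metadata with Python's rdflib and queries it with SPARQL. The triples are invented stand-ins for LOPDF output; only the Dublin Core terms and the query pattern reflect the general approach described.

```python
# Minimal sketch: load RDF publication metadata and run a SPARQL query,
# e.g. to find everything by a given author. Triples are invented examples.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS

EX = Namespace("http://example.org/pub/")

g = Graph()
g.add((EX.paper1, DCTERMS.title, Literal("Linked Data for Publications")))
g.add((EX.paper1, DCTERMS.creator, Literal("A. Author")))
g.add((EX.paper2, DCTERMS.title, Literal("Metadata at Scale")))
g.add((EX.paper2, DCTERMS.creator, Literal("A. Author")))

# The sort of cooperation-finding query that unstructured metadata cannot answer.
results = g.query("""
    SELECT ?title WHERE {
        ?pub <http://purl.org/dc/terms/creator> "A. Author" ;
             <http://purl.org/dc/terms/title> ?title .
    }
""")
for row in results:
    print(row.title)
```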
This briefing paper aims to support decision makers at research organisations and research funders to develop new monitoring exercises or assess and improve existing processes to measure the Open Access status of publications.
The availability of data and information on the current state of scholarly publishing is invaluable for advancing Open Access. Given the complexity of the scholarly publishing system, designing such a monitoring exercise involves a multitude of decisions.
This briefing paper provides recommendations on the three main questions an organisation should answer to develop a monitoring exercise: Why, What, and How?
Examples of different monitoring exercises have been selected to represent different use cases, organisational setups, data sources, and strategies of interpretation.
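As a small example of the “How”, the sketch below checks a single publication's OA status against Unpaywall, one commonly used data source for this kind of monitoring. The DOI and email address are placeholders (the API requires an email parameter), and a real exercise would batch this over an organisation's full publication list.

```python
# Sketch: query Unpaywall for the OA status of one DOI. A monitoring
# exercise would loop this over all of an institution's publications.

import requests

def oa_status(doi: str, email: str) -> str:
    """Return Unpaywall's OA status for a DOI, e.g. 'gold', 'green',
    'hybrid', 'bronze', or 'closed'."""
    url = f"https://api.unpaywall.org/v2/{doi}"
    resp = requests.get(url, params={"email": email}, timeout=30)
    resp.raise_for_status()
    return resp.json().get("oa_status", "unknown")

print(oa_status("10.7717/peerj.4375", "you@example.org"))
```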
“See the list of things you can do to fulfill your pledge.
Explore metadata practices that metadata Creators, Curators, Custodians and Consumers use to take action
Learn more about the Metadata 2020 project and who was involved
Consider a framework for measuring the impact of your metadata efforts
Review existing metadata best practices….”
Abstract: The Metadata 2020 initiative is an ongoing effort to bring various scholarly communications stakeholder groups together to promote principles and standards of practice to improve the quality of metadata. To understand the perspectives and practices regarding metadata of the main stakeholder groups (librarians, publishers, researchers and repository managers), we conducted a survey during summer 2019. The survey content was generated by representatives from the stakeholder groups. A link to an online survey (17 or 18 questions depending on the group) was distributed through multiple social media, listserv, and blog outlets. Responses were anonymous, with an optional entry for names and email addresses for those who were willing to be contacted later. Complete responses (N=211; 87 librarians, 27 publishers, 48 repository managers, and 49 researchers) representing 23 countries on four continents were analyzed and summarized for thematic content and ranking of awareness and practices. Across the stakeholder groups, the level of awareness and usage of metadata methods and practices was highly variable. Clear gaps across the groups point to the need for consolidation of schema and practices, as well as broad educational efforts in order to increase knowledge and implementation of metadata in scholarly communications.
“The DOAJ Seal is awarded to journals that demonstrate best practice in open access publishing. Around 10% of journals indexed in DOAJ have been awarded the Seal.
Journals do not need to meet the Seal criteria to be accepted into DOAJ.
There are seven criteria that a journal must meet to be eligible for the DOAJ Seal. These relate to best practice in long-term preservation, the use of persistent identifiers, discoverability, reuse policies and authors’ rights….”
“The archive’s catalog currently holds more than 120 million digital records, as well as “archival metadata and other types of records, including electronic databases.” However, the system has “an unsophisticated search” function, according to a request for information.
While NARA employees add metadata tags to digital records, “There is a delta between what NARA has been able to describe and the specific information that users want from our records,” the RFI states, asking, “Can AI fill the gap?”
During an informational day held in early April, NARA executives outlined some of the challenges, including a single search returning a flood of results from the same source (making it difficult to sift through and find multiple sources) and difficulty distinguishing between records with similar names, such as a search for “Truman” the president versus “Truman” the aircraft carrier.
The current search function also cannot return accurate results unless the search term exactly matches the text as it exists in the metadata.
The RFI is seeking feedback on automated solutions that can analyze how users search the digital archives and associate those search terms with the appropriate record….”
“The [National Archives] Catalog currently has a large data set (over 100 million digital pages of records, plus archival metadata and other types of records, including electronic databases) and an unsophisticated search. The archival hierarchy of the records is intended to assist the user in discovery, but in the digital realm, users find it difficult to use. The metadata that we have entered manually cannot provide the granular information for users to get the search results they want and it has taken NARA decades to produce. There is a delta between what NARA has been able to describe, and the specific information that users want from our records. Can AI fill the gap?…”
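To illustrate the exact-match limitation both excerpts describe, the toy example below contrasts a naive exact search with a fuzzy ranking over a few invented catalog entries. difflib is only a stand-in for the AI techniques NARA is soliciting, and it does not solve the harder disambiguation problem (the president versus the aircraft carrier).

```python
# Toy illustration: exact matching misses records unless the query equals
# the metadata string; fuzzy ranking tolerates variation. Entries invented.

import difflib

CATALOG = [
    "Truman, Harry S. - Presidential Papers",
    "USS Harry S. Truman (CVN-75) - Deployment Records",
    "Truman Doctrine - State Department Correspondence",
]

def exact_search(query):
    return [r for r in CATALOG if query == r]

def fuzzy_search(query, cutoff=0.3):
    # Rank records by string similarity instead of requiring equality.
    scored = [(difflib.SequenceMatcher(None, query.lower(), r.lower()).ratio(), r)
              for r in CATALOG]
    return [r for score, r in sorted(scored, reverse=True) if score >= cutoff]

print(exact_search("truman president"))   # [] -- exact match finds nothing
print(fuzzy_search("truman president"))   # Truman records, ranked by similarity
```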
“As more data is made openly accessible as part of journal articles or federal funder requirements, the importance of data curation cannot be overemphasized. Data is not intrinsically useful, and datasets do not simply become useful because they are publicly available. Data is useful only insofar as it meets the needs of the user. Likewise, more data does not mean more value (Binggeser, 2017). Data is of the highest value for those who collected it; others who were not involved in the data collection and analysis can find the data far less useful for their needs, especially if it has not been properly curated. Including as supplemental information a dataset that has not been properly prepared for public use reduces the usefulness of the data. Data must be cleaned and prepared properly for it to be useful, and this process does not happen by accident; it must be purposely conducted by someone trained in curating a dataset for public use (Johnston et al., 2018)….
What value does the curation process provide for data? The data curation steps formalized by the DCN in the C.U.R.A.T.E.D. acronym include the following: Check (the files for completeness and viability), Understand (the contents), Request (additional information), Augment (metadata), Transform (to open formats), Evaluate (for FAIRness), and Document (the curation process) (Johnston et al, 2018). …”
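One lightweight way to operationalise the C.U.R.A.T.E.D. steps is as a curation checklist; the sketch below encodes the seven steps from the acronym above and reports progress. The step descriptions paraphrase the DCN acronym; the data structure and sign-off mechanics are illustrative assumptions.

```python
# Sketch: the C.U.R.A.T.E.D. workflow as a simple curation checklist.
# Descriptions paraphrase the DCN acronym; the mechanics are illustrative.

CURATED_STEPS = [
    ("Check", "files are complete and open without errors"),
    ("Understand", "contents are interpretable by someone who did not collect them"),
    ("Request", "missing information has been requested from the depositor"),
    ("Augment", "metadata has been enriched for discoverability"),
    ("Transform", "files are converted to open, non-proprietary formats"),
    ("Evaluate", "the dataset has been assessed for FAIRness"),
    ("Document", "the curation actions taken are recorded"),
]

def curation_report(completed: set) -> None:
    """Print which curation steps are done and which remain."""
    for step, description in CURATED_STEPS:
        mark = "x" if step in completed else " "
        print(f"[{mark}] {step}: {description}")

curation_report({"Check", "Understand", "Augment"})
```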
“To speed up the transformation to Open Access (OA), the German Federal Ministry of Education and Research (BMBF) will be funding 20 innovative projects over the next two years. We are proud to announce that ScienceOpen is participating with a project around Open Access book metadata for increased discoverability, and we would like to give a preview here of what we are working on.
While the ecosystem of book publishing continues to adhere largely to traditional print processes, the industry's infrastructures are developing new ways to improve the discovery and visibility of academic book content.
While publishers have been looking into additional outlets and platforms to represent, promote, or sell their print and e-publication portfolios, librarians and service providers are making great advances in overhauling their (e-)catalogues and databases. However, much of the book and monograph output could still do with a boost in visibility and simpler communication between the various systems and platforms. OA books are essential for the transformation of the whole scholarly landscape, and one of the greatest advantages of open access for monographs is full, immediate accessibility. But even these books sometimes suffer from a lack of available digitized bibliographic data. Discoverability of OA books can thus be lower because of missing metadata, or because of poor portability and interoperability, whether through incompatible formats or plain differences in requirements, that prevents uptake in library catalogues, search portals and databases. In a nutshell: books that cannot be found cannot be read….”
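To ground the metadata point, the sketch below serialises a minimal, machine-readable Dublin Core record for an invented OA book using only Python's standard library. Real book feeds would use richer schemas such as ONIX or MARC, which is precisely where the format incompatibility described above arises.

```python
# Sketch: a minimal Dublin Core record for an OA book, serialized as XML.
# The book and identifiers are invented; real feeds use richer schemas.

import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def dc_record(title, creator, identifier, rights):
    record = ET.Element("record")
    for tag, value in [("title", title), ("creator", creator),
                       ("identifier", identifier), ("rights", rights)]:
        el = ET.SubElement(record, f"{{{DC}}}{tag}")
        el.text = value
    return ET.tostring(record, encoding="unicode")

print(dc_record(
    title="An Example Open Monograph",
    creator="Doe, Jane",
    identifier="https://doi.org/10.1234/example-book",
    rights="CC BY 4.0",
))
```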
“Principle I: Universal open access
The record of published science is a vital source of ideas, observations, evidence and data that provide fuel and inspiration for further enquiry, and is a profound part of the edifice of human knowledge.
That record, including the back catalogues of publishers, should be regarded as a global public good, openly and perennially free to read by citizens, researchers and all societal stakeholders….
Principle II: Open licensing
The progress of science depends on the ability to access and interrogate evidence and conclusions from past work. Open licences help to promote accountability and traceability, permit authors to continue to derive benefit from their work and maximize the extent to which the work can be built on by others. Yet when submitting to journals, authors may be required to transfer copyright to publishers.
As new technologies enhance the capacity to interrogate the whole record of science to discover new knowledge, pathways to access the resources that could facilitate such discovery should be open to all, unrestricted by licensing or ability to pay….”