Search where you will find most: Comparing the disciplinary coverage of 56 bibliographic databases | SpringerLink

Abstract:  This paper introduces a novel scientometrics method and applies it to estimate the subject coverages of many of the popular English-focused bibliographic databases in academia. The method uses query results as a common denominator to compare a wide variety of search engines, repositories, digital libraries, and other bibliographic databases. The method extends existing sampling-based approaches that analyze smaller sets of database coverages. The findings show the relative and absolute subject coverages of 56 databases—information that has often not been available before. Knowing the databases’ absolute subject coverage allows the selection of the most comprehensive databases for searches requiring high recall/sensitivity, particularly relevant in lookup or exploratory searches. Knowing the databases’ relative subject coverage allows the selection of specialized databases for searches requiring high precision/specificity, particularly relevant in systematic searches. The findings illustrate not only differences in the disciplinary coverage of Google Scholar, Scopus, or Web of Science, but also of less frequently analyzed databases. For example, researchers might be surprised that Meta (discontinued), Embase, or Europe PMC are found to cover more records than PubMed in Medicine and other health subjects. These findings should encourage researchers to re-evaluate their go-to databases, including against newly introduced options. Searching with more comprehensive databases can improve retrieval, particularly when selecting the most fitting databases needs particular thought, such as in systematic reviews and meta-analyses. This comparison can also help librarians and other information experts re-evaluate expensive database procurement strategies. Researchers without institutional access learn which open databases are likely most comprehensive in their disciplines.

 

New OpenAlex API features! – OurResearch blog

“We’ve got a ton of great API improvements to report! If you’re an API user, there’s a good chance there’s something in here you’re gonna love.

Search

You can now search both titles and abstracts. We’ve also implemented stemming, so a search for “frogs” now automatically gets your results mentioning “frog,” too. Thanks to these changes, searches for works now deliver around 10x more results. This can all be accessed using the new search query parameter.

 

New entity filters

We’ve added support for tons of new filters, which are documented here. You can now:

get all of a work’s outgoing citations (i.e., its references section) with a single query.
search within each work’s raw affiliation data to find an arbitrary string (e.g., a specific department within an organization)
filter on whether or not an entity has a canonical external ID (works: has_doi, authors: has_orcid, etc.) ….”
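The post names the new search parameter and filters without showing a request, so here is a minimal sketch of how they might be combined against the public OpenAlex REST API. It is not OpenAlex’s own example code: the endpoint and the has_doi filter are documented, but the exact filter names for the reference list (cited_by) and raw affiliation search (raw_affiliation_string.search), as well as the example work ID, are assumptions that should be checked against the current documentation.

```python
import requests

BASE = "https://api.openalex.org/works"

# New `search` parameter: stemmed search over titles and abstracts.
r = requests.get(BASE, params={"search": "frogs"})
r.raise_for_status()
print(r.json()["meta"]["count"], "works match the stemmed search")

# A work's outgoing citations (its reference list) in one query.
# `cited_by` is assumed to be the filter name; W2741809807 is an arbitrary example ID.
refs = requests.get(BASE, params={"filter": "cited_by:W2741809807"})
refs.raise_for_status()
print(len(refs.json()["results"]), "referenced works on the first page")

# Search raw affiliation strings and require a canonical external ID (a DOI).
# Filter names assumed; multiple filters are combined with commas.
params = {"filter": "raw_affiliation_string.search:radiology,has_doi:true"}
hits = requests.get(BASE, params=params)
hits.raise_for_status()
for work in hits.json()["results"][:5]:
    print(work["id"], work.get("doi"))
```

Results are paged, so a production script would page through the full result set (OpenAlex supports cursor pagination) rather than reading only the first page as this sketch does.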

Usability and Accessibility of Publicly Available Patient Sa… : Journal of Patient Safety

Abstract:  Objectives 

The aims of the study were to identify publicly available patient safety report databases and to determine whether these databases support safety analyst and data scientist use to identify patterns and trends.

Methods 

An Internet search was conducted to identify publicly available patient safety databases that contained patient safety reports. Each database was analyzed to identify features that enable patient safety analyst and data scientist use of these databases.

Results 

Seven databases (6 hosted by federal agencies, 1 hosted by a nonprofit organization) containing more than 28.3 million safety reports were identified. Some, but not all, databases contained features to support patient safety analyst use: 57.1% provided the ability to sort/compare/filter data, 42.9% provided data visualization, and 85.7% enabled free-text search. None of the databases provided regular updates or monitoring, and only one database suggested solutions to patient safety reports. Analysis of features to support data scientist use showed that only 42.9% provided an application programming interface, most (85.7%) provided batch downloading, all provided documentation about the database, and 71.4% provided a data dictionary. All databases provided open access. Only 28.6% provided a data diagram.

Conclusions 

Patient safety databases should be improved to support patient safety analyst use by, at a minimum, allowing for data to be sorted/compared/filtered, providing data visualization, and enabling free-text search. Databases should also enable data scientist use by, at a minimum, providing an application programming interface, batch downloading, and a data dictionary.

Watkinson | What has the COVID-19 pandemic taught us about humanities book publishing so far? A view from North America | The Journal of Electronic Publishing

“Ground down for years by the conflation of lack of physical circulation with a lack of interest, humanities publishers saw the passion unleashed when access to monographs became ubiquitous and easy. Publishers who were long-term skeptics of open access have become proponents, although still worried about how to sustain it financially….

How do we help these readers discover books and journals they can access? As the exponential growth of humanities titles in the Directory of Open Access Books (DOAB) and Directory of Open Access Journals (DOAJ) shows, a lot of literature is becoming permanently open access. However, good luck in doing a subject search for just open access content! Because US libraries have outsourced cataloging to companies such as EBSCO and ProQuest that rely on sales revenue to fund human-powered metadata enrichment, there is little incentive to surface open access books or even identify them as such. Small humanities journals are sometimes less visible because their publishers can’t create and distribute metadata (something DOAJ exists to help with). Academic books are also often invisible to the computers that mine full-text and metadata because the standards used in book publishing cater to print rather than electronic discovery. That’s because the trade giants dominate US book publishing and focus on selling bestsellers through Amazon.com rather than serving the needs of academic libraries. The consequence is that humanities book publishers spend all their efforts on BISAC codes (designed to help booksellers in arranging shelves), ONIX feeds (heavy on availability statuses), and ISBNs (using the same 13-digit UPC format as cereal boxes). Their focus on the print supply chain leaves little time for allocating digital object identifiers (DOIs), Open Researcher and Contributor IDs (ORCIDs), or Research Organization Registry (ROR) identifiers, the building blocks of the digital ecosystem. The challenge of managing temporarily free-to-read materials during the pandemic and the switch to open has catalyzed some libraries to rediscover the importance of “technical services” that were in danger of being consigned to the building’s basement. The combination of untapped demand for poorly tamed information has also opened the doors to increasingly sophisticated informal organizations. The pirate site Z-Library, for example, offers millions of books and journal articles for free with a robust search mechanism and clean user interface. Based probably in Russia, outside the boundaries of copyright policing, Z-Library is both a symptom of unmet global demand and an existential threat to many academic publishers’ current sustainability models.

 

How can librarians and publishers sustain an ecosystem of humanities publishing in which access to the digital version of each title is free? Who pays the cost of publishing in fields that lack the grant funding of science, technical, and medical fields (STM)? The recognition that open access models that require authors to pay article processing charges (APCs) or book publishing charges (BPCs) are fundamentally inequitable to the many who cannot pay has led to new “hybrid” funding models. Several North American university presses have combined parent institutional support, payments from individual libraries and consortia, and grant funding where available to support OA book publishing. These include the Direct to Open program from the MIT Press, Fund to Mission from the University of Michigan Press, and the multi-institutional membership model that powers Lever Press. Beyond the university presses, “scholar-led” publishers such as Punctum Books and many library publishers provide options that rely on substantial volunteer labor and support in kind. All of these models rely on library support to a greater or lesser extent. Already under pressure from the inflationary costs of STM periodicals, this funding may not be able to scale. The Toward an Open Monograph Ecosystem (TOME) initiative is jointly led by the Association of American Universities, Association of Research Libraries, and Association of University Presses. This program aims to bring provosts to the table, providing funding for their faculty members to publish books as open access that is separate from the library’s allotment. An open question that the University of North Carolina Press is exploring is whether individual scholars will be willing to spend money on print copies of books that are available open access. Their Sustainable History Monograph Pilot already suggests that this may vary by field….”

Emergence of New Public Discovery Services: Connecting Open Web Searchers to Library Content

Abstract:  A growing number of new public citation databases, available free of charge and accessible on the open web, are offering researchers a new place to start their searching, providing an alternative to Google Scholar and library resources. These new “public discovery services” index significant portions of scholarly literature, then differentiate themselves by applying technologies like artificial intelligence and machine learning, to create results sets that are promoted as more meaningful, easier to navigate, and more engaging than other discovery options. Additionally, these new public discovery services are adopting new linking technologies that connect researchers from citation records to full text content licensed on their behalf by their affiliated libraries. With these new sites logging millions of sessions a month, they present unique opportunities for libraries to connect to researchers working outside the library and challenges in how the library can make itself obvious in the user workflow.

 

The need for open access and natural language processing | PNAS

“In PNAS, Chu and Evans (1) argue that the rapidly rising number of publications in any given field actually hinders progress. The rationale is that, if too many papers are published, the really novel ideas have trouble finding traction, and more and more people tend to “go along with the majority.” Review papers are cited more and more instead of original research. We agree with Chu and Evans: Scientists simply cannot keep up. This is why we argue that we must bring the powers of artificial intelligence/machine learning (AI/ML) and open access to the forefront. AI/ML is a powerful tool and can be used to ingest and analyze large quantities of data in a short period of time. For example, some of us (2) have used AI/ML tools to ingest 500,000+ abstracts from online archives (relatively easy to do today) and categorize them for strategic planning purposes. This letter offers a short follow-on to Chu and Evans (hereafter CE) to point out a way to mitigate the problems they delineate….

In conclusion, we agree with CE (1) on the problems caused by the rapid rise in scientific publications, outpacing any individual’s ability to keep up. We propose that open access, combined with NLP, can help effectively organize the literature, and we encourage publishers to make papers open access, archives to make papers easily findable, and researchers to employ their own NLP as an important tool in their arsenal.”

5 ways Google Scholar helps you get access to what you discovered | Aaron Tay’s Musings about librarianship

“If there is one academic discovery search that dominates, it is Google Scholar.

Much has been said about its merits, particularly over library discovery systems, but even the best discovery service will not be popular if it does not help the user access the full text, whether open access or based on the user’s own unique circumstances (typically institutional affiliation).

In this blog post, I will list 5 different ways Google Scholar helps a user get to full text. The last two were methods I recently discovered, and it seems they may not be very well known even by academic librarians.

They are 

1. Free full text tagged [PDF] or [HTML]

2. Library Links programme

3. Library search via Open WorldCat Search

4. The print or non-electronic holdings option…

5. Subscriber links programme …”

Major update of CORE search  – CORE

“CORE has just released a major update to its search engine, including a sleek new user interface and upgraded search functionality driven by the new CORE API V3.0.

CORE Search is the engine that researchers, librarians, scholars, and others turn to for open access research papers from around the world and for staying up to date on the latest scientific literature….”
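The announcement mentions the new CORE API V3.0 without showing it in use. As a rough orientation only, a works search might look like the sketch below; the endpoint path, the q and limit parameters, the bearer-token header, and the title/downloadUrl field names are all assumptions drawn from CORE’s public API documentation as I recall it and should be verified before use. YOUR-API-KEY is a placeholder.

```python
import requests

CORE_API_KEY = "YOUR-API-KEY"  # placeholder; register with CORE to obtain a key

resp = requests.get(
    "https://api.core.ac.uk/v3/search/works",  # assumed v3 works-search endpoint
    headers={"Authorization": f"Bearer {CORE_API_KEY}"},
    params={"q": "open access discovery", "limit": 10},
)
resp.raise_for_status()
for work in resp.json().get("results", []):
    # `title` and `downloadUrl` are assumed field names in the v3 response
    print(work.get("title"), "-", work.get("downloadUrl"))
```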

Automated search in the archives: testing a new tool | Europeana Pro

“Archives Portal Europe, the online repository for archives from and about Europe, aggregates archival material from more than 30 countries and in 25 languages – all searchable through one simple search engine.

In order to help researchers navigate this Babylon of languages, Archives Portal Europe has created an automated topic detection tool that expands the keyword search of a single user to create semantic connections with other documents in different languages. This testing session will allow users to preview the tool (currently in its alpha version), test it, and provide fundamental feedback for its development, and will have prizes! …”

Using open access research in our battle against misinformation – Research

“While scientific papers have been traditionally seen as a source of mostly trustworthy information, their use within automated tools in the fight against misinformation, such as misinformation related to vaccine effectiveness or climate change, has been rather limited….

At CORE, we are committed to a more transparent society, free of misinformation. Our data services, providing access to machine-readable information from across the global network of open repositories, are a treasure trove for this use case.

We are therefore excited to support an innovative startup, Consensus, a search engine designed to perform evidence retrieval and assessment over scientific insights.  …”

 

Home – NIH ODSS Search Workshop

“The goal of the Workshop is to explore current capabilities, gaps, and opportunities for global data search across the data ecosystem. The Workshop will explore selected science drivers across these main themes:

Using search to build cohorts: finding data across different platforms/repositories using patient attributes in order to create a cohort of patients for clinical analysis
Using search to find relevant data & repositories: finding data & repositories in order to access and analyze the data further, including its use for creating computational models.
Using search for (complex) information retrieval: answering specific questions without the additional burden of data download or analysis…”

Massive open index of scholarly papers launches

“An ambitious free index of more than 200 million scientific documents that catalogues publication sources, author information and research topics has been launched.

The index, called OpenAlex after the ancient Library of Alexandria in Egypt, also aims to chart connections between these data points to create a comprehensive, interlinked database of the global research system, say its founders. The database, which launched on 3 January, is a replacement for Microsoft Academic Graph (MAG), a free alternative to subscription-based platforms such as Scopus, Dimensions and Web of Science that was discontinued at the end of 2021.

“It’s just pulling lots of databases together in a clever way,” says Euan Adie, founder of Overton, a London-based firm that tracks the research cited in policy documents. Overton had been getting its data from various sources, including MAG, ORCID, Crossref and directly from publishers, but has now switched to using only OpenAlex, in the hope of making the process easier….”

A Search Engine That Finds You Weird Old Books | by Clive Thompson | Jan, 2022 | Debugger

“Still, sifting through old books can be a hassle. You have to go to those search sites and filter for the right vintage (and public-domain-status). It’s a pain.

So: I decided to partly automate this — by making my own search tool.

Behold the Weird Old Book Finder….

Behind the scenes, here’s what it’s doing, which is pretty simple: i) You type in a query, and ii) my app sends it to Google Books, and filters the results for pre-1927 public domain. Then iii) it picks one at random and displays it to you….”
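Thompson’s three steps map directly onto the public Google Books volumes API, so for illustration the sketch below reimplements them in Python. It is not the app’s actual code: the maxResults parameter and the publishedDate and publicDomain response fields are assumptions based on the Books API as documented, and the cutoff year is hard-coded to 1927 as in the post.

```python
import random
import requests

def weird_old_book(query: str):
    """Query Google Books, keep pre-1927 public-domain hits, return one at random."""
    resp = requests.get(
        "https://www.googleapis.com/books/v1/volumes",
        params={"q": query, "maxResults": 40},
    )
    resp.raise_for_status()
    candidates = []
    for item in resp.json().get("items", []):
        info = item.get("volumeInfo", {})
        access = item.get("accessInfo", {})
        year = (info.get("publishedDate") or "")[:4]
        if year.isdigit() and int(year) < 1927 and access.get("publicDomain"):
            candidates.append(info.get("title"))
    return random.choice(candidates) if candidates else None

print(weird_old_book("phrenology"))
```

Calling weird_old_book("phrenology") would print the title of one randomly chosen pre-1927 public-domain match, or None if the first page of results contains no qualifying books.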
