Abstract: The role played by research scholars in the dissemination of scientific knowledge on social media has always been a central topic in social media metrics (altmetrics) research. Different approaches have been implemented to identify and characterize active scholars on social media platforms like Twitter. Some limitations of past approaches were their complexity and, most importantly, their reliance on licensed scientometric and altmetric data. The emergence of new open data sources like OpenAlex or Crossref Event Data provides opportunities to identify scholars on social media using only open data. This paper presents a novel and simple approach to match authors from OpenAlex with Twitter users identified in Crossref Event Data. The matching procedure is described and validated with ORCID data. The new approach matches nearly 500,000 scholars with their Twitter accounts, with high precision and moderate recall. The dataset of matched scholars is described and made openly available to the scientific community to empower more advanced studies of the interactions of research scholars on Twitter.
“OpenAlex is a free and open Scientific Knowledge Graph (SKG). It contains information describing approximately 230M scholarly works, drawn from both structured (eg: Crossref) and unstructured (eg: institutional repositories, publisher websites) sources, clustered/merged into distinct records, and linked by citations. By parsing work metadata and enriching it with external PID sources (ROR, ORCID, ISSN Network, PubMed, Wikidata, etc), OpenAlex describes and links (approximately) 200M author clusters, 100k institutions, and 100k venues (journals and repositories). Using a neural-net classifier, we assign one or more of 50k Wikidata concepts to each work. All source code and ML models are available openly, and data is freely available via a high-performance API, a complete database dump, and a search-engine-style web interface. This talk will describe the construction of OpenAlex, compare it to other SKGs (eg Scopus, MAG), and discuss plans for the future.”
“Last month we examined the large degree of consolidation in journals publishing. We saw that 95% of publishers publish 10 journals or fewer, but account for barely one fifth of articles published. Meanwhile, half of total scholarly output is published by just 10 publishers, those with the largest numbers of journals.
We can further analyze the market’s consolidation by comparing annual growth rates in the numbers of publishers, journals and articles….
By looking at the trends, some clear patterns emerge.
The numbers of publishers (in blue) grew more quickly in the mid-teens than before or since. This is consistent with the S-shaped curve in the numbers of publishers we noted last month. So it seems the market showed signs of fragmentation in the mid-teens, followed by consolidation more recently.
Growth in numbers of journals (in orange) accelerated until about 2017, then started to fall off. This happened in tandem with the slowing growth in the numbers of publishers.
The rate of growth in numbers of articles (in grey) seems to run counter to the trends above. On average it was flat (at around 5%-6%) until 2018/2019, but then it accelerated. We think much of this is because of the unusually high levels of submission in the wake of COVID (as we discussed in our market sizing analysis last year)….
The data also suggest that growth in publisher and journal numbers has slowed, while growth in output has accelerated. Over the last few years – irrespective of Covid effects – it seems the larger publishers are producing larger journals, and the smaller publishers smaller ones. Larger organizations may be able to produce things more efficiently than smaller ones. Meanwhile, the rise of Open Access and reduction in reliance on print works removes constraints on publication sizes….”
“Inspired by the ancient Library of Alexandria, OpenAlex indexes the world of scholarly research, including works, citations, authors, journals, and institutions. OpenAlex data is completely free and open to all via a web interface, API, and database snapshot. Join us to learn how to use the OpenAlex API for your scholcomm research needs. OpenAlex was created by OurResearch, a nonprofit that makes open scholarly infrastructure including Unpaywall (an index of the world’s Open Access research literature) and Unsub (a tool to help librarians eliminate toll-access journal subscriptions). …”
“That our market is highly consolidated is probably not surprising. But the extent of the polarization – and the length of the long tail – might be. Half of total scholarly output is published by just 10 publishers, each of whom publish 400 or more journals. 80% of that is accounted for by the top 5.”
We present hoaddata, an experimental R package that combines open scholarly data from the German Open Access Monitor, Crossref and OpenAlex. Using this package, we illustrate the progress made in publishing open access content in hybrid journals included in nationwide transformative agreements in Germany across journal portfolios and countries.
“With this release, we are pleased to announce the initial integration of OpenAlex data into The Lens. Developed by the team at OurResearch, who also provide UnPaywall, ImpactStory and other open tools for the research community, OpenAlex was initiated to provide a replacement for Microsoft Academic Graph (MAG, see The Lens Scholarly MetaRecord Strategy: Beyond Microsoft Academic Graph).
In this initial phase of OpenAlex integration, we have started ingesting the additional scholarly works that were not present in MAG, as well as supplementing some of the metadata gaps left after the retirement of MAG including Fields of Study and Open Access information. This has resulted in the addition of nearly 6M records in The Lens now including OpenAlex identifiers.
In future phases, we will be expanding the coverage of OpenAlex in The Lens as the OpenAlex dataset matures and the MetaRecord merging logic is established….
With the addition of OpenAlex, we have also added open access information from OpenAlex as a new open access data source (e.g. open_access.source:openalex). Still in beta, open access information from OpenAlex will be merged with open access evidence from other sources to improve open access information. The data sources for open access evidence include: doaj, pmc-nih, core, unpaywall, openalex and rxiv….”
OpenAlex is a new, fully-open scientific knowledge graph (SKG), launched to replace the discontinued Microsoft Academic Graph (MAG). It contains metadata for 209M works (journal articles, books, etc); 213M disambiguated authors; 124k venues (places that host works, such as journals and online repositories); 109k institutions; and 65k Wikidata concepts (linked to works via an automated hierarchical multi-tag classifier). The dataset is fully and freely available via a web-based GUI, a full data dump, and high-volume REST API. The resource is under active development and future work will improve accuracy and coverage of citation information and author/institution parsing and deduplication.
“We’ve got a ton of great API improvements to report! If you’re an API user, there’s a good chance there’s something in here you’re gonna love.
You can now search both titles and abstracts. We’ve also implemented stemming, so a search for “frogs” now automatically gets you results mentioning “frog,” too. Thanks to these changes, searches for works now deliver around 10x more results. This can all be accessed using the new search query parameter.
New entity filters
We’ve added support for tons of new filters, which are documented here. You can now:
get all of a work’s outgoing citations (ie, its references section) with a single query.
search within each work’s raw affiliation data to find an arbitrary string (eg a specific department within an organization)
filter on whether or not an entity has a canonical external ID (works: has_doi, authors: has_orcid, etc) ….”
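The search parameter and the filters described above can be combined in a single request URL. The following is a minimal sketch, not the official client: the base URL `https://api.openalex.org`, the `filter=key:value,key:value` syntax, and the `true` value are assumptions not stated in the excerpt, and `build_query_url` is a hypothetical helper. Only `search`, `has_doi`, and `has_orcid` are named in the release notes.

```python
from urllib.parse import urlencode

# Assumed base URL of the OpenAlex REST API (not given in the excerpt).
BASE_URL = "https://api.openalex.org"


def build_query_url(entity, search=None, filters=None):
    """Build a list-query URL for one entity type, e.g. "works" or "authors".

    Hypothetical helper: it only assembles a URL and sends no request.
    """
    params = {}
    if filters:
        # Assumed syntax: comma-separated key:value pairs in one parameter.
        params["filter"] = ",".join(f"{key}:{value}" for key, value in filters.items())
    if search:
        params["search"] = search
    # Keep ":" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe=":,")
    return f"{BASE_URL}/{entity}?{query}" if query else f"{BASE_URL}/{entity}"


# Works that match "frogs" in the title or abstract and have a DOI.
print(build_query_url("works", search="frogs", filters={"has_doi": "true"}))
# Authors that have an ORCID.
print(build_query_url("authors", filters={"has_orcid": "true"}))
```

Building the URL separately from sending it keeps the sketch runnable offline; the resulting string can be passed to any HTTP client.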
“…The OpenAccess object
The OpenAccess object describes access options for a given work. It’s only found as part of the Work object.
is_oa (Boolean): True if this work is Open Access (OA).
There are many ways to define OA. OpenAlex uses a broad definition: having a URL where you can read the fulltext of this work without needing to pay money or log in. You can use the alternate_host_venues and oa_status fields to narrow your results further, accommodating any definition of OA you like.
oa_status (String): The Open Access (OA) status of this work. Possible values are:
gold: Published in an OA journal that is indexed by the DOAJ
green: Toll-access on the publisher landing page, but there is a free copy in an OA repository.
hybrid: Free under an open license in a toll-access journal.
bronze: Free to read on the publisher landing page, but without any identifiable license.
closed: All other articles.
oa_url (String): The best Open Access (OA) URL for this work.
Although there are many ways to define OA, in this context an OA URL is one where you can read the fulltext of this work without needing to pay money or log in. The “best” such URL is the one closest to the version of record.
This URL might be a direct link to a PDF, or it might be to a landing page that links to the free PDF.”
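The status definitions above can be read as a small decision procedure. The sketch below is one possible reading, not OpenAlex's actual classifier: the four boolean inputs, the function names, and the order in which the rules are checked are all assumptions made for illustration.

```python
def classify_oa_status(in_doaj_journal, free_on_publisher_page,
                       has_open_license, free_in_repository):
    """Map four hypothetical facts about a work to an OA status string.

    The precedence (gold, hybrid, bronze, green, closed) is an assumption;
    the excerpt defines the categories but not how ties are resolved.
    """
    if in_doaj_journal:
        return "gold"      # published in an OA journal indexed by the DOAJ
    if free_on_publisher_page and has_open_license:
        return "hybrid"    # free under an open license in a toll-access journal
    if free_on_publisher_page:
        return "bronze"    # free on the landing page, no identifiable license
    if free_in_repository:
        return "green"     # toll-access at the publisher, free repository copy
    return "closed"        # all other articles


def is_open_access(status):
    """Broad definition: any status other than "closed" has a free fulltext URL."""
    return status != "closed"


status = classify_oa_status(
    in_doaj_journal=False,
    free_on_publisher_page=False,
    has_open_license=False,
    free_in_repository=True,
)
print(status, is_open_access(status))
```

Under this reading, the Boolean flag is derived from the status, which matches the broad definition of OA quoted above; a stricter definition could instead accept only a subset of statuses.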
“VOSviewer is a software tool for constructing and visualizing bibliometric networks. These networks may for instance include journals, researchers, or individual publications, and they can be constructed based on citation, bibliographic coupling, co-citation, or co-authorship relations. VOSviewer also offers text mining functionality that can be used to construct and visualize co-occurrence networks of important terms extracted from a body of scientific literature.”
“An ambitious free index of more than 200 million scientific documents that catalogues publication sources, author information and research topics, has been launched.
The index, called OpenAlex after the ancient Library of Alexandria in Egypt, also aims to chart connections between these data points to create a comprehensive, interlinked database of the global research system, say its founders. The database, which launched on 3 January, is a replacement for Microsoft Academic Graph (MAG), a free alternative to subscription-based platforms such as Scopus, Dimensions and Web of Science that was discontinued at the end of 2021.
“It’s just pulling lots of databases together in a clever way,” says Euan Adie, founder of Overton, a London-based firm that tracks the research cited in policy documents. Overton had been getting its data from various sources, including MAG, ORCID, Crossref and directly from publishers, but has now switched to using only OpenAlex, in the hope of making the process easier….”
“OpenAlex launched this week! (January 3rd 2022 for those reading from the future)
We’re now pulling in new content on our own. Until now, we’ve been getting new works, authors, and other entities from MAG. Now that MAG is gone, we’re gathering all of our own data from the big wide internet.
The new REST API is launched! This is a much faster and easier way to access the OpenAlex database than downloading and installing the snapshot. It’s completely open and free–you don’t even need a user account or token.
We’ve now got oodles of new documentation here: https://docs.openalex.org/
Slight change of plan:
The MAG Format snapshot is now hosted for free, thanks to the AWS Open Data program. This will cover the data transfer fees (which turned out to be $70!) so you don’t have to. Here are the new instructions on how to download the MAG format snapshot to your machine.
We are extending the beta period for OpenAlex; we’ll emerge from beta in February. This is mostly in response to discovering issues with the coverage and structure of existing data sources including MAG. Extending the beta reflects the fact that the data will improve significantly between now and February.
Huge exciting news:
OpenAlex was built to offer a drop-in replacement for MAG. We’re doing that. But today, we’re also unveiling some moves toward a more innovative future for OpenAlex:
We’ve now built around a simple new five-entity model: works, authors, venues (journals and repositories), institutions, and concepts. Everything in OpenAlex is one of these entities, or a connection between them. Each type of entity has its own API endpoint.
We’ve got a new Standard Format for the snapshot, one that’s closely tied to both the five-entity model and the API. In the future, this will become the only supported format. The MAG format is now deprecated and will go away on July 1, 2022. …”
“OpenAlex is a fully open catalog of the global research system. It’s named after the ancient Library of Alexandria.
The OpenAlex dataset describes scholarly entities and how those entities are connected to each other. There are five types of entities:
Works are papers, books, datasets, etc; they cite other works
Authors are people who create works
Venues are journals and repositories that host works
Institutions are universities and other orgs that are affiliated with works (via authors)
Concepts tag Works with a topic
Together, these make a huge web (or more technically, heterogeneous directed graph) of hundreds of millions of entities and over a billion connections between them all….”
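To make the idea of a heterogeneous directed graph concrete, here is a minimal sketch of the five entity types and typed connections between them. It does not reflect the OpenAlex data format: the class names, the relation labels ("cites", "authored_by", and so on), and the identifiers are hypothetical.

```python
from dataclasses import dataclass, field

# The five entity types named in the description above.
ENTITY_TYPES = {"work", "author", "venue", "institution", "concept"}


@dataclass(frozen=True)
class Entity:
    id: str      # hypothetical identifier, e.g. "W1"
    type: str    # one of ENTITY_TYPES

    def __post_init__(self):
        if self.type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.type}")


@dataclass
class Graph:
    # Maps (source entity, relation label) to the set of target entities.
    edges: dict = field(default_factory=dict)

    def connect(self, source, relation, target):
        """Add one directed, labelled edge from source to target."""
        self.edges.setdefault((source, relation), set()).add(target)

    def neighbors(self, source, relation):
        """Return all targets reachable from source via the given relation."""
        return self.edges.get((source, relation), set())


paper = Entity("W1", "work")
cited = Entity("W2", "work")
person = Entity("A1", "author")
topic = Entity("C1", "concept")

graph = Graph()
graph.connect(paper, "cites", cited)          # works cite other works
graph.connect(paper, "authored_by", person)   # authors create works
graph.connect(paper, "tagged_with", topic)    # concepts tag works

print(graph.neighbors(paper, "cites"))
```

The graph is heterogeneous because nodes carry a type, and directed because each edge is stored from a source to a target under a relation label.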
“OpenAlex is a free and open catalog of the world’s scholarly papers, researchers, journals, and institutions — along with all the ways they’re connected to one another.
Using OpenAlex, you can build your own scholarly search engine, recommender service, or knowledge graph. You can help manage research by tracking citation impact, spotting promising new research areas, and identifying and promoting work from underrepresented groups. And you can do research on research itself, in areas like bibliometrics, science and technology studies, and science of science policy.
Because we think all research should be free and open, OpenAlex is free and open itself, and we’re built on a fully Open Source codebase.
We believe the global research system is one of humankind’s most beautiful creations. OpenAlex aims to make that whole beautiful creation available to everyone, everywhere….
Let’s start way back at the beginning, with the ancient Library of Alexandria. Working to create a Universal Library, they didn’t just gather knowledge — they made it useful by indexing it in the world’s first library catalog, the Pinakes. That’s what we’re trying to do, too, and so our name is an homage to them!
Fast forward a few millennia: OpenAlex had been a dream at our little nonprofit for a long time, but two doors opened simultaneously in May 2021. First, we received a generous, $4.5M grant from Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin. Second, Microsoft announced it would shutter Microsoft Academic Graph (MAG), which much of the community has come to rely upon as our best index of scholarly communication. So, we had the means and the opportunity, and we ran with it!…”