Abstract: Background Large language models such as ChatGPT can produce increasingly realistic text, but the accuracy and integrity of using these models in scientific writing are unknown.
Methods We gathered ten research abstracts from each of five high impact factor medical journals (n=50) and asked ChatGPT to generate research abstracts based on their titles and journals. We evaluated the abstracts using an artificial intelligence (AI) output detector and a plagiarism detector, and had blinded human reviewers try to distinguish whether abstracts were original or generated.
Results All ChatGPT-generated abstracts were written clearly, but only 8% correctly followed the specific journal’s formatting requirements. Most generated abstracts were detected using the AI output detector, with median [interquartile range] scores (higher meaning more likely to be generated) of 99.98% [12.73, 99.98], compared with a very low probability of AI-generated output in the original abstracts of 0.02% [0.02, 0.09]. The area under the receiver operating characteristic curve (AUROC) of the AI output detector was 0.94. Generated abstracts scored very high on originality using the plagiarism detector (100% [100, 100] originality). Generated abstracts had a similar patient cohort size as original abstracts, though the exact numbers were fabricated. When given a mixture of original and generated abstracts, blinded human reviewers correctly identified 68% of generated abstracts as being generated by ChatGPT, but incorrectly identified 14% of original abstracts as being generated. Reviewers indicated that it was surprisingly difficult to differentiate between the two, but that the generated abstracts were vaguer and had a formulaic feel to the writing.
Conclusion ChatGPT writes believable scientific abstracts, though with completely generated data. These are original without any plagiarism detected but are often identifiable using an AI output detector and skeptical human reviewers. Abstract evaluation for journals and medical conferences must adapt policy and practice to maintain rigorous scientific standards; we suggest inclusion of AI output detectors in the editorial process and clear disclosure if these technologies are used. The boundaries of ethical and acceptable use of large language models to help scientific writing remain to be determined.
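The AUROC reported in the abstract above can be read as the probability that a randomly chosen generated abstract receives a higher detector score than a randomly chosen original one. The following is a minimal sketch of that pairwise definition, not the study's own analysis code; the function name `auroc` and the toy scores are hypothetical and are not the study's data.

```python
def auroc(generated_scores, original_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the fraction of (generated, original) pairs in which the
    generated abstract gets the higher detector score; ties count as half.
    """
    if not generated_scores or not original_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for g in generated_scores:
        for o in original_scores:
            if g > o:
                wins += 1.0
            elif g == o:
                wins += 0.5
    return wins / (len(generated_scores) * len(original_scores))


# Toy detector scores, invented purely for illustration.
toy_generated = [0.99, 0.95, 0.40]
toy_original = [0.01, 0.02, 0.60]
toy_auc = auroc(toy_generated, toy_original)
```

A value of 1.0 means the detector ranks every generated abstract above every original one, while 0.5 means it does no better than chance.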
“IOP Publishing (IOPP) has joined the Initiative for Open Abstracts (I4OA), a collaboration between publishers, infrastructure organisations, librarians, and researchers to promote the open availability of abstracts.
IOPP will deposit abstracts of their scholarly communications with Crossref, the not-for-profit Digital Object Identifier (DOI) registration agency bringing together abstracts in a common format in one searchable cross-disciplinary database.
By joining the initiative, IOPP will make all of its abstracts part of the fundamental metadata of the article so that they will be openly available and accessible to the scientific community for unrestricted machine reading. This expanded availability of article abstracts will boost the discoverability of scholarly research and increase their impact. …”
When I4OA was launched one year ago, the initiative was supported by 40 publishers, including Hindawi, Royal Society, and SAGE, who are founding members of the initiative. Among the initial supporters of I4OA there were commercial publishers (e.g., F1000, Frontiers, Hindawi, MDPI, PeerJ, and SAGE), non-profit publishers (e.g., eLife and PLOS), society publishers (e.g., AAAS and Royal Society), and university presses (e.g., Cambridge University Press and MIT Press). Some of the initial supporters of I4OA are open access publishers, while others publish subscription-based journals.
Over the past year, the number of publishers supporting I4OA has more than doubled. The initiative is currently supported by 86 publishers. Publishers that have joined I4OA over the past year include ACM, American Society for Microbiology, Emerald, Oxford University Press, and Thieme. I4OA has also been joined by a substantial number of national and regional publishers, for instance from countries in Latin America, Eastern Europe, and Asia.
“ACM, the Association for Computing Machinery, has joined the Initiative for Open Abstracts (I4OA), a collaboration between publishers, infrastructure organizations, librarians, and researchers to promote the open availability of abstracts.
By joining I4OA, ACM commits to making abstracts of articles published by ACM available in an open and machine-readable way. Abstracts will be submitted to Crossref, initially for journal articles published by ACM and in a next stage also for conference papers. Bringing abstracts together in a common format in a global cross-disciplinary database offers important opportunities for text mining, natural language processing, and artificial intelligence….”
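To illustrate what "open and machine-readable" can mean in practice, here is a minimal sketch of reading an abstract from a Crossref work record. It assumes the record has the shape returned by the Crossref REST API at `https://api.crossref.org/works/{DOI}`, where a deposited abstract appears as a JATS XML string under `message.abstract`. The helper names and the sample record (including its DOI) are hypothetical, and no network request is made.

```python
import html
import re
from urllib.parse import quote


def crossref_work_url(doi):
    """Build the Crossref REST API URL for a single work (not fetched here)."""
    return "https://api.crossref.org/works/" + quote(doi)


def extract_abstract(work_record):
    """Return the abstract as plain text, or None if none was deposited.

    `work_record` is the parsed JSON of a Crossref work response.
    """
    raw = work_record.get("message", {}).get("abstract")
    if raw is None:
        return None
    text = re.sub(r"<[^>]+>", " ", raw)   # drop JATS tags such as <jats:p>
    text = html.unescape(text)            # decode entities such as &amp;
    return " ".join(text.split())         # collapse runs of whitespace


# Hypothetical record, invented for illustration only.
sample_record = {
    "message": {
        "DOI": "10.0000/example",
        "abstract": "<jats:p>Open abstracts aid &amp; speed discovery.</jats:p>",
    }
}
sample_text = extract_abstract(sample_record)
```

Stripping tags with a regular expression is a simplification; a text-mining pipeline that needs the abstract's internal structure (sections, formulas) would parse the JATS markup with an XML parser instead.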
“Last week Elsevier announced that it has signed the San Francisco Declaration on Research Assessment (DORA) and that it is going to make the reference lists of articles openly available in Crossref. In this Q&A, Ludo Waltman shares his perspective on Elsevier’s decision to open its citations….”
“Anyone who goes through the process of screening large amounts of texts such as newspapers, scientific abstracts for a systematic review, or ancient texts, knows how labor intensive this can be. With the rapidly evolving field of Artificial Intelligence (AI), the large amount of manual work can be reduced or even completely replaced by software using active learning.
By using our AI-aided tool, you can not only save time, but you can also increase the quality of your screening process. ASReview enables you to screen more texts than the traditional way of screening in the same amount of time. Which means that you can achieve a higher quality than when you would have used the traditional approach.
Consider the example of systematic reviews, which are “top of the bill” in research. However, the number of scientific papers on any topic is skyrocketing. Since it is of crucial importance for the advancement of science to produce high-quality systematic review articles, sometimes as quickly as possible in times of crisis, we need to find a way to effectively automate this screening process. Before Elas* was there to help you, systematic reviewing was an exhaustive task, often very boring….”
“Emerald Publishing has joined the Initiative for Open Abstracts (I4OA), a cross-publisher initiative whereby scholarly publishers open the abstracts of their publications to allow for unrestricted availability of abstracts to boost the discovery of research. I4OA is also supported by a large number of research funders, libraries and library associations, infrastructure providers, and open science organisations….”
“In a bid to boost the reach and reuse of scientific results, a group of scholarly publishers has pledged to make abstracts of research papers free to read in a cross-disciplinary repository.
Most abstracts are already available on journal websites or on scholarly databases such as PubMed, even if the papers themselves are behind paywalls. But this patchwork limits the reach and visibility of global research, says Ludo Waltman, deputy director of the Centre for Science and Technology Studies at Leiden University in the Netherlands, and coordinator of the initiative for open abstracts, called I4OA.
Publishers involved in I4OA have agreed to submit their article summaries to Crossref, an agency that registers scholarly papers’ unique digital object identifiers (DOIs). Crossref will make the abstracts available in a common format. So far, 52 publishers have signed up to the initiative, including the American Association for the Advancement of Science and the US National Academy of Sciences….”
“A common goal of authors and publishers has long been more readership for their publications. Traditionally, the abstract was a teaser to encourage the potential reader to buy or subscribe to read the full text. Even in an open access economy, a good abstract can trigger a coveted “download” and even more coveted citation. Why then do many publishers not make their abstracts and other metadata such as references or license information freely accessible in a machine-readable format?”
“The Initiative for Open Abstracts (I4OA) is a collaboration between scholarly publishers, infrastructure organizations, librarians, researchers and other interested parties to advocate and promote the unrestricted availability of the abstracts of the world’s scholarly publications, particularly journal articles and book chapters, in trusted repositories where they are open and machine-accessible. I4OA calls on all scholarly publishers to open the abstracts of their published works, and where possible to submit them to Crossref….”
“The Initiative for Open Abstracts (I4OA) calls on scholarly publishers to open their abstracts, and specifically to deposit them with Crossref. Unrestricted availability of abstracts will boost the discovery of research. 40 publishers have already agreed to support I4OA and to make their abstracts openly available. I4OA is also supported by a large number of research funders, libraries and library associations, infrastructure providers, and open science organizations….”
“One of the tactical questions that often comes up with moving towards more open practice in research is the value of taking small steps vs fighting the large battles. Sometimes big changes occur – and the shift towards open access, although slow, is an example of a big shift – but often a set of small steps can help to build towards progress. But there is a tension here as well. Small improvements relieve pressure on the system. How do we address the risk that they reduce progress overall? The key to this is in understanding what those small steps can achieve.
Improving the quality and openness of metadata about scholarly communications is an example where many small steps have been made. Because metadata is infrastructure, underpinning many other systems, it is almost entirely invisible. But the work to make it is not.
We make elements of progress, each of them seemingly quite small, but then in combination they suddenly enable significant change.
What we do within the Curtin Open Knowledge Initiative is possible in large part due to incremental improvements in the infrastructure of persistent identifiers and the quality of open metadata generally. The improvement in access to open citations data as a result of I4OC has been a major boost to our research, allowing us, for instance, to make a fair comparison of how a citation count index would perform if it used different bibliographic data sources to define the set of outputs to count citations for.
But where does metadata end and content begin? As a research project we also want to be able to do more granular analysis of the contents of research. Lots of data sources provide a classification of the topics of articles, either at the journal or article level. But mostly these are black boxes that tell us more about who made those classifications than about the things we’re interested in. For instance, in my work I’ve frequently been more interested in categorising articles by the technique that they use, rather than the topic being studied. Sometimes the region a study focuses on is more important than the discipline label. In a perfect world any researcher would be able to process the full text to create their own categorisations, but then we’re restricted to open access content, even assuming we can gather all the content together efficiently. Titles can tell us something, but certainly not enough.
What would make a huge difference is comprehensive and central access to abstracts….”
“In my discipline (e-health literacy), I often find myself debating whether abstracts being freely available to patients is of any real benefit. For researchers and clinicians, abstracts are a great timesaver—enabling a “flick-through” of the copious amounts of new articles for timely follow up. They may also be used by treating physicians and healthcare teams as a starting point for treatment planning and research. But abstracts are not designed to be an independent pathway to inform health decisions for patients lacking the appropriate professional expertise and health literacy skills….”
[Is this an argument against OA for abstracts, or for OA to full-text articles?]