Cracking double-blind review: Authorship attribution with deep learning | PLOS ONE

Abstract:  Double-blind peer review is considered a pillar of academic research because it is perceived to ensure a fair, unbiased, and fact-centered scientific discussion. Yet, experienced researchers can often correctly guess from which research group an anonymous submission originates, biasing the peer-review process. In this work, we present a transformer-based, neural-network architecture that only uses the text content and the author names in the bibliography to attribute an anonymous manuscript to an author. To train and evaluate our method, we created the largest authorship-identification dataset to date. It leverages all research papers publicly available on arXiv amounting to over 2 million manuscripts. In arXiv-subsets with up to 2,000 different authors, our method achieves an unprecedented authorship attribution accuracy, where up to 73% of papers are attributed correctly. We present a scaling analysis to highlight the applicability of the proposed method to even larger datasets when sufficient compute capabilities are more widely available to the academic community. Furthermore, we analyze the attribution accuracy in settings where the goal is to identify all authors of an anonymous manuscript. Thanks to our method, we are not only able to predict the author of an anonymous work but we also provide empirical evidence of the key aspects that make a paper attributable. We have open-sourced the necessary tools to reproduce our experiments.

Science Publishing Innovation: Why Do So Many Good Ideas Fail? – Science Editor

“Over a decade ago, BioMed Central (BMC) recognized the importance of postpublication discussion. Prepublication review can improve papers and catch errors, but only time and subsequent work of other scientists can truly show which results in a publication are robust and valid. Unlike a print journal (or print as a medium, in general), the Internet permits the readers to comment on published papers over time. So in 2002 BMC developed and enabled commenting on every one of its articles across its suite of journals. Not only does this allow for postpublication review, but it enables readers to easily ask authors and other readers a question, with public responses enriching the original manuscript, clarifying, and helping to improve the comprehension of the work.

This is a terrific idea, but it didn’t really catch on….

Remarkably, despite the creation of arXiv for physicists in 1990 and despite the enthusiastic embrace of preprints by the physics community, it has been assumed this is impossible for biology. The common argument is that biologists are different from physicists and the arXiv success is not informative. What many did find telling is the death of the 2007 preprint initiative from the Nature Publishing Group (NPG). NPG tried preprints with Nature Precedings, but adoption was low and in 2012 NPG pulled the plug on the experiment. This triggered some skepticism about the prospects of the bioRxiv preprint effort from Cold Spring Harbor Lab (CSHL) Press. Critics told the director of CSHL Press, John Inglis, that a preprint for biologists simply couldn’t work.

Once again, we must ask the cause of the Nature Precedings failure. Did NPG kill it because biologists wouldn’t behave in the same way as physicists? We know that isn’t the case. Preprints in biology are all the rage today….

In the winter of 2012, Alexei Stoliartchouk and I came up with the idea for protocols.io—a central place where scientists can share and discover science methods. We wanted to create a site where corrections and the constant tweaking of science methods could be shared, even after publication in a journal….

Few people know about bioprotocols.com, but many know about OpenWetWare (OWW) and Nature Protocol Exchange—both open-access community resources for sharing protocols. Both have been mentioned to me countless times as evidence that protocols.io wouldn’t work. As with preprints, the problems that OWW and Protocol Exchange faced seemed to be proof that biologists would not share details of their methods on such a platform. As with bioRxiv, we are in the early days of protocols.io, but judging from the growth in the figure below, it’s hard to argue that biologists don’t need this or that they won’t take the time to publicly share their methods….”

Integrating AI into Kotahi: How to Quantify Its Contribution? | Coko

Written by Paul Shannon, John Chodacki, Nokome Bentley, Adam Hyde, Ryan Dix-Peek, Yannis Barlas and Ben Whitmore

Recently, a team of technologists from across the globe came together in New Zealand to brainstorm how to integrate AI into the peer review process using Kotahi responsibly. We recognize the potential benefits that AI can bring, such as increased efficiency and accuracy, but we also acknowledge the need to be thoughtful about how we implement this technology. As a result, we engaged in a collaborative brainstorming session to explore how AI could be integrated into our platform responsibly and effectively. In this article, we will share some of our key insights and considerations from this session.

As the scientific publishing landscape continues to evolve, many are looking to AI as a potential solution to streamline the publishing process. At Kotahi, we’ve been thinking about how AI can be integrated into our platform to improve the efficiency and accuracy of the whole process.

Types of AI contributions

One question we’ve been grappling with is how to be transparent and honest about AI’s contribution and quantify its impact. Our approach was to first distinguish between different applications of AI technologies so we can better distinguish boundaries for where they are appropriate:

Computer-assisted humans use AI to help with certain tasks, such as identifying potential conflicts of interest or suggesting potential reviewers.
Generative AI, on the other hand, can create original content, such as writing summaries or even entire manuscripts.

By establishing this foundational distinction between these two approaches, we have a clearer understanding of the role that AI should play in the publication process.

Publishing is collaboration

While AI can help with certain tasks, such as identifying potential reviewers, it is important to remember that humans must continue to play the primary role in the publishing process. Authors and reviewers provide invaluable feedback and insights that cannot be replicated by AI alone. Therefore, we need to integrate AI into the publishing process in a way that is transparent and does not diminish the importance of human input.

One potential solution that maintains this distinction is to offer AI assistance to reviewers directly in the reviewer form, as an opt-in. The assistance would help them turn their review notes (possibly in bullet points) into readable sentences and paragraphs that use a constructive, respectful tone suitable for a review. Reviewers could then choose whether to use AI in their reviews, and the use of AI in the review process would remain transparent.

Advancing Software Citation Implementation (Software Citation Workshop 2022)

Abstract:  Software is foundationally important to scientific and social progress; however, traditional acknowledgment of the use of others’ work has not adapted in step with the rapid development and use of software in research.

This report outlines a series of collaborative discussions that brought together an international group of stakeholders and experts representing many communities, forms of labor, and expertise. Participants addressed specific challenges about software citation that have so far gone unresolved. The discussions took place in summer 2022 both online and in-person and involved a total of 51 participants.

The activities described in this paper were intended to identify and prioritize specific software citation problems, develop (potential) interventions, and lay out a series of mutually supporting approaches to address them. The outcomes of this report will be useful for the GLAM (Galleries, Libraries, Archives, Museums) community, repository managers and curators, research software developers, and publishers.

GitHub is Sued, and We May Learn Something About Creative Commons Licensing – The Scholarly Kitchen

“I have had people tell me with doctrinal certainty that Creative Commons licenses allow text and data mining, and insofar as license terms are observed, I agree. The making of copies to perform text and data mining, machine learning, and AI training (collectively “TDM”) without additional licensing is authorized for commercial and non-commercial purposes under CC BY, and for non-commercial purposes under CC BY-NC. (Full disclosure: CCC offers RightFind XML, a service that supports licensed commercial access to full-text articles for TDM with value-added capabilities.)

I have long wondered, however, about the interplay between the attribution requirement (i.e., the “BY” in CC BY) and TDM. After all, the bargain with those licenses is that the author allows reuse, typically at no cost, but requires attribution. Attribution under the CC licenses may be the author’s primary benefit and motivation, as few authors would agree to offer the licenses without credit.

In the TDM context, this raises interesting questions:

Does the attribution requirement mean that the author’s information may not be removed as a data element from the content, even if inclusion might frustrate the TDM exercise or introduce noise into the system?
Does the attribution need to be included in the data set at every stage?
Does the result of the mining need to include attribution, even if hundreds of thousands of CC BY works were mined and the output does not include content from individual works?

While these questions may have once seemed theoretical, that is no longer the case. An analogous situation involving open software licenses (GNU and the like) is now being litigated….”

Principles of Transparency and Best Practice in Scholarly Publishing – OASPA

“The Committee on Publication Ethics (COPE), the Directory of Open Access Journals (DOAJ), the Open Access Scholarly Publishers Association (OASPA), and the World Association of Medical Editors (WAME) are scholarly organisations that have seen an increase in the number, and broad range in the quality, of membership applications. Our organisations have collaborated to identify principles of transparency and best practice for scholarly publications and to clarify that these principles form the basis of the criteria by which suitability for membership is assessed by COPE, DOAJ and OASPA, and part of the criteria on which membership applications are evaluated by WAME. Each organisation also has their own, additional criteria which are used when evaluating applications. The organisations will not share lists of publishers or journals that failed to demonstrate that they met the criteria for transparency and best practice.

This is the third version of a work in progress (published January 2018); the first version was made available by OASPA in December 2013 and a second version in June 2015. We encourage its wide dissemination and continue to welcome feedback on the general principles and the specific criteria. Background on the organisations is below….”

Revised principles of transparency and best practice released | OASPA

A revised version of the Principles of Transparency and Best Practice in Scholarly Publishing has been released by four key scholarly publishing organizations today. These guiding principles are intended as a foundation for best practice in scholarly publishing to help existing and new journals reach the best possible standards. 

The fourth edition of the Principles represents a collective effort between the four organizations to align the principles with today’s scholarly publishing landscape. The last update was in 2018, and the scholarly publishing landscape has changed. Guidance is provided on the information that should be made available on websites, peer review, access, author fees and publication ethics. The principles also cover ownership and management, copyright and licensing, and editorial policies. They stress the need for inclusivity in scholarly publishing and emphasize that editorial decisions should be based on merit and not affected by factors such as the origins of the manuscript and the nationality, political beliefs or religion of the author.

Open peer review is the key to tackling public health misinformation | Times Higher Education (THE)

“Digital, open access publication helped scientists share data and collaborate for the good of global public health. Greater open data policies have improved access to the data underpinning research discoveries. Meanwhile, preprints have enabled faster communication of research findings. But these shifts have not, by themselves, dispelled the conspiracies. I would also argue that the missing piece of the jigsaw is open, de-anonymised peer review….”

What senior academics can do to support reproducible and open research: a short, three-step guide | BMC Research Notes | Full Text

Abstract:  Increasingly, policies are being introduced to reward and recognise open research practices, while the adoption of such practices into research routines is being facilitated by many grassroots initiatives. However, despite this widespread endorsement and support, as well as various efforts led by early career researchers, open research is yet to be widely adopted. For open research to become the norm, initiatives should engage academics from all career stages, particularly senior academics (namely senior lecturers, readers, professors) given their routine involvement in determining the quality of research. Senior academics, however, face unique challenges in implementing policy changes and supporting grassroots initiatives. Given that—like all researchers—senior academics are motivated by self-interest, this paper lays out three feasible steps that senior academics can take to improve the quality and productivity of their research, that also serve to engender open research. These steps include changing (a) hiring criteria, (b) how scholarly outputs are credited, and (c) how we fund and publish in line with open research principles. The guidance we provide is accompanied by material for further reading.

Plan S Archives – iRights.info – Kreativität und Urheberrecht in der digitalen Welt (Open skimming: How scientific publishing changes in the transition to open access)

From Google’s English:

“Access to scientific texts free of charge and freely – this should soon become the standard. Scientific publishers are also trying to take advantage of the transition to Open Access, for example with fees for authors and data tracking. Tilman Reitz analyzes what the open access transformation means for science and what design options there are.”

Why some researchers oppose unrestricted sharing of coronavirus genome data

“Global-south scientists say that an open-access movement led by wealthy nations deprives them of credit and undermines their efforts….

But a growing faction of scientists, mostly from wealthy nations, argues that sequences should be shared on databases with no gatekeeping at all. They say this would allow huge analyses combining hundreds of thousands of genomes from different databases to flow seamlessly, and therefore deliver results more rapidly.

The debate has caught the attention of the US National Institutes of Health (NIH) — which runs its own genome repository, called GenBank — and the Bill & Melinda Gates Foundation, which has considered encouraging grantees to share on sites without such strong protections, Nature has learnt.

But many researchers — particularly those in resource-limited countries — are pushing back. They tell Nature that they see potential for exploitation in this no-strings-attached approach — and that GISAID’s gatekeeping is one of its biggest attractions because it ensures that users who analyse sequences from GISAID acknowledge those who deposited them. The database also requests that users seek to collaborate with the depositors….

Fears of inequitable data use are amplified by the fact that only 0.3% of COVID-19 vaccines have gone to low-income countries. “Imagine Africans working so hard to contribute to a database that’s used to make or update vaccines, and then we don’t get access to the vaccines,” says Christian Happi, a microbiologist at the African Centre of Excellence for Genomics of Infectious Diseases in Ede, Nigeria. “It’s very demoralizing.” …”

Introducing the CC Search Browser Extension

This is part of a series of posts introducing the projects built by open source contributors mentored by Creative Commons during Google Summer of Code (GSoC) 2019. Mayank Nader was one of those contributors and we are grateful for his work on this project.

Creative Commons (CC) is working towards providing easy access to CC-licensed and public domain works. One significant step towards achieving that goal was the release of CC Search in 2019. Through this search and indexing tool, we’re making a plethora of CC-licensed images accessible in one place. As CC Search expands to include more than just images, CC is also developing a suite of applications and interfaces to help users across the world interact, consume, and reuse open access content.

The CC Search Browser Extension is one such application. This browser extension is an open-source, lightweight plugin that can be installed and used by anyone with an updated web browser.

Why did we create this browser extension?

Browsers are the gateway to the web, and users often install browser plugins to improve productivity and overall experience. With the CC Search Browser Extension, users can now search for CC-licensed images, download them, and attribute the owner/creator without needing to head over to Flickr, Behance, Rawpixel or any other source of CC-licensed content. The other great feature? The CC Search Browser Extension works across different browsers, providing a familiar and intuitive experience for all users.

Key features of the CC Search Browser Extension: 

  • Search and filter CC-licensed content

You can use the extension filters to filter the content by the source website, types of licenses, and/or use-case.

  • One-click attribution

One condition of all CC licenses is attribution. Attributing the owner/creator of CC-licensed content found using the extension is easy with one-click attribution. Both the Rich-text and HTML versions of the attribution are available.
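As a rough illustration of what producing a plain-text and an HTML attribution might involve, here is a minimal Python sketch. It is not the extension's actual code: the function name `build_attribution`, its parameters, and the exact wording of the attribution are assumptions made for this example only.

```python
from html import escape


def build_attribution(title, creator, license_name, license_url, source_url):
    """Return (plain_text, html) attribution strings for one work.

    Hypothetical helper: the field names and the sentence layout are
    illustrative assumptions, not the extension's real format.
    """
    plain = (
        f'"{title}" by {creator} is licensed under '
        f"{license_name} ({license_url}). Source: {source_url}"
    )
    # Escape every value so that titles or names containing &, < or "
    # cannot break the generated markup.
    html = (
        f'<a href="{escape(source_url)}">{escape(title)}</a> by {escape(creator)} '
        f'is licensed under <a href="{escape(license_url)}">{escape(license_name)}</a>'
    )
    return plain, html


if __name__ == "__main__":
    text, markup = build_attribution(
        "Sunset",
        "A. Example",
        "CC BY 4.0",
        "https://creativecommons.org/licenses/by/4.0/",
        "https://example.org/sunset",
    )
    print(text)
    print(markup)
```

Keeping the title, creator, source, and license together in one string is what lets an attribution travel with the image when it is reused elsewhere.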

  • Download images (and attribution)

Download the image to use it in your works through the extension itself. You can also download the attribution information as a text file along with the image; this can be helpful when downloading multiple images in a single session.

  • Bookmark images

Bookmarking the images will save them in the extension. You can view and remove your bookmarks from the bookmarks section.

  • Export and import bookmarks

As a user, you can easily archive and/or transfer your bookmarks. This feature makes sure that the process of archiving and transferring bookmarks is uncomplicated and straightforward.

  • User-interface (UI) options available for custom settings

The extension also allows for setting default filters, etc. The “Options” page helps declutter the main popup of the extension, ensuring that it shows only the most necessary information. In the future, this “Options” page will also host additional and updated features.

  • Sync your custom settings and bookmarks across devices

Chrome and Firefox have a built-in feature that syncs browser settings and preferences across your logged-in devices. The extension leverages this feature to sync your custom settings and bookmarks. This will make your experience more pleasant and familiar. 

  • Dark Mode

The extension also has a dark mode that you can toggle “on” by clicking the icon in the header. This reduces screen glare and battery consumption. You can set the dark mode as default in the “Options” page.

Future plans and development

  • Find and fix bugs
  • Add a review and feedback tab on the “Options” page
  • Integrate Vocabulary into the extension
  • Develop usability enhancements
  • Remove infinite scrolling and replace it with pagination or voluntary loading
  • Add search syntax for better specificity of results and a search syntax guide
  • Make the code more modular and add more tests
  • Port the features of the CC Search web application that are relevant in the context of the browser plugin

Installation

The latest version of the extension is available for installation via Mozilla Firefox, Google Chrome, and Opera.

Join the community

Community contribution and feedback are an essential part of the development process, so we encourage you to contact us if you have feedback or a specific suggestion. This is an open-source project; you can contribute in the form of bug reports, feature requests, or code contributions.

To install the development version of the extension, read the installation guide on GitHub.

Finally, come and tell us about your experience on the Creative Commons Slack via the Slack channel: #cc-dev-browser-extension.

The post Introducing the CC Search Browser Extension appeared first on Creative Commons.

Using ORCID to Re-imagine Research Attribution | ORCID

“The objective of Rescognito is not to “disrupt” or to “dis-intermediate”, but to work with existing scholarly societies and other participants, keeping them at the heart of research evaluation and reputation management. Rescognito does not store content, it is not a social network nor workflow system; it is just a thin layer exclusively focused on recognition of a wide variety of research contributions. 

Using our platform, recognition is attributed using a counter called a “COG” (short for ReCOGnition) and the ORCID iD of the person granting the recognition. By themselves COG totals are a relatively superficial metric; but because they are open, transparent and attributable, we anticipate that layers of analytics, visualization and possibly AI will provide valuable insights into research trends and people.

We use the CRediT taxonomy, supplemented with a continuously-evolving list of home-grown recognition reasons (feedback welcome!) useful for recognizing non-article-based contributions and non-science works in the humanities and arts….

Thanks to ORCID our system can reliably identify research professionals (for example, the aforementioned Stephen Curry, along with his works: https://rescognito.com/0000-0002-0552-8870)….

Rescognito also allows self-recognition as a way to claim/assign CRediT for a previously published work (for example, https://rescognito.com/0000-0002-0673-1360)….”

Breaking down the walls of scientific secrecy | CBC News

“Getting scooped by a competing researcher is one of a scientist’s biggest fears. And some of the most important discoveries in medical history have been tainted by competitive controversy.

Back in 1952, before he co-discovered the structure of DNA, James Watson got access to Rosalind Franklin’s revolutionary X-ray image of DNA without her knowledge.

That image, known as Photo 51, was a major clue that helped Watson and Francis Crick complete their Nobel Prize-winning discovery. The lack of credit given to Franklin remains a stain on the story of their breakthrough.

But what if Franklin had been informally publishing her research notes all along?

“She would have gotten credit instantly for her contribution,” said Susan Lamb, a historian of medicine who holds the Hannah Chair in the History of Medicine at the University of Ottawa….”