Artificial Intelligence for Public Domain Drug Discovery: Recommendations for Policy Development

“The current drug discovery market is not responding sufficiently to health care needs where it is not adequately lucrative to do so. Unfortunately, there are a number of important yet non-lucrative fields of research in domains including pandemic prevention and antimicrobial resistance, with major current and future costs for society. In these domains, where high-risk public health needs are being met with low R&D investment, government intervention is critical. To maximize the efficiency of the government’s involvement, it is recommended that the government couple its work catalyzing R&D with the creation of a drug development ecosystem that is more conducive to the use of high-impact artificial intelligence (AI) technologies.

The scientific and political communities have been ringing alarm-bells over the threat of bacterial resistance to our current antibiotics arsenal and, more generally, the evolving resistance of microbes to existing drugs. Yet, a combination of technical capacity issues and economic barriers has led to an almost complete halt of R&D into treatments that would otherwise address this threat. When a gap arises between what the market is incentivized to produce and the healthcare needs of society, governments must step in. The COVID-19 pandemic illustrates the importance of bridging that gap to ensure we are protected from future threats that would result in similarly devastating consequences.

Artificial intelligence (AI) capabilities have contributed to watershed moments across a variety of industries already. The transformative power of AI is showing early signs of success in the drug discovery industry as well. Should AI for drug discovery reach its full potential, it offers the ability to discover new categories of effective drugs, enable intelligent, targeted design of novel therapies, vastly improve the speed and cost of running clinical trials, and further our understanding about the basic science underlying drug and disease mechanics.

However, the current drug discovery ecosystem is suboptimal for AI research, and this threatens to limit the positive impact of AI. The field requires a shift towards open data and open science in order to feed the most powerful, data-hungry AI algorithms. This shift will catalyze research in areas of high social impact, such as addressing neglected diseases and developing new antibiotic solutions to incoming drug-resistant threats. Yet, while open science and AI promise successes on producing new compounds, they cannot address the challenges associated with market-failure for certain drug categories. Government interventions to stimulate AI-driven pharmaceutical innovation for these drug categories must therefore target the entire drug development and deployment lifecycle to ensure that the benefits of AI technology, as applied to the pharmaceutical industry, result in strong value added to improve healthcare outcomes for the public….

This document puts forward a set of recommendations that, taken together, task governments with the responsibility to promote:

1. Research and development in fields of drug discovery that are valuable to society and necessary to public health, but for which investments are currently insufficient because of market considerations.
2. Uptake of AI throughout the entire drug discovery and development pipeline.
3. A shift in culture and capabilities towards more open-data among stakeholders in academia and industry when undertaking research on drug discovery and development….”

AI-assisted drug discovery held back by private sector secrecy on datasets | Science|Business

“The discovery of new drugs is being held back because pharmaceutical firms are not sharing their data, limiting the potentially revolutionary impact of artificial intelligence on the field, according to AI experts….

Last year, for example, a team at the Massachusetts Institute of Technology reported discovering a new antibiotic compound using a computer model that can screen more than 100 million compounds in a matter of days.

But such breakthroughs are being hampered by a lack of data sharing by private companies, stymying efforts to use powerful AI models to improve healthcare, said Yoshua Bengio, an AI pioneer at the University of Montreal and one of the leaders of an OECD-backed investigation into the issue.

“The lack of open datasets is a failure of the principle of profit maximization by individual actors,” he said.

Releasing datasets “hurts their competitiveness, even though it would help the overall market to progress faster to technological solutions,” Bengio said. …

“The field requires a shift towards open data and open science in order to feed the most powerful, data-hungry AI algorithms,” says Artificial Intelligence for Public Domain Drug Discovery, presented at the annual conference of the Global Partnership on Artificial Intelligence (GPAI), an initiative launched in 2020 under French and Canadian leadership.

In the academic community, data sharing has taken off, and is now mandatory under most government funded grants, said Bengio. Researchers are rewarded through downstream citations if they allow others to use their data.

But the incentives for the private sector are still to keep data closed. Companies need to be encouraged to share their data, “by force of contract and financial rewards for doing the right things”, Bengio said. The GPAI report also calls for government intervention to “strongly encourage” data-sharing….”

Senators unveil bipartisan bill requiring social media giants to open data to researchers | TheHill

“Meta and other social media companies would be required to share their data with outside researchers under a new bill announced by a bipartisan group of senators on Thursday. …

The bill, the Platform Accountability and Transparency Act, would allow independent researchers to submit proposals to the National Science Foundation. If the requests are approved, social media companies would be required to provide the necessary data subject to certain privacy protections. …”

FDA looks on while major U.S. institutions violate medical research rules

“The FDA has issued warnings to only a handful of the companies and institutions with the worst track records of violating a key clinical trial disclosure law, a new report finds.

Out of 51 large US-based companies and institutions that have failed to make five or more clinical trial results public, only three have been contacted by the U.S. drug regulator, and only one has received a final warning, FDA enforcement data show.

Failing to rapidly make clinical trial results public on the American trial registry harms patients because it slows down medical progress, leaves gaps in the medical evidence base, and wastes public funds. …”

Giving drug researchers control of their data

“Drug industry–led efforts, like the Allotrope Foundation, have advanced common terms for data management, Plasterer says. Most recently, the FAIR principles—guidelines for ensuring data in storage are findable, accessible, interoperable, and reusable—have been adopted by drug companies including AstraZeneca and Pfizer….”

Yes, Alternative Proteins Really Do… | The Breakthrough Institute

“Federally funded research dramatically lowers barriers for scientists in the public and private sector to conduct research and accelerate technological development. Unlike its private counterpart, federally funded research can be open-access and makes knowledge and technologies publicly available. Open-access research benefits everyone, companies and academic researchers alike, and would prevent the siloing of intellectual property within specific companies. Such non-proprietary technology and knowledge can help bring new competitors into the market and help drive both competition and further innovation based on the open-access findings. Although open-access research will also benefit incumbents to the industry, federal support to develop the alternative can build the alternative protein industry’s capacity to compete with conventional products in the long term….”

Drug discovery project shows potential of smart openness – Research Professional News

“Commitment to sharing doesn’t mean you can’t work with industry, say Hamish Evans and colleagues

There are several ways for scientific research and innovation to have an impact on society. Different routes to impact are, however, often seen as being in tension. In particular, commercialisation and open science can sometimes seem to be mutually exclusive….”

Does open access to academic research help small, science-based companies? | Emerald Insight

Abstract: Purpose

This study investigates the extent to which a company’s usage of open access (OA) literature for R&D activities depends on its size. The authors’ assumption is that smaller pharmaceutical companies have less access to (usually expensive) journal subscriptions.

A fixed-effect Poisson model was used to study a panel dataset of USPTO pharmaceutical company patents. The dependent variable is the count of citations to OA resources in a given company patent.

Results support current anecdotal evidence that many SMEs suffer from high journal prices.

This result justifies the assumption made by policymakers about the potentially positive impact OA mandates have on national innovation activity. It was also shown that collaborating with universities can be a potential coping mechanism for companies that struggle to gain access to the journals they need. In addition to the novelty of its findings, this study introduces a new way to study the impact of OA in nonacademic contexts.

SocArXiv Papers | Dynamics of Cumulative Advantage and Threats to Equity in Open Science – A Scoping Review

Open Science holds the promise to make scientific endeavours more inclusive, participatory, understandable, accessible, and re-usable for large audiences. However, making processes open will not per se drive wide re-use or participation unless also accompanied by the capacity (in terms of knowledge, skills, financial resources, technological readiness and motivation) to do so. These capacities vary considerably across regions, institutions and demographics. Those advantaged by such factors will remain potentially privileged, putting Open Science’s agenda of inclusivity at risk of propagating conditions of “cumulative advantage”. With this paper, we systematically scope existing research addressing the question: “What evidence and discourse exists in the literature about the ways in which dynamics and structures of inequality could persist or be exacerbated in the transition to Open Science, across disciplines, regions and demographics?” Aiming to synthesise findings, identify gaps in the literature, and inform future research and policy, our results identify threats to equity associated with all aspects of Open Science, including Open Access, Open/FAIR Data, Open Methods, Open Evaluation, Citizen Science, as well as its interfaces with society, industry and policy. Key threats include: stratifications of publishing due to the exclusionary nature of the author-pays model of Open Access; potential widening of the digital divide due to the infrastructure-dependent, highly situated nature of open data practices; risks of diminishing qualitative methodologies as “reproducibility” becomes synonymous with quality; new risks of bias and exclusion in means of transparent evaluation; and crucial asymmetries in the Open Science relationships with industry and the public, which privileges the former and fails to fully include the latter.

Accurate, open data is crucial to cross-sector grid planning and disaster prevention – Geospatial World

“A particularly promising example of the kind of collective, cross-sector response needed to address this issue comes in the form of utility companies opening grid data up to competitors and even customers. Western Power Distribution has launched an open-access web portal offering detailed data on everything from consumption to generation across its network. The City of London is also working with utility companies to create a combined on-demand digital map of its subterranean pipes and cables where workers can see nearby underground infrastructure on mobile phones or laptop computers before a dig.

Geospatial data on the location and condition of frozen gas pipes could help to protect other underground infrastructure and avert disasters. Data predicting how vegetation growth might impact electricity lines could help a telecoms network operator anticipate potential interference with millimeter waves from nearby 5G antennae. In another example, we are working to integrate IBM Weather Group’s LIDAR and satellite data with geospatial network information to help electrical utilities predict and prevent encroachment on electric transmission and distribution lines….

The trend towards data sharing requires an industry-wide step-change in the capture and curation of data to ensure all companies have a comprehensive, current picture of their networks and use geospatial information systems built around open design principles. This would ensure a consistent standard of network data is captured and shared across the industry. Rich, real-time, and open data can help foster a utility sector built around cooperation that facilitates a higher standard of network resilience despite the challenging environmental issues we face today.”

Joint Statement on transparency and data integrity International Coalition of Medicines Regulatory Authorities (ICMRA) and WHO

“ICMRA and WHO call on the pharmaceutical industry to provide wide access to clinical data for all new medicines and vaccines (whether full or conditional approval, under emergency use, or rejected). Clinical trial reports should be published without redaction of confidential information for reasons of overriding public health interest….

Regulators continue to spend considerable resources negotiating transparency with sponsors. Both positive and negative clinically relevant data should be made available, while only personal data and individual patient data should be redacted. In any case, aggregated data are unlikely to lead to re-identification of personal data and techniques of anonymisation can be used….

Providing systematic public access to data supporting approvals and rejections of medicines reviewed by regulators, is long overdue despite existing initiatives, such as those from the European Medicines Agency and Health Canada. The COVID-19 pandemic has revealed how essential to public trust access to data is. ICMRA and WHO call on the pharmaceutical industry to commit, within short timelines, and without waiting for legal changes, to provide voluntary unrestricted access to trial results data for the benefit of public health.”

Frontiers | Open Science for private Interests? How the Logic of Open Science Contributes to the Commercialization of Research | Research Metrics and Analytics

Abstract:  Financial conflicts of interest, several cases of scientific fraud, and research limitations from strong intellectual property laws have all led to questioning the epistemic and social justice appropriateness of industry-funded research. At first sight, the ideal of Open Science, which promotes transparency, sharing, collaboration, and accountability, seems to target precisely the type of limitations uncovered in commercially-driven research. The Open Science movement, however, has primarily focused on publicly funded research, has actively encouraged liaisons with the private sector, and has also created new strategies for commercializing science. As a consequence, I argue that Open Science ends up contributing to the commercialization of science, instead of overcoming its limitations. I use the examples of research publications and citizen science to illustrate this point. Accordingly, the asymmetry between private and public science, present in the current plea to open science, ends up compromising the values of transparency, democracy, and accountability.

Promoting versatile vaccine development for emerging pandemics | npj Vaccines

“In this case, the generated knowledge is a clear example of a positive spillover that creates a need for public intervention into the market for research and development. However, this relies on the results of translatable work on prototype pathogens—such as insights into antigen optimisation—being accessible to public use. Therefore, public funding of prototype pathogen work should seek to promote research that generates openly accessible and translatable insights as far as practicable, while also judiciously taking advantage of generating proprietary intellectual property. Even in the cases where a proprietary insight might primarily benefit the originating organisation, such as early preclinical evidence and safety data from clinical trials, the research remains worthy of subsidy because society benefits from having developers that are better prepared to respond to emerging infectious diseases….”

Bill Gates, Vaccine Monster | The New Republic

“Battle-scarred veterans of the medicines-access and open-science movements hoped the immensity of the pandemic would override a global drug system based on proprietary science and market monopolies. By March, strange but welcome melodies could be heard from unexpected quarters. Anxious governments spoke of shared interests and global public goods; drug companies pledged “precompetitive” and “no-profit” approaches to development and pricing. The early days featured tantalizing glimpses of an open-science, cooperative pandemic response. In January and February 2020, a consortium led by the National Institutes of Health and the National Institute of Allergy and Infectious Diseases collaborated to produce atomic-level maps of the key viral proteins in record time. “Work that would normally have taken months—or possibly even years—has been completed in weeks,” noted the editors of Nature. …

By then, however, the optimism and sense of possibility that defined the early days were long gone. Advocates for pooling and open science, who seemed ascendant and even unstoppable that winter, confronted the possibility they’d been outmatched and outmaneuvered by the most powerful man in global public health.

In April, Bill Gates launched a bold bid to manage the world’s scientific response to the pandemic. Gates’s Covid-19 ACT-Accelerator expressed a status quo vision for organizing the research, development, manufacture, and distribution of treatments and vaccines. Like other Gates-funded institutions in the public health arena, the Accelerator was a public-private partnership based on charity and industry enticements. Crucially, and in contrast to the C-TAP, the Accelerator enshrined Gates’s long-standing commitment to respecting exclusive intellectual property claims. Its implicit arguments—that intellectual property rights won’t present problems for meeting global demand or ensuring equitable access, and that they must be protected, even during a pandemic—carried the enormous weight of Gates’s reputation as a wise, beneficent, and prophetic leader. …

“Early on, there was space for Gates to have a major impact in favor of open models,” says Manuel Martin, a policy adviser to the Médecins Sans Frontières Access Campaign. “But senior people in the Gates organization very clearly sent out the message: Pooling was unnecessary and counterproductive. They dampened early enthusiasm by saying that I.P. is not an access barrier in vaccines. That’s just demonstratively false.”…

“Things could have gone either way,” says Love, “but Gates wanted exclusive rights maintained. He acted fast to stop the push for sharing the knowledge needed to make the products—the know-how, the data, the cell lines, the tech transfer, the transparency that is critically important in a dozen ways. The pooling approach represented by C-TAP included all of that. Instead of backing those early discussions, he raced ahead and signaled support for business-as-usual on intellectual property by announcing the ACT-Accelerator in March.” …”