Abstract: Data-driven computational analysis is becoming increasingly important in biomedical research as the amount of data being generated continues to grow. However, the lack of practices for sharing research outputs, such as data, source code, and methods, affects the transparency and reproducibility of studies, which are critical to the advancement of science. Many published studies are not reproducible due to insufficient documentation, code, and data being shared. We conducted a comprehensive analysis of 453 manuscripts published between 2016 and 2021 and found that 50.1% of them fail to share the analytical code. Even among those that did disclose their code, a vast majority failed to offer additional research outputs, such as data. Furthermore, only one in ten papers organized their code in a structured and reproducible manner. We discovered a significant association between the presence of code availability statements and increased code availability (p=2.71×10⁻⁹). Additionally, a greater proportion of studies conducting secondary analyses were inclined to share their code compared to those conducting primary analyses (p=1.15×10⁻⁷). In light of our findings, we propose raising awareness of code sharing practices and taking immediate steps to enhance code availability to improve reproducibility in biomedical research. By increasing transparency and reproducibility, we can promote scientific rigor, encourage collaboration, and accelerate scientific discoveries. We must prioritize open science practices, including sharing code, data, and other research products, to ensure that biomedical research can be replicated and built upon by others in the scientific community.
Abstract: Computable biomedical knowledge artifacts (CBKs) are software programs that transform input data into practical output. CBKs are expected to play a critical role in the future of learning health systems. While there has been rapid growth in the development of CBKs, broad adoption is hampered by limited verification, documentation, and dissemination channels. To address these issues, the Learning Health Systems journal created a track dedicated to publishing CBKs through a peer-review process. Peer review of CBKs should improve reproducibility, reuse, trust, and recognition in biomedical fields, contributing to learning health systems. This special issue introduces the CBK track with four manuscripts reporting a functioning CBK, and another four manuscripts tackling methodological, policy, deployment, and platform issues related to fostering a healthy ecosystem for CBKs. It is our hope that the potential of CBKs exemplified and highlighted by these quality publications will encourage scientists within learning health systems and related biomedical fields to engage with this new form of scientific discourse.
“Since October 2012, Wikidata has evolved considerably to become one of the most important open knowledge graphs, providing semantic knowledge about various topics in multiple languages. This effort includes the development of quality information for biomedicine that can be reused for clinical decision support, among other very important tasks.
In 2019, we conducted a research study to assess the coverage of health-related information in Wikidata and found that it lacks support for various important types of information and that a significant set of biomedical relations has limited precision and is not linked to references. Despite the use of crowdsourcing and human editing, the situation has not evolved as it should. We needed a hack to change the game.
MeSH Keywords as a valuable resource
MeSH (Medical Subject Headings) keywords play a pivotal role in the realm of biomedical knowledge representation, making them a valuable resource in various aspects of healthcare research and practice. Each keyword is composed of a heading providing the main topic of a research paper and a qualifier identifying the facet of the topic that is discussed by the paper….”
Abstract: With the development of artificial intelligence (AI) technologies, biomedical imaging data play an important role in scientific research and clinical application, but the available resources are limited. Here we present Open Biomedical Imaging Archive (OBIA), a repository for archiving biomedical imaging and related clinical data. OBIA adopts five data objects (Collection, Individual, Study, Series, and Image) for data organization and accepts the submission of biomedical images of multiple modalities, organs, and diseases. In order to protect personal privacy, OBIA has formulated a unified de-identification and quality control process. In addition, OBIA provides a friendly and intuitive web interface for data submission, browsing, and retrieval, as well as image retrieval. As of September 2023, OBIA has housed data for a total of 937 individuals, 4136 studies, 24,701 series, and 1,938,309 images covering 9 modalities and 30 anatomical sites. Collectively, OBIA provides a reliable platform for biomedical imaging data management and offers free open access to all publicly available data to support research activities throughout the world. OBIA can be accessed at https://ngdc.cncb.ac.cn/obia.
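The five nested data objects described in the abstract form a containment hierarchy (a Collection holds Individuals, which hold Studies, and so on down to Images). As a rough illustration only, the sketch below models such a hierarchy in Python; all class names, fields, and the counting helper are assumptions for exposition, not OBIA's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch of a five-level imaging hierarchy
# (Collection > Individual > Study > Series > Image).
# Field names are hypothetical, not OBIA's real data model.

@dataclass
class Image:
    image_id: str
    modality: str          # e.g. "MRI", "CT"
    anatomical_site: str   # e.g. "brain"

@dataclass
class Series:
    series_id: str
    images: List[Image] = field(default_factory=list)

@dataclass
class Study:
    study_id: str
    series: List[Series] = field(default_factory=list)

@dataclass
class Individual:
    # Identifier would be de-identified per the archive's privacy process.
    individual_id: str
    studies: List[Study] = field(default_factory=list)

@dataclass
class Collection:
    collection_id: str
    individuals: List[Individual] = field(default_factory=list)

    def image_count(self) -> int:
        """Total number of images across all nested levels."""
        return sum(
            len(series.images)
            for ind in self.individuals
            for study in ind.studies
            for series in study.series
        )
```

The nesting mirrors how imaging archives typically aggregate their summary statistics (individuals, studies, series, images) at each level of the hierarchy.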
“Biomaterial sharing offers enormous benefits for research and for the scientific community. Individuals, funders, institutions, and journals can overcome the barriers to sharing and work together to promote a better sharing culture….”
Abstract: Academic journals have been publishing the results of biomedical research for more than 350 years. Reviewing their history reveals that the ways in which journals vet submissions have changed over time, culminating in the relatively recent appearance of the current peer-review process. Journal brand and Impact Factor have meanwhile become quality proxies that are widely used to filter articles and evaluate scientists in a hypercompetitive prestige economy. The Web created the potential for a more decoupled publishing system in which articles are initially disseminated by preprint servers and then undergo evaluation elsewhere. To build this future, we must first understand the roles journals currently play and consider what types of content screening and review are necessary and for which papers. A new, open ecosystem involving preprint servers, journals, independent content-vetting initiatives, and curation services could provide more multidimensional signals for papers and avoid the current conflation of trust, quality, and impact. Academia should strive to avoid the alternative scenario, however, in which stratified publisher silos lock in submissions and simply perpetuate this conflation.
Abstract: Background Preprints are scientific manuscripts that are made available on open-access servers but are not yet peer reviewed. While preprints are becoming more prevalent, uptake is not uniform or optimal. Understanding researchers’ opinions and attitudes towards preprints is valuable to their successful implementation. Understanding knowledge gaps and researchers’ attitudes toward preprinting can assist stakeholders like journals, funding agencies, and universities to implement preprints more effectively. Here, we aim to collect perceptions and behaviours regarding preprints across an international sample of biomedical researchers.
Methods Biomedical authors were identified by a keyword-based, systematic search from the MEDLINE database, and their emails were extracted to invite them to our survey. A cross-sectional anonymous survey was distributed to all identified biomedical authors to collect their knowledge, attitudes, and opinions about preprinting.
Results The survey was completed by 730 biomedical researchers, a response rate of 3.20%, and demonstrated a wide range of attitudes and opinions about preprints from authors at various disciplines and career stages around the world. Most respondents were familiar with the concept of preprints, but most had not published a preprint before. The lead author of the project and journal policy had the most impact on decisions to post a preprint, while employers/research institutes had the least impact. Supporting open science practices was the highest-ranked incentive, while increased author visibility was the highest-ranked motivation for publishing preprints.
Conclusion While many biomedical researchers recognize the benefits of preprints, there is still hesitation among others to engage in this practice. This may be due to the general lack of peer review of preprints and little enthusiasm from external organizations, such as journals, funding agencies, and universities. Future work is needed to determine optimal ways to improve researchers’ attitudes through modifications to current preprint systems and policies.
“Today’s announcements from the Biden Cancer Moonshot include: …A new “biomedical data fabric toolbox” to advance cancer research progress. ARPA-H is partnering with the National Institutes of Health, the National Cancer Institute (NCI), and other agencies to develop a new Biomedical Data Fabric Toolbox for Cancer. Starting with cancer datasets, this program represents the first step toward transforming data accessibility across all medical domains…”
Abstract: Preprints, manuscripts posted online prior to journal-organized peer-review, are an alternative to the traditional, slow, expensive, and inequitable journal publication system. They enable earlier sharing of research outcomes and can even be used to obtain early feedback on work in progress. We aim to identify such alternative uses of preprints, including works-in-progress that are deposited and updated as preprints (iterative preprints). We rely first on a computational approach to identify alternative preprints that we then qualitatively assess. We aim to communicate our approach and results to the community as an iterative preprint itself. In this version, we present our computational approach and initial exploratory results as we seek feedback on our methodology.
“The purpose of this Request for Information (RFI) is to solicit public comments on the use of Real-World Data (RWD), including Electronic Health Records, for Biomedical and Behavioral Research….
Researchers are increasingly using data collected in real-world settings to augment traditional research studies as well as develop more effective treatments and interventions for patients. These “real-world data (RWD)”, defined by the U.S. Food and Drug Administration, are data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources. Examples of RWD include data derived from electronic health records, medical claims data, data from product or disease registries, and data gathered from other sources (such as digital health technologies) that can inform on health status. While these data hold tremendous promise for biomedical and behavioral research, they can be collected from a variety of sources through multiple mechanisms, creating challenges for researchers and questions for those whose data are being shared.
Importantly, the National Institutes of Health (NIH) is committed to ensuring participant privacy and autonomy are protected in all NIH supported research. As NIH establishes health-related research data platforms that include access to RWD, NIH continues to prioritize maximizing data access while upholding participant preferences regarding the collection and use of their data. Most recently, through an NIH Director Advisory Committee, NIH met with stakeholders to understand their perspectives on benefits and risks of combining and using human datasets, particularly from disparate sources (e.g., research and non-research settings) and how their data should be used in biomedical research. NIH will continue working to incorporate these perspectives in its research studies to build trust and honor participant preferences. Input requested on this RFI will be used to inform NIH’s continuing development of guidance on the use of RWD for research and assist in the planning for appropriate mechanisms and programs for research with RWD….”
Abstract: Rapid developments and methodological divides hinder the study of how scientific knowledge accumulates, consolidates and transfers to the public sphere. Our work proposes using Wikipedia, the online encyclopedia, as a historiographical source for contemporary science. We chose the high-profile field of gene editing as our test case, performing a historical analysis of the English-language Wikipedia articles on CRISPR. Using a mixed-method approach, we qualitatively and quantitatively analyzed the CRISPR article’s text, sections and references, alongside 50 affiliated articles. These, we found, documented the CRISPR field’s maturation from a fundamental scientific discovery to a biotechnological revolution with vast social and cultural implications. We developed automated tools to support such research and demonstrated their applicability to two other scientific fields: coronavirus and circadian clocks. Our method utilizes Wikipedia as a digital and free archive, showing it can document the incremental growth of knowledge and the manner in which scientific research accumulates and translates into public discourse. Using Wikipedia in this manner complements and overcomes some issues with contemporary histories and can also augment existing bibliometric research.
“Wellcome Open Research provides Wellcome-funded researchers a place to rapidly publish any of their results, including data sets, negative results, protocols, case reports, incremental findings as well as more traditional articles.
Wellcome Open Research publishes original research on all topics that receive grant funding from Wellcome. This includes:
humanities and social sciences
public engagement and arts projects, where it includes original research…”
“We are a network of collaborators trying to keep track of and curate interesting open source projects related to neurosciences. If you have a project that you’d like to see listed here or if you know of a project that should be listed, drop us a line, via e-mail or Twitter.”