“Nightingale Open Science is a platform that connects researchers with world-class medical data. We work closely with health systems around the world to create and curate datasets of medical images linked to ground-truth labels. We carefully deidentify the data and make it available for non-profit research on our cloud infrastructure….
Unfortunately, existing medical data with the potential to shed light on these patterns have historically been siloed. By making this data accessible to broad groups of interdisciplinary researchers, we can begin to unlock discoveries that save lives, surfacing previously unknown patterns of disease….”
“The European Open Science Cloud (EOSC) is a European Commission (EC) initiative to support the development of open science and the digital transformation of research in Europe and further afield. Now in its implementation phase, it aims to develop a “web” of FAIR data and services, providing a multi-disciplinary environment where researchers can publish, find and re-use data, tools and services. The EOSC is complementary to UK efforts to define and adopt open science policies and practices, and the UK contributes to development of the EOSC through participation in implementation projects and in the EOSC Association, a legal entity established to govern the European Open Science Cloud.
As part of its Tech 2 Tech series, Jisc held an EOSC webinar in March 2021 which helped to confirm strong interest in the EOSC across the UK research community. Another Jisc webinar about EOSC will be held on 15 December. This blog provides an update on the numerous activities which have been taking place as part of the ongoing development of the EOSC, and UK engagement with them….”
“Virtual SciDataCon 2021 is organised around a number of thematic strands. This is the third of a series of announcements presenting these strands to the global data community. Please note that registration is free, but participants must register for each session they wish to attend.
For some time there has been recognition of the need for investment in domain specific research infrastructures at a national and sometimes regional level. In recent years, in some countries and regions, there has been a move towards research infrastructures that are both vertically and horizontally integrated: vertically, in the sense that they aim to bring generic e-infrastructure closer to research communities’ needs; horizontally, in the sense that they explicitly aim, by embracing principles of Open Science and FAIR data, to better facilitate interdisciplinary research. Examples include, but are not limited to, the European Open Science Cloud (EOSC), the China Science and Technology Cloud (CSTCloud), the Australian Research Data Commons (ARDC), the Malaysian Open Science Platform, the African Open Science Platform, the planned broadening of LA Referencia in Latin America, as well as Canada’s NDRIO and Germany’s NFDI. The major international data organisations that collaborate in Data Together have complementary activities to define a model for Open Research Commons and to encourage cooperation, alignment and interoperability between Open Science Clouds….”
“Session Title: Developing Cooperation and Alignment Between Open Science Clouds: governance and sustainability, policy and legal, technical infrastructure, data interoperability
Session Organisers: Simon Hodson
Register for the session: https://us02web.zoom.us/meeting/register/tZ0lf-CpqzojG9OqmXdz54QFxeU639vMCrzo
This interactive workshop session will provide an overview of the activities of four thematic working groups established by the Global Open Science Cloud project. Each Working Group will give a short presentation, focusing on the areas which it has identified to share information, develop cooperation and to explore alignment. The presentations will be followed by structured discussion. We invite participants to make recommendations for this work and to help identify areas where cooperation can be supported by the Working Groups….”
“The agreements regulate resources and services necessary for the collection, processing, storage, dissemination and availability of research data.
This initiative is the result of years-long joint efforts by many stakeholders from science and tertiary education in the open science movement, and it was launched with the support of the Ministry of Science and the Croatian Science Foundation.
It creates preconditions for developing the Croatian open science cloud that will enable coordinated development of the country’s e-infrastructure.
The initiative will bring together relevant stakeholders in creating required preconditions for the implementation, realisation, and promotion of open science….”
“Until now, the most advanced climate models have mostly been available to researchers in the wealthiest countries.
New program will see Amazon Web Services’ advanced cloud technologies host 30 climate model simulations and make them available to researchers around the globe….
The resulting free, open access dataset will allow research teams internationally to skirt one of the major barriers to specialized climate modeling, even for those who have the computing capacity to make it happen: cost. Wanser said running the 30 simultaneous simulations would normally cost roughly $700,000, and take two months to run.
The AWS program will cover all costs associated with hosting and sharing data from the cloud, and accessing and downloading it will be free. Grants will be available to users who choose to analyze or run additional models on AWS.”
“Three renowned researchers in digital humanities and computer science are joining forces with the Library of Congress on three inaugural Computing Cultural Heritage in the Cloud projects, exploring how biblical quotations, photographic styles and “fuzzy searches” reveal more about the collections in the world’s largest Library than first meets the eye.
Supported by a $1 million grant from the Andrew W. Mellon Foundation awarded in 2019, the initiative combines cutting edge technology with the Library’s vast collections to support digital humanities research at scale. These three outside researchers will collaborate with subject matter experts and technology specialists at the Library of Congress to experiment in pursuit of answers that can only be achieved with collections and data at scale. These collaborations will enable research on questions previously difficult to address due to technical and data constraints. Expanding the skills and knowledge necessary for this work will enable the Library to support emerging methods in cloud-based computing research such as machine learning, computer vision, interactive data visualization, and other areas of digital humanities and computer science research. As a result, the Library and other cultural heritage institutions may build upon or adapt these approaches for their own use in improving access to text and image collections….”
Abstract: Dockstore (https://dockstore.org/) is an open source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute’s GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.
“In the race to harness the power of cloud computing, and further develop artificial intelligence, academics have a new concern: falling behind a fast-moving tech industry. In the US, 22 higher education institutions, including Stanford and Carnegie Mellon, have signed up to a National Research Cloud initiative seeking access to the computational power they need to keep up. It is one of several cloud projects being called for by academics globally, and is being explored by the US Congress, given the potential of the technology to deliver breakthroughs in healthcare and climate change….”
“In alignment with RDA’s core mission to ‘set international Research Data and Protocol agreements and standards’, the RDA Global Open Research Commons Interest Group (GORC IG) is helping to support coordination amongst regional, national, pan-national and domain-specific organizations. Those organizations are developing the interoperable resources necessary to enable researchers to address societal grand challenges across disciplines, technologies and countries….
The Global Open Science Cloud (GOSC) initiative has its roots in the same series of meetings. It was proposed in 2019 at the CODATA conference in Beijing with the objective to assist the alignment and interoperation of open science cloud activities. GOSC aims to co-design and build a cross-continental, federated e-infrastructure and virtual research environment for global cooperation and open science using harmonized policies, interoperable protocols and transparent services. Network connectivity, secure AAI (Authentication and Authorization Infrastructure), computing federation, FAIR data, and policy alignment are the key components….
While the GORC initiative focuses on a roadmap for commons integration, the GOSC is creating a cooperation mechanism and testbed implementations for science clouds that arise from that roadmap. Developing and sustaining collaboration between GORC and GOSC, through the Data Together partnership, will enhance the impact of each initiative and result in sustainable benefits for the wider research community. In addition, members of the Data Together group are working with the various platforms to convene a roundtable of senior representatives from the organizations to facilitate these efforts.”
Abstract: This paper introduces the Archives Unleashed Cloud, a web-based interface for working with web archives at scale. Current access paradigms, largely driven by the scope and scale of web archives, generally involve using the command line and writing code. This access gap means that subject-matter experts, as opposed to developers and programmers, have few options to directly work with web archives beyond the page-by-page paradigm of the Wayback Machine. Drawing on first-hand research and analysis of how scholars use web archives, we present the interface design and underpinning architecture of the Archives Unleashed Cloud. We also discuss the sustainability implications of providing a cloud-based service for researchers to analyze their collections at scale.
“Big bibliographic datasets hold promise for revolutionizing the scientific enterprise when combined with state-of-the-science computational capabilities. Yet, hosting proprietary and open big bibliographic datasets poses significant difficulties for libraries, both large and small. Libraries face significant barriers to hosting such assets, including cost and expertise, which has limited their ability to provide stewardship for big datasets, and thus has hampered researchers’ access to them. What is needed is a solution to address the libraries’ and researchers’ joint needs. This article outlines the theoretical framework that underpins the Collaborative Archive and Data Research Environment project. We recommend a shared cloud-based infrastructure to address this need built on five pillars: 1) Community: a community of libraries and industry partners who support and maintain the platform and a community of researchers who use it; 2) Access: the sharing platform should be accessible and affordable to both proprietary data customers and the general public; 3) Data-Centric: the platform is optimized for efficient and high-quality bibliographic data services, satisfying diverse data needs; 4) Reproducibility: the platform should be designed to foster and encourage reproducible research; 5) Empowerment: the platform should empower researchers to perform big data analytics on the hosted datasets. In this article, we describe the many facets of the problem faced by American academic libraries and researchers wanting to work with big datasets. We propose a practical solution based on the five pillars: The Collaborative Archive and Data Research Environment. Finally, we address potential barriers to implementing this solution and strategies for overcoming them.”