In 2020, the French Ministry of Higher Education and Research (MESR) launched the Translations and Open Science project to explore how translation technologies can foster multilingualism in scholarly communication and thus help remove language barriers, in line with Open Science principles.
During the initial phase of the project (2020), a first working group of experts in natural language processing and translation published a report setting out recommendations and avenues for experimentation, with a view to establishing a scientific translation service that combines relevant technologies, resources and human skills.
Once developed, the scientific translation service is intended to:
address the needs of different users, including researchers (authors and readers), readers outside the academic community, publishers of scientific texts, dissemination platforms or open archives;
combine specialised language technologies and human skills, in particular adapted machine translation engines and in-domain language resources to support the translation process;
be founded on the principles of open science, hence based on open-source software as well as shareable resources, and used to produce open access translations.
To follow up on these recommendations and lay the foundations of the translation service, the MESR commissioned the OPERAS Research Infrastructure to coordinate a series of preparatory studies in the following areas:
(1) Mapping and collection of scientific bilingual corpora: identifying and defining the conditions for collecting and preparing corpora of bilingual scientific texts, which will serve as training data for specialised translation engines and as source data for terminology extraction and translation memory creation.
(2) Use case study for a technology-based scientific translation service: drafting an overview of current translation practices in scholarly communication and defining the use cases of a technology-based scientific translation service (associated features, expected quality, editorial and technical workflows, and the human experts involved).
(3) Machine translation evaluation in the context of scholarly communication: evaluating how well a set of translation engines translates specialised texts.
(4) Roadmap and budget projections: making budget projections to anticipate the costs of developing and running the service.
The four preparatory studies are planned over a one-year period starting in September 2022.
The present call for tenders covers only study (3), Machine translation evaluation in the context of scholarly communication.