“Grambank was constructed in an international collaboration between the Max Planck institutes in Leipzig and Nijmegen, the Australian National University, the University of Auckland, Harvard University, Yale University, the University of Turku, Kiel University, Uppsala University, SOAS, the Endangered Languages Documentation Programme, and over a hundred scholars from around the world. Grambank is designed to be used to investigate the global distribution of features, language universals, functional dependencies, language prehistory and interactions between language, cognition, culture and environment. The Grambank database currently covers 2,467 language varieties, capturing a wide range of grammatical phenomena in 195 features, from word order to verbal tense, nominal plurals, and many other well-studied comparative linguistic variables. Grambank’s coverage spans 215 different language families and 101 isolates from all inhabited continents. The aim is for Grambank to ultimately cover all languages for which a grammar or sketch grammar exists. Grambank is part of Glottobank, a research consortium that involves work on complementary databases of lexical data, paradigms, numerals and sound patterns in the world’s languages. Grambank can be used in concert with other databases, such as those in Glottobank and D-PLACE, to deepen our understanding of our history and communicative capabilities.”
Abstract: In their article, Marsden and Morgan-Short comprehensively review the current state and development trajectories for key areas within open research practices, both in general as well as more particularly in the context of language sciences. As the article reveals, the scope of open research practices is enormous and essentially touches upon every aspect of performing and interacting with research. The authors touch upon the lack of an established metascience within language sciences that would help inform and guide development of research practices, but, as I see it, the problem is universal, and there would be benefit in creating a stronger and more cohesive metascience discipline in general. While researchers have established practices of research, education, and dedicated scholarly communication outlets within the philosophy of science, history of science, information science, and higher education policy, metascience has remained an area where the discussion is highly distributed and appears sporadically across diverse research disciplines. As Marsden and Morgan-Short’s review demonstrates, there are a lot of open questions relating to how to move forward on a global scale in the best interest of research and researchers. A more cohesive core of metascience would aid in the creation of immediately useful knowledge.
Abstract: Open research practices are relevant to all stages of research, from conceptualization through dissemination. Here, we discuss key facets of open research, highlighting its rationales, infrastructures, behaviors, and challenges. Part I conceptualizes open research and its rationales. Part II identifies challenges such as the speed and cost of open research, the usability of open data and materials, the difficulties of conducting replication research, and the economics and sustainability of open access and open research generally. In discussing these challenges, we have sought to provide examples of good practice, describe and evaluate emerging innovations, and envision change. Part III considers ongoing coevolutions of culture, infrastructure, and behaviors and acknowledges the limitations of our review and of open research practices. We argue that open research is indeed a large part of our future, and most—if not all—challenges are surmountable, but doing so requires significant changes for many aspects of the research process.
Abstract: While we often think of words as having a fixed meaning that we use to describe a changing world, words are also dynamic and changing. Scientific research can also be remarkably fast-moving, with new concepts or approaches rapidly gaining mind share. We examined scientific writing, both preprint and pre-publication peer-reviewed text, to identify terms that have changed and examine their use. One particular challenge that we faced was that the shift from closed to open access publishing meant that the size of available corpora changed by over an order of magnitude in the last two decades. We developed an approach to evaluate semantic shift by accounting for both intra- and inter-year variability using multiple integrated models. This analysis revealed thousands of change points in both corpora, including for terms such as ‘cas9’, ‘pandemic’, and ‘sars’. We found that the consistent change-points between pre-publication peer-reviewed and preprinted text are largely related to the COVID-19 pandemic. We also created a web app for exploration that allows users to investigate individual terms (https://greenelab.github.io/word-lapse/). To our knowledge, our research is the first to examine semantic shift in biomedical preprints and pre-publication peer-reviewed text, and provides a foundation for future work to understand how terms acquire new meanings and how peer review affects this process.
“We invited a number of (lead) editors to tell us about their journals and the reasons why they chose to work with Openjournals.nl. Sible Andringa, editor-in-chief of the Dutch Journal of Applied Linguistics, kicks off. He feels that the journal has become more attractive to authors since switching to Openjournals and he explains why his editors quit working with a traditional publisher.
Sible Andringa: ‘The journal Dutch Journal of Applied Linguistics (DuJAL) has been around for a long time. It started as the Journal of Applied Linguistics in Articles. The first volume was published in-house in 1976. From the beginning, the journal was published by the Dutch Association of Applied Linguistics Anéla (see www.anela.nl). In 2012, it was decided to change its name. The journal was renamed Dutch Journal of Applied Linguistics and it has since been published by John Benjamins. In January 2021, the journal moved to Openjournals….
With Openjournals, you can choose to offer all that together: pre- and post-prints are not necessary, and all data and instruments can be co-published. The ideal model, if you ask me. We can now also think about all kinds of new forms of publishing, such as publishing conference posters and the like. Those conversations we can now have, because we know it is possible and allowed by the publisher. We find that we have become more attractive to authors now that we are open access and publish on an ongoing basis. There are not huge numbers of submissions right away, but a steady stream of good quality.”
Abstract: Scientific studies of language span across many disciplines and provide evidence for social, cultural, cognitive, technological, and biomedical studies of human nature and behavior. By becoming increasingly empirical and quantitative, linguistics has been facing challenges and limitations of the scientific practices that pose barriers to reproducibility and replicability. One of the proposed solutions to the widely acknowledged reproducibility and replicability crisis has been the implementation of transparency practices, e.g. open access publishing, preregistrations, sharing study materials, data, and analyses, performing study replications and declaring conflicts of interest. Here, we have assessed the prevalence of these practices in 600 randomly sampled journal articles from linguistics across two time points. In line with similar studies in other disciplines, we found a moderate number of articles published open access, but overall low rates of sharing materials, data, and protocols, no preregistrations, very few replications and low rates of conflict of interest reports. These low rates have not increased noticeably between 2008/2009 and 2018/2019, pointing to remaining barriers and slow adoption of open and reproducible research practices in linguistics. As linguistics has not yet firmly established transparency and reproducibility as guiding principles in research, we provide recommendations and solutions for facilitating the adoption of these practices.
This paper aims to provide a context for Brazilian Portuguese language documentation and its data collection to establish linguistic repositories from a sociolinguistic overview.
The main sociolinguistic projects that have generated collections of Brazilian Portuguese language data are presented.
A comparison with another kind of repository (seed vaults) and with the accounting concept of assets is evoked to map the challenges to be overcome in proposing a standardized and professional language repository to host the collections of linguistic data arising from the reported projects and others, in accordance with the principles of the open science movement.
Thinking about the sustainability of projects to build linguistic documentation repositories, partnerships with the information technology area, or even with private companies, could minimize problems of obsolescence and safeguarding of data, by promoting the circulation and automation of analysis through natural language processing algorithms. These planning actions may help to promote the longevity of the linguistic documentation repositories of Brazilian sociolinguistic research.
This paper reports on the first 20 years of the Open Language Archives Community (OLAC), a comprehensive infrastructure for indexing and discovering language resources.
We begin with the original vision, assess progress relative to the original requirements, and identify ongoing challenges.
Based on the overview of OLAC history and recent developments and on the analysis of the situation in the language archives area as a whole, the authors propose an agenda for a more sustainable future for open language archiving.
This paper examines the progress of OLAC and discusses improvements in such areas as participation, access, and sustainability.
Abstract: On 1 September 2022, professor of linguistics and director of cOAlition S Johan Rooryck was created a doctor honoris causa at UiT The Arctic University of Norway. In this in-depth interview, Rooryck reflects on his career so far and shares his vision of a future where scholar-led, fair and equitable open access prevails over commercial publishing structures.
Johan Rooryck starts out by explaining how he became the editor-in-chief of the high-ranking journal Lingua in 1999, how his relations with the publisher Elsevier became increasingly strained, and how he succeeded in bringing all his co-editors along in a sensational break with Elsevier. Instead, they launched the fully open access journal Glossa (now a high-ranking journal of general linguistics) at the platform Open Library of Humanities, in 2015. Rooryck in particular dwells on the non-commercial model known as Diamond Open Access, with no charges facing either readers or authors. Speaking on behalf of Plan S and the cOAlition S, whose executive director he became in 2019, Rooryck also broadens the view to all forms of open access, including open access to books and research data. At the end, he looks ahead to the future, when faced with the final, fundamental question: are you an optimist?
“• Is it legal to post “postprints” online?
• Depends on each publisher’s policies
• We compiled a list of 60 Applied Linguistics journals (from Web of Science)
• Examined their copyright policies from Sherpa Romeo (https://v2.sherpa.ac.uk/romeo/)
• Publishers that permit postprints:
• Cambridge, Elsevier, John Benjamins, SAGE, Emerald, De Gruyter, Akadémiai Kiadó
• Publishers that permit postprints on personal websites only (embargo on repositories):
• Springer, Oxford University Press, Taylor & Francis
• Publishers that do NOT permit postprints before an embargo period:
• Wiley (usually 24-month embargo)…
What this Pledge is NOT asking you to do:
• Does not ask you to break any laws. Sharing postprints is within your rights (see table next slide).
• Does not ask you to share “preprints” but to share “postprints”.
• Does not limit you to publishing in these journals.
• Does not require you to do anything else (like boycotting certain publishers or not reviewing for them)….”
Abstract: The past decades have seen substantial growth in digital data on the world’s languages. At the same time, the demand for cross-linguistic datasets has been increasing, as witnessed by numerous studies devoted to diverse questions on human prehistory, cultural evolution, and human cognition. Unfortunately, most published datasets lack standardization which makes their comparison difficult. Here, we present a new approach to increase the comparability of cross-linguistic lexical data. We have designed workflows for the computer-assisted lifting of datasets to Cross-Linguistic Data Formats, a collection of standards that make these datasets more Findable, Accessible, Interoperable, and Reusable (FAIR). We test the Lexibank workflow on 100 lexical datasets from which we derive an aggregated database of wordlists in unified phonetic transcriptions covering more than 2000 language varieties. We illustrate the benefits of our approach by showing how phonological and lexical features can be automatically inferred, complementing and expanding existing cross-linguistic datasets.
“Scholars from the Max Planck Institute for Evolutionary Anthropology in Germany and the University of Auckland in New Zealand have created a new global repository of linguistic data. The project is designed to facilitate new insights into the evolution of words and sounds of the languages spoken across the world today. The Lexibank database contains standardized lexical data for more than 2000 languages. It is the most extensive publicly available collection compiled so far….”
“[Q] When did you first engage with open access and why is it important for you as an academic, also considering the different roles you have in the scholarly communication system (reader, editor(-in-chief), advisory board member)?
[A] I became actively interested in open access around 2011/2012, when Timothy Gowers launched the Elsevier boycott and the Cost of Knowledge protest against Elsevier’s expensive subscriptions and journal bundling. A number of very good reviewers informed me that they would no longer review for Lingua, the Elsevier journal I had been an editor for since 1999. This was worrisome, because without access to the right reviewers, a journal cannot maintain its peer review processes. So I started to think about alternatives. In 2011, I had also met Saskia de Vries, who at that time was director of Amsterdam University Press, and who provocatively asked me if I was not interested in flipping Lingua to open access, and what would be required to do so. That conversation led to many more contacts, including Natalia Grygierszyk, director of the Radboud University Library in Nijmegen, and we jointly decided to look into possibilities to make Lingua open access….”
“Another challenge was governance: what does it mean to have a journal owned and led by scholars? How does that work? How do we imagine ownership in such a way that the journal cannot be bought by a commercial entity in the future? That is a process that we laid down in the Glossa Constitution, a document that specified that the Glossa title is in the hands of the community, and represents no monetary value. Recently, we were offered 300k to sell the journal title. We made that simply impossible via this Constitution, so there cannot even be a temptation. And you can easily understand why someone would want to offer 300k for a journal like ours: we publish between 120 and 150 articles a year. If a commercial publisher were to charge 2,000 euros per article, that could mean a gross income of 300k per year. Deduct costs of about 500 euros per article for production and manuscript handling, and you are left with a tidy profit of 225k. …”
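The 300k and 225k figures in the excerpt above correspond to the upper end of the stated range, 150 articles a year. A minimal sketch of that arithmetic follows; the function name and the assumption of a flat fee and flat cost per article are illustrative and are not taken from the interview.

```python
def annual_profit(articles: int, fee_eur: int, cost_eur: int) -> int:
    """Gross income minus per-article costs for one year, in euros.

    Hypothetical helper: assumes every article pays the same fee
    and incurs the same production and handling cost.
    """
    return articles * (fee_eur - cost_eur)


# Figures quoted in the interview: a 2,000 euro charge and
# about 500 euros of costs per article.
FEE_EUR = 2000
COST_EUR = 500

for articles in (120, 150):
    gross = articles * FEE_EUR
    profit = annual_profit(articles, FEE_EUR, COST_EUR)
    print(f"{articles} articles: gross {gross:,} EUR, profit {profit:,} EUR")
```

At 150 articles this reproduces the quoted gross income of 300,000 euros and profit of 225,000 euros; at 120 articles the same assumptions give 240,000 and 180,000 euros.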