Why some researchers oppose unrestricted sharing of coronavirus genome data

“Global-south scientists say that an open-access movement led by wealthy nations deprives them of credit and undermines their efforts….

But a growing faction of scientists, mostly from wealthy nations, argues that sequences should be shared on databases with no gatekeeping at all. They say this would allow huge analyses combining hundreds of thousands of genomes from different databases to flow seamlessly, and therefore deliver results more rapidly.

The debate has caught the attention of the US National Institutes of Health (NIH) — which runs its own genome repository, called GenBank — and the Bill & Melinda Gates Foundation, which has considered encouraging grantees to share on sites without such strong protections, Nature has learnt.

But many researchers — particularly those in resource-limited countries — are pushing back. They tell Nature that they see potential for exploitation in this no-strings-attached approach — and that GISAID’s gatekeeping is one of its biggest attractions because it ensures that users who analyse sequences from GISAID acknowledge those who deposited them. The database also requests that users seek to collaborate with the depositors….

Fears of inequitable data use are amplified by the fact that only 0.3% of COVID-19 vaccines have gone to low-income countries. “Imagine Africans working so hard to contribute to a database that’s used to make or update vaccines, and then we don’t get access to the vaccines,” says Christian Happi, a microbiologist at the African Centre of Excellence for Genomics of Infectious Diseases in Ede, Nigeria. “It’s very demoralizing.” …”

Critics decry access, transparency issues with key trove of coronavirus sequences | Science | AAAS

“In December 2020, software developer Angie Hinrichs at the University of California, Santa Cruz (UCSC), applied for access to a labor-saving data feed from GISAID, a nonprofit database of viral sequences including those of the pandemic coronavirus, SARS-CoV-2. She wanted GISAID’s data so she could display mutations on UCSC’s coronavirus Genome Browser. That tool ties any position in the virus’ nearly 30,000-letter genome to other scientific information, much as Google Maps shows gas stations and restaurants near addresses.

With more than 700,000 genomes from more than 160 countries, GISAID is by far the world’s largest database of SARS-CoV-2 sequences. Access to the free, nonprofit repository has become vital to Hinrichs and thousands of other scientists and public health agencies tracking the virus’ alarmingly rapid evolution.

But instead of getting a direct data feed, Hinrichs lost her existing access to two conveniently packaged GISAID files that are the next best thing. She emailed GISAID repeatedly pleading for restored access, but hasn’t gotten it. Since December, she has had to download GISAID’s sequences 10,000 at a time, with no access to most of the metadata unless she looks at each of the 10,000 sequences individually. …

But critics complain about GISAID’s constraints on access, chief among them its prohibition on resharing of its data. Its agreement for access to the direct data feed also requires applicants to use only GISAID data in their websites and tools, as well as only GISAID-approved strain names. (GISAID says allowing users to mix data on their websites “would duplicate data already in GISAID, resulting in bias and distorted results.”)….”
