More on Potential Conflict of Interest with Open Data (OD) Mandates

SH: Benjamin Geer suggests [requiring OD] immediately upon publication (presumably the publication of a refereed journal article based on the data in question). But the first of the [data-] collector’s articles based on that collection or the last? How many are allowed with exclusivity? and how long?… What if [the data-collector has] gathered a lot of time-consuming data, amenable to a lot of time-consuming analysis?

BG: What if they’ve gathered enough data for a lifetime of analysis? Should they have the right to hoard their data for the rest of their life? Where do you draw the line? Does it make any difference, ethically, whether they collected that data using public funds?

It’s not for me (or anyone) to draw the line uniformly, a priori. The length of time researchers may need to embargo access to the data they have gathered depends on the field and the data, and hence OD needs to be negotiated with the funder, possibly on a case-by-case basis.

This is notably not the case with OA to published research, in which, without exception, research, researchers, their funders and their institutions all benefit most from OA being provided immediately upon acceptance for publication (and the only conflict of interest is with a 3rd-party service-provider: the publisher).

Benjamin Geer proposes, simply, that research data should be made OD immediately upon publication. I am pointing out the genuine complications that this is failing to take into account. I am not at all suggesting that OD, as soon as possible, is not a good and desirable thing. It is simply far from being as straightforward as OA, especially insofar as mandating (i.e., requiring) is concerned, because there is no conflict with the researcher’s interest in the case of OA, whereas there may well be considerable conflict with the researcher’s interest in the case of OD. And it is all about timing.

As a consequence, it is very important to keep OA and OD separate, especially as regards mandates. Because of the conflict of interest, this is not a matter to be settled by a priori ideology or edict, but by realism, fairness and pragmatics.

(By way of an indication that I am fully cognizant of (and opposed to) authors sitting unnecessarily long on their database, there was in my own field a case in which a team of researchers had been funded to collect data worldwide for a global color perception database. There was considerable controversy and consternation in the field after the data-gathering because of delays in publication and release. Many researchers in the field felt that the delays in both had slowed rather than advanced research progress. Here was a case where an advance negotiation between the funders and the researchers on the permissible length of the access embargo would have been helpful, would probably have speeded the research, and would probably have resulted in greater research progress. But the punchline from such cases is certainly not that for all data the embargo should therefore be of length zero, either between date of collection and date of publication or between date of publication and date of data-release as OD. The punchline is that OD parameters need to be negotiated in advance, on a case-by-case basis, with an emphasis on publication as well as release as soon as fair and practicable. There is nothing like this with OA.)

In summary, unlike the case of open access to refereed research articles, the case of open access to data, like the case of open access to books, is not an open and shut one. OD mandates are desirable, and justifiable, but their parameters will have to be negotiated field by field, case by case. And the terrain will be much better prepared for the more complicated case of mandating OD once we have successfully reached the simpler (and more urgent) goal of universally mandating OA.

Stevan Harnad
American Scientist Open Access Forum