Is Figshare Open? “it is not just about open or closed, it is about control”

I have been meaning to write on this theme for some time, and more generally on DigitalScience’s growing influence in parts of the academic infrastructure. This post is sparked by a twitter exchange (follow backwards from ) in the last few hours, which addresses the question of whether “Figshare is Open”.

This is not an easy question and I will try to be objective. First let me say – as I have said in public – that I have huge respect and admiration for how Mark Hahnel created Figshare while a PhD student. It’s a great idea and I am delighted – in the abstract – that it gained so much traction so rapidly.

Mark and I have discussed issues of Figshare on more than one occasion and he’s done me the honour of creating a “Peter Murray-Rust” slide ( ) where he addresses some (but not all) of my concerns about Figshare after its “acquisition” by Macmillan Digital Science (I use this term, although there are rumours of a demerger or merger). I use “acquisition” because I have no knowledge of the formal position of Figshare as a legal entity (I assume it *is* one? Figshare FAQs ) and that’s one of the questions to be addressed here.

From the FAQs:

figshare is an independent body that receives support from Digital Science. “Digital Science’s relationship with figshare represents the first of its kind in the company’s history: a community based, open science project that will retain its autonomy whilst receiving support from the division.”

However, Digital Science lists Figshare among “our products” and brands it as if it is a DigitalScience division or company. Figshare appears to have no corporate address other than Macmillan and I assume trades through them.

So this post has been catalysed by a tweet of a report from a DS employee(?) Dan Valen

John Hammersley @DrHammersley tweeted:
Such a key message: “APIs are essential (for #opendata and #openscience)” – Dan Valen of @figshare at #shakingitup15

This generated a twitter exchange about why APIs were/not essential. I shan’t explore that in detail, but my primary point is that:

If the only access to data is through a controlled API, then the data as a whole cannot be open, regardless of the openness of individual components.

There is no doubt that some traditional publishers see APIs as a way of enforcing control over the user community. Readers will remember that I had a robust discussion with Gemma Hirsh of Elsevier, who stated that I could not legally mine Elsevier’s data without going through their API. She was wrong, categorically wrong, but it was clear that she and Elsevier saw, and probably still see, APIs as a control mechanism. Note that Elsevier’s Mendeley never exposed their whole data – only an API.

An API is the software contract with a webserver offering a defined service. It is often accompanied by a legal contract for the user (with some reciprocity). The definition of that service is completely in the hands of the provider. The control of that service is entirely in the hands of the provider. This leads to the following technical possibilities:

  • control: The provider can decide what to offer, when, to whom, on what basis. They can vary this by date, geography or IP of user, and I have no doubt that many publishers do exactly this. In particular, there is no guarantee that the user is able to see the whole data and no guarantee that it is not modified in some way from the “original”. This is not, per se, reprehensible but it is a strong technical likelihood.
  • monitoring: (“snooping”) The provider can monitor all traffic coming in from IP addresses, dwell times, number of revisits, quite apart from any cached information. I believe that a smart webserver, when coupled to other data about individuals, can deduce who the user is, where they are calling from and, with the sale of information between companies, what they have been doing elsewhere.
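
To make these two possibilities concrete, here is a minimal sketch in Python of what a provider-side handler could look like. Everything in it is hypothetical and invented for illustration (the `Provider` and `Request` classes, the `blocked_countries` and `visible_from` fields); it does not describe Figshare's or any publisher's actual API.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Request:
    ip: str        # client IP address as seen by the server
    country: str   # hypothetical geolocation result for that IP
    when: date     # date of the request


@dataclass
class Provider:
    records: list  # the full dataset, never exposed directly
    access_log: list = field(default_factory=list)

    def handle(self, request: Request) -> list:
        # monitoring: every call is recorded, whatever the caller receives
        self.access_log.append((request.ip, request.country, request.when))
        # control: the provider alone decides which records this caller sees
        return [r for r in self.records if self._allowed(r, request)]

    def _allowed(self, record: dict, request: Request) -> bool:
        # hypothetical rules: hide by geography, or until a release date
        if request.country in record.get("blocked_countries", ()):
            return False
        return request.when >= record.get("visible_from", date.min)
```

A caller of `handle` only ever receives the filtered list. From the outside there is no way to tell whether it is the whole dataset, and no way to avoid leaving an entry in `access_log`.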

By default companies will do both of these. They could lead to increased revenue (e.g. Figshare could sell user data to other organizations) and increased lock-in of users. Because Figshare is one of several Digital Science products (DS words, not mine) they could know about a user’s publication record, their altmetric activity, what manuscripts they are writing, what they have submitted to the REF, what they are reading in their browser, etc. I am not asserting this is happening but I have no evidence it is not.

Mark says, in his slides,

“it is not just about open or closed, it is about control”

and I agree. But for me the question is who controls Figshare? and is Figshare controlling us?

Figshare appears to be one of the less transparent organizations I have encountered. I cannot find a corporate structure, and the company’s address is:

C/o Macmillan Publishers Limited, Brunel Road, Basingstoke, Hampshire, RG21 6XS

I can’t find a board of directors or any advisory or governing board. So in practice Figshare is legally responsible to no-one other than UK corporate law.

You may think I am being unfair to an excellent (and I agree it’s excellent) service. But history inexorably shows that these beginnings become closed, mutating into commercial control and confidentiality. Let’s say Mark moves on? Who runs Figshare then? Or Springer buys Digital Science? What contract has Mark signed with DS? Maybe it binds Figshare to being completely run by the purchaser?

I have additional concerns about the growing influence of DigitalScience products, especially those such as ReadCube which amplify the potential for “snoop and control” – I’ll leave those to another blogpost.

Mark has been good enough to answer some of my original concerns, so here are some others to which I think an “open” (“community-based”) organization should be able to provide answers.

  • who owns Figshare?
  • who runs Figshare?
  • Is there any governance process from outside Macmillan/DS? An advisory board?
  • How tightly bound is Figshare into Macmillan/DS? Could Figshare walk away tomorrow?
  • What could and what would happen to Figshare if Mark Hahnel left?
  • What could and what would happen to Figshare if either/both of Macmillan / DS were acquired?
  • Where are the company accounts for the last trading year?
  • how, in practice, is Figshare “a community based, open science project that will retain its autonomy whilst receiving support from the (DS) division.”?

I very much hope that the answers will allay any concerns I may have had.



In my own opinion there have been four main reasons for the exceedingly slow growth of OA (far, far slower than it could have been): (1) author inertia and needless copyright worries, (2) publisher resistance via lobbying and OA embargoes, (3) premature and needless fixation on Gold OA publishing and (4) premature and needless fixation on Libre OA (re-use rights, CC-BY).

By far the most urgent and yet fully and immediately reachable objective has always been free online access to refereed journal articles (“Gratis OA”), which could long ago have been provided by authors as Green OA (exactly as computer scientists spontaneously began doing in the 1980s with anonymous ftp archiving, and physicists began doing in the 1990s with XXX (then Arxiv)).

Instead, authors in most other fields have proved extremely sluggish (because of (1), and eventually also (2)), and the public campaign for OA became needlessly and counterproductively focussed on Gold OA and Libre OA, which were neither as urgently needed as Gratis OA, nor could they be as easily provided as Gratis OA.

OA mandates by funders and institutions then began to be recommended and adopted, but these too have been exceedingly slow in coming, and needlessly weak, having gotten needlessly wrapped up in Libre and Gold OA, even though Gratis Green OA is the easiest, most effective and most natural thing to mandate.

And the irony is that this premature and needless fixation on Libre and Gold OA (which still persists) has not only helped slow the progress of Gratis Green OA, but it has also slowed its very own progress.

Because the fastest and surest way to Libre, Fair-Gold OA is to first mandate Gratis Green OA — which, once it is being universally provided, will usher in Libre, Fair-Gold quickly and naturally. This is evident to anyone who simply thinks it through.

Instead, we now continue to be bogged down in (1) – (4), with many weak and wishy-washy OA policies, Fools’ Gold (as well as predatory junk Gold OA) (3) from publishers clouding the landscape, and an almost superstitious obsession with a Libre OA (4) that most research and researchers don’t need anywhere near as urgently as they need Gratis OA itself.

Meanwhile, hardly noticed, is the fact that mandates could be incomparably stronger and more effective if they simply focussed on requiring Green Gratis OA, in institutional (not institution-external) repositories, where institutions can monitor and ensure compliance by designating immediate-deposit as the sole mechanism for submitting publications for research evaluation (as Liege and HEFCE have done) and implementing the copy-request Button as the antidote against publisher OA embargoes.

In yet another effort to try to get mandates on the fast track (requiring Gratis Green OA) we have now analyzed the few existing OA policies’ effectiveness to identify which conditions maximize compliance, in the hope that the research community can at last be persuaded to adopt evidence-based policies instead of ideology-driven ones:

Vincent-Lamarre, Philippe, Boivin, Jade, Gargouri, Yassine, Larivière, Vincent and Harnad, Stevan (2015) Estimating Open Access Mandate Effectiveness: I. The MELIBEA Score.

Swan, Alma; Gargouri, Yassine; Hunt, Megan; & Harnad, Stevan (2015) Open Access Policy: Numbers, Analysis, Effectiveness. Pasteur4OA Workpackage 3 Report.

Here is a quick little history of OA, particularly highlighting Southampton’s contribution:

Carr, L., Swan, A. and Harnad, S. (2011) Creating and Curating the Cognitive Commons: Southampton’s Contribution. In: Curating the European University

