Nature’s recent “news” article on Text and Data Mining was unacceptable [redacted]; I ask them to renounce licensing.

[See Update 2014-02-10 at end]

I have sent the following letter to Philip Campbell, Editor of Nature:

Dear Philip,

I am writing to you to protest against your biased reporting of Text and Data Mining in Nature News (part of Nature Publishing Group (NPG)) [1]. This article, which purports to be news, is effectively an attempt by the Toll-Access Scientific, Technical & Medical Publishers (TA-STM) industry to promote publisher licences as a benefit to science. It is in the same category of market-led misinformation as Science Magazine’s analysis of flawed Open Access.

Here is the true story which I ask you to publish to redress the balance.

For years Nature and other TA-STM publishers have consistently fought to prevent Text and Data Mining (TDM), solely for their own financial benefit. As an example, in 2006 NPG promoted “the Open Text Mining Interface”, which was designed to appear useful but actually jumbled the sentences (“while preserving any subscription model that funds the journals”) [2].

During the last few years publishers, including yourselves, have imposed draconian conditions restricting crawling and reuse far beyond copyright law. These effectively prevent legal TDM for science and this has killed open activity. Scientists doing TDM hide their activities for fear of being prosecuted or cut off. For example, Max Haeussler (cited in your “news”) spent two years trying to get permission from Elsevier [3] to mine for biological sequences. It is no surprise that he can be persuaded to give a positive comment now that he can “click through”. Heather Piwowar “negotiated” for months with Elsevier, who sent several executives to negotiate. I quote her (with permission): “I hate negotiating with publishers – the stress gives me hives [Urticaria]”.

Last year the European Commission attempted to pave the way for responsible TDM (“data analytics”) by bringing the publishing community together with librarians and scientists and open groups (Ross Mounce and I represented the Open Knowledge Foundation). “Licences 4 Europe” was a series of meetings in Brussels. To summarise, the TA-STM publishers were not prepared to cooperate effectively [4] and halfway into the proceedings most of the committee wrote:

“We write to express our serious and deep-felt concerns in regards to Working Group 4 on text and data mining (TDM). Despite the title, it appears the research and technology communities have been presented not with a stakeholder dialogue, but a process with an already predetermined outcome – namely that additional licensing is the only solution to the problems being faced by those wishing to undertake TDM of content to which they already have lawful access.”


The signatories came from about 40 highly responsible European scientific and scholarly organizations [4], including the Association of European Research Libraries (LIBER), UUK, the Royal Society, SURF, the Hungarian Academy of Sciences, JISC, SPARC, Research Libraries UK and the Austrian Science Fund, and included experts on policy and intellectual property law. Despite this clear and compelling request the TA-STM publishers held their position, and the signatories later withdrew from negotiations.

This failure of cooperation was later noted by Mme Neelie Kroes, European Commissioner for Digital Agenda and Vice-President of the EC [5]:

“And, for me, the Text and Data mining Group has also shown something very clear. We need to find better ways to cope with immense data flows. They affect so many aspects of our daily lives and professional work. As the European Council put it, big data drives innovation, improves productivity, means better quality services. And scientists in particular can use these data flows for research, even for life-saving discoveries. They need every possibility to do that.

I understand the proposed initiative here by publishers is not supported by the users. And this cannot be seen as any kind of solution without agreement from that very important group of stakeholders. Now we need to seriously consider possible legislative exceptions.”

The TA-STM publishers, NPG included, have ignored Mme Kroes. The industry continues to promote licences. Elsevier’s recent announcement is not news (save for the click-through) and although previous Elsevier contracts are often secret, I suspect the click-through forfeits even more rights than before. By your complete lack of balance in failing to report any of the Licences4Europe dissension and choosing proponents who can be expected to see click-through as an advance, you are effectively marketing the licence solution under the guise of news.

My primary concern is the unacceptability of NPG using its “news organ” for self-interested promotion of the licence solution. However, I have also analyzed Elsevier’s “click-through” licence in some detail and found it directly contrary to the requirements of TDM. It is badly written and designed to stop any large-scale TDM. In my blog [6] (and several previous posts) I show that the licence legally prevents me from doing chemical TDM, as it would disadvantage Elsevier’s commercial offerings in this area. I could easily end up in court. So, I suspect, could the enthusiasts from whom you got quotes – their outputs, if done responsibly, could compete with Elsevier products. My analysis is backed by Professor Charles Oppenheim, an expert in scholarly publishing.

There will be a strong incentive for other TA-STM publishers, including, I suspect, NPG, to follow the Elsevier route. This will result either in a plethora of per-publisher click-through licences or in a single, probably highly restrictive, Elsevier-like licence available through a publisher-supported gateway.

At present, therefore, I am finding it hard to continue to have confidence in NPG as a responsible organization in Science evaluation and communication. This is a great pity as I have previously worked productively with you and your colleagues. Richard van Noorden had asked if he could do a story about our new initiatives in TDM (to be announced later this month) – I can’t now regard this as impartial.

I would ask you to do the following:

  • Publicly renounce the use of licences to control TDM and agree that “The right to read is the right to mine”. The Royal Society (a publisher) takes this position, so surely NPG could.
  • Commission a balanced account of the Licences4Europe story from a disinterested expert and publish it in Nature.

 

In two months the UK Parliament is expected to table and pass the Hargreaves recommendations for TDM, after which we will be able to carry it out legally in the UK. Since my institution subscribes to a large number of NPG journals, which I have the right to read, I expect to start mining them in the near future, without further negotiations and without your further permission.

 

This letter will appear on my blog. I would consider it appropriate for Nature Correspondence and I request you to publish it.

Peter

[1] http://www.nature.com/news/elsevier-opens-its-papers-to-text-mining-1.14659

[2] http://blogs.nature.com/nascent/2006/04/open_text_mining_interface_1.html and http://hublog.hubmed.org/archives/001345.html

[3] My submission to the UK Government IPO http://blogs.ch.cam.ac.uk/pmr/2012/03/21/my-response-to-hargreaves-on-copyright-reform-i-request-the-removal-of-contractual-restrictions-and-independent-oversight/

[4] http://www.libereurope.eu/news/licences-for-europe-a-stakeholder-dialogue-text-and-data-mining-for-scientific-research-purpose

[5] http://commentneelie.eu/speech.php?sp=SPEECH/13/917

[6] http://blogs.ch.cam.ac.uk/pmr/2014/02/07/contentmining-and-elseviers-terms-the-small-print-absolutely-prevents-responsible-science/

 

 

  • Update.
  • Richard van Noorden has tweeted that he is the sole author of the article. I accept his assertion and have removed the implication that he was involved in a marketing exercise. My other concerns about the unacceptability of a news article promoting NPG’s position remain.