Cameron Neylon at Semantic Physical Science; Software philosophy, why the RWA is wrong, and how we change the publishing market


Cameron Neylon gave one of the first talks at Semantic Physical Science. No slides, just analysis and passion.

Cameron, like me, knows that Semantic science depends on people as well as technology and so he dwelt a lot on culture and practices rather than details. The talk was split into two parts. I have timelined this (roughly, there are no subtitles/breaks).

Here is the link to the video:

The first section makes the case that good modern science must adopt quality control of the sort regularly practised in good software groups (common in industry, but not academia). Cameron is launching a new journal (Open Research Computing) to allow this approach to be published and thereby to give its practitioners formal value. Too much science has poor controls and poor aims and is not reusable; Cameron argues that we can learn better practice from software engineering.

0:30 unit testing, criteria for judging science

1:30 continuous integration and test-driven design

2:00 good software practice helps to think about managing scientific process; better architectures for science in a broader term.

2:35 Creation of trained workforce, balance between training and research especially for graduates

3:30 good software is good mentality for good infrastructure for research

4:15 mustn’t create resources which aren’t used by anyone

4:30 computational experiments often don’t work

5:20 a unit test is just a control

5:50 publication and continuous integration

7:00 software can push quality into science

7:30 must be serious about replicating experiments

7:50 impact factor correlates with retractions

8:20 incentive not properly structured

9:00 new journal Open Research Computing

10:00 software methods papers

10:50 if 5% of papers are software get high impact
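To make the "a unit test is just a control" point (5:20) concrete, here is a minimal sketch. It is not from the talk: the analysis function `estimate_slope` and the test class `AnalysisControls` are hypothetical names invented for illustration. The idea is that a known-answer input plays the role of a positive control and a flat signal plays the role of a negative control.

```python
import unittest


def estimate_slope(xs, ys):
    """Least-squares slope of ys against xs (a hypothetical analysis step)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


class AnalysisControls(unittest.TestCase):
    def test_positive_control(self):
        # Known answer: data generated from y = 2x + 1 must give slope 2.
        xs = [0.0, 1.0, 2.0, 3.0]
        ys = [2 * x + 1 for x in xs]
        self.assertAlmostEqual(estimate_slope(xs, ys), 2.0)

    def test_negative_control(self):
        # Flat signal: the analysis must not report a trend.
        xs = [0.0, 1.0, 2.0, 3.0]
        ys = [5.0, 5.0, 5.0, 5.0]
        self.assertAlmostEqual(estimate_slope(xs, ys), 0.0)
```

Saved in a file, this can be run with `python -m unittest`; a continuous integration service would simply run the same command on every change, so the controls are repeated each time the analysis code is touched.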

The second half critiques the Research Works Act. Cameron has shown that publishers add NO VALUE to his last 10 papers. As his blog (above) says:

Results: The contribution of IP by publishers to the final submitted versions of these ten papers, after peer review had been completed, was zero. Zip. Nada. Zilch. Not one single word, line, or graphical element was contributed by the publisher or the editor acting as their agent. A small number of single words, or forms of expression, were found that were contributed by external peer reviewers. However as these peer reviewers do not sign over copyright to the publisher and are not paid this contribution cannot be considered work for hire and any copyright resides with the original reviewers.


11:25 Research Works Act would wipe out NIH mandate and set us back years

12:00 Argument is that publishers contribute to quality of science

12:30 what do publishers contribute to Cameron?

12:50 ZERO (Zip Nada Zilch)

13:30 Publishers do provide a service

13:50 publishing not zero-cost

14:10 Old paper model is obsolete; all costs in generating first copy

14:50 publishing model is really bad way to run business

15:10 must address service costs

15:40 Transitional period required; RWA turns clock back

16:15 how to build service market

16:45 Must turn publishing round and address service

17:10 Software is a service. How do we configure the market?

18:30 Software gives us experience and clues

Research Works Act: Who Needs Open Access To What? And Why?

Sandy Thatcher [ST] asked (off-line):

ST: “Can you give me any numbers for scientists within the U.S. who do not have access to the professional literature they need through either their institutional affiliations or services like DeepDyve?”

I can give you the following pieces of indirect evidence:

(1) The (now-out-of-date) ARL stats on institutional serials holdings. The estimate is that the articles in the serials that the institution cannot afford to subscribe to or license are inaccessible to the users at that institution.

(2) The data on the OA advantage in downloads and citations (indicators of what is being lost if access is restricted to subscribers only).

More direct evidence can only come from polling researchers or monitoring their web activity automatically, to see how many times they click on articles and are stopped by a pay-wall (including a DeepDyve pay-wall).

ST: “(I do not believe it is the responsibility of the U.S. government to provide research to everyone in the world.)”

Perhaps not (though US researchers do not conduct research only intended to be used, applied and built upon by US researchers; and lost research uptake and progress is a loss for all researchers and research); but the ARL stats show that no US institution can afford access to all or most journals, and many can afford only a small fragment.

That’s all lost research impact and progress.

ST: “How much of the literature they need is not already being provided through Green OA repositories?”

At least 80% is not provided — except if deposit is mandated, in which case less than 20% is not provided.

But mandates are what this is all about…

ST: “Do you have an answer to the posting made by Danny Jones yesterday?”

I wasn’t going to reply to that posting, which essentially says “mandated OA deposit of articles is too much of a burden: just make progress reports and final reports OA instead” — but since you ask:

From: Danny Jones [DJ] on Liblicense-l
Date: Tue, 17 Jan 2012 21:46:43 -0600

DJ: “I am very interested in seeing (specifically in my case) NIH-funded final reports made publicly available.”

Fine, and welcome. But, as I said, no substitute for access to refereed journal articles, for researchers.

DJ: “I recommend going a step further to require the annual progress reports to be also made publicly available along with data collected with federally funded grants.”

Both welcome (though making the data public raises some sticky issues about the researcher’s right to mine his own data first).

But, as I said, no substitute for access to refereed journal articles, for researchers.

DJ: “Before I retired on January 6, 2012 as director of the library at Texas Biomedical Research Institute, I was responsible for monitoring compliance with the NIH Public Access Policy by our mostly NIH-funded investigators. TxBiomed scientists are generally supportive of the policy, but it isn’t always easy to be in compliance [with the NIH Public Access Policy] for a variety of reasons.”

Indeed, because the policy is non-optimal, for a number of reasons. See: “Public Access to Federally Funded Research (Harnad Response to US OSTP RFI)”

The main bugs are (1) central deposit instead of institutional deposit (and central harvesting) and (2) the publisher deposit option.

DJ: “And complying represents an added regulatory burden for investigators who often have moved on to other investigations when an article finally gets published.”

The keystrokes to publish-or-perish are the burden. The few extra keystrokes to deposit the final draft are a piece of cake — it just has to be made part of the author’s routine work-flow. Incomparably tinier than doing the progress reports or final report.

DJ: “NIH grants may require several years of work before a final report is submitted, and during this time investigators may publish articles reporting results of their funded investigations, which results will also be included in their annual progress reports.”

More important, the articles will be accessible to subscribers as soon as they are published. OA is about making sure they are accessible to nonsubscribers too.

DJ: “Waiting for final reports to be submitted to NIH may actually delay access to NIH-funded research results,”

Not for those who have access to the published articles.

And, as I said, neither progress reports nor final reports can substitute for access to refereed journal articles, for researchers.

DJ: “As these reports are required by NIH already, it does not represent an added burden to investigators (they are already doing it), and the burden rests directly where it should be, with the funded investigator.”

If depositing a report is no added burden, depositing an article isn’t either. (It just needs to be mandated, like publish-or-perish, progress reports, and final reports.)

DJ: “with the NIH Public Access Policy, final approval of manuscripts deposited into the NIH Manuscript Submission System is the responsibility of the corresponding author, who is not necessarily the NIH-funded author.”

All co-authors see final drafts of their articles: The fundees should deposit that.

DJ: “The NIH Public Access Policy should be repealed in my opinion. It is an unnecessary added burden for NIH-funded authors and compliance is not as simple as some suggest it is.”

As great a burden as publish-or-perish, progress reports, or final reports? (Should those be repealed too?)

DJ: “And the punitive nature in which investigators are required to comply by threat of consideration against future funding from NIH does not result in great enthusiasm for government regulations.”

Why is this “punitive” with article deposit and not punitive with publish-or-perish, progress reports, or final reports?

DJ: “The progress reports and the final reports are already part of the established responsibility of NIH-funded investigators, and making them publicly available will provide the public with full information about the research that the government is paying for.”

If the “extra burden” argument had been valid, neither publish-or-perish, progress reports, nor final reports would have been part of researchers’ established work-flow.

And, no, progress reports and final reports are not what the government is paying for: the refereed research articles are.

And there is no longer any reason whatsoever in the online era for restricting access to refereed research only to users at institutions that can afford to subscribe to the journal in which they are published.

That’s not what research is funded for.

DJ: “While this approach does not address the contents of published journal articles, having access to the investigators’ reports of federally funded research may in fact eliminate the need for access to journal articles that acknowledge federally funded research grants.”

Substitute grant final reports for refereed research articles?

DJ: “Finally, not only should the reports be publicly available, but all data generated as a result of federal funding should also be publicly available.”

Easier said than done (because of the first-exploitation rights problem).

But wouldn’t it be too much of a burden for the poor researcher…? ;>)

Stevan Harnad

BibSoup! A new OPEN approach to managing personal and group bibliographies

We have just finished 3 hectic days of “sprint” (design, coding, documentation, testing, deployment) on the JISC/OKF Openbiblio2 project in Cambridge. This is an Open international project, with major input from Jim Pitman in Berkeley, and offered to anyone interested in collaborating and benefitting. Before the word “bibliography” makes you switch off, don’t! EVERYONE needs bibliography. Here are some examples which show how universal bibliography is:

  • Your publication list
  • Your reading list
  • A list of your software and the software you use
  • A list of your datasets and the datasets you use
  • A catalogue of the books you possess

The overall concept is “BibSoup” – a novel approach (some would call it Web 3.0) based on complete Openness of code, content and most importantly attitude. It’s based on meritocracy rather than central control. YOU control your own bibliography – a pot of BibSoup. It can be as perfect or imperfect as you like – BibSoup doesn’t mind. You don’t have to have all the information for a book (other people do that). You don’t have to have the author’s full name. If you don’t understand the difference between works, manifestations and expressions don’t worry.

The basis of BibSoup is that you build your own bibliography for your own purposes using the BibSoup technology and software. You don’t have to understand it – it’s easy to use. It consists of a server (Bibserver) which is easy to clone and deploy to hold your data. Bibserver uses JSON (“Jason”) as a transfer format (BibJSON).
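As a rough illustration of what "JSON as a transfer format" means in practice, the sketch below builds one bibliographic record as a plain Python dictionary and round-trips it through JSON. The field names (`type`, `title`, `author`, `publisher`, `year`) and the wrapping `records` key are assumptions chosen for illustration, not a statement of the BibJSON specification. The record is deliberately incomplete, in the BibSoup spirit that a record can be as perfect or imperfect as you like.

```python
import json

# One record; the field names are illustrative assumptions, not the BibJSON spec.
record = {
    "type": "book",
    "title": "Reinventing Discovery: The New Era of Networked Science",
    "author": [{"name": "Michael Nielsen"}],
    "publisher": "Princeton University Press",
    "year": "2011",
}


def to_transfer_format(records):
    """Serialise a list of record dictionaries to a JSON string."""
    return json.dumps({"records": records}, indent=2, sort_keys=True)


def from_transfer_format(text):
    """Recover the list of record dictionaries from a JSON string."""
    return json.loads(text)["records"]


payload = to_transfer_format([record])
restored = from_transfer_format(payload)
```

Because the payload is ordinary JSON text, any language with a JSON library can read or write the same collection.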

You could run Bibserver on your own laptop for your own purposes (e.g. browsing all those articles).

You could run Bibserver on your website to tell the world about your Open collection and to share it with others. This is the most novel feature of BibSoup. By sharing your collection you’ll find people who are also interested in the same things. Maybe you’ll find that your annotations are valuable to others and vice versa. Maybe you’ll want to set up a group where you pool your references. But none of this is mandatory.


As Mark MacGillivray puts it

we are not trying to re-do what is already available online, we are not getting into the detail of normalisation or disambiguation within a centralised database, and we are not intending to alter the academic culture overnight; however, we are going to improve the BibJSON facility for wider use, we are trying to determine how we can get more small groups and individuals involved, and we are identifying compelling, essential and simple reasons for people to support the project at this early stage before the ultimate global benefits can be realised.

To reiterate. We are NOT compiling the one true bibliographic collection and competing with Open Library, Mendeley, Microsoft Academic Search, Google Scholar, Symplectic and other semi-open/Free collections of bibliography. We are NOT competing with Zotero as a reference manager. We, and our adopters, will use these as valuable sources of bibliographic input. We ARE praising the virtues of completely open bibliographic collections such as the British National Bibliography. Our adopters may wish to use BibSoup as a way of cleaning up some collections of bibliography (references).

We believe that a completely Open ecology of bibliography will lead to communal contributions which rapidly enhance BibSoup (because Open projects belong to YOU). We want to encourage creations of bibliography (e.g. publication and reading lists) in a rapid and Open manner.

We’ve set up a series of resources:

Some bibliographies (just open them and browse – the “visualise” is fun if there are some high frequency components (author, journal) – facets in top-left corner).

And some videos (apologies for some truncation and quality – we are getting a better site soon)

Bibserver is EASY to use. Just log in and upload your collection (Don’t try millions of records – suggest you contact us if you have more than 10,000). No software to install (although it’s all open and you can run it privately). Browse your collection anywhere on the web with any browser. And we are committed to making all these collections Easily Openly downloadable – there are no walled gardens in BibSoup.

We can ingest BibTeX, RIS and some other common formats at present. Parsers are easy to write, so join the project if yours isn’t there. BibJSON is as easy or complex as you want to make it. You can use the Open Openbiblio Bibserver or you can clone the code and run your own privately.
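To give a feel for what a small parser involves, here is a minimal sketch that turns simple BibTeX entries into JSON-ready dictionaries. It is not the Bibserver parser: the function name `parse_bibtex`, the output field names and the sample entry are assumptions made for illustration, and it only handles flat `field = {value}` or `field = "value"` pairs (no nested braces, string macros or comments).

```python
import re

# Matches "@type{key, ...fields... \n}" for simple, well-formed entries.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches field = {value} or field = "value" (no nested braces).
FIELD_RE = re.compile(r"(\w+)\s*=\s*(?:\{([^{}]*)\}|\"([^\"]*)\")")


def parse_bibtex(text):
    """Return a list of dictionaries, one per BibTeX entry found in text."""
    records = []
    for entry_type, key, body in ENTRY_RE.findall(text):
        record = {"type": entry_type.lower(), "id": key}
        for name, braced, quoted in FIELD_RE.findall(body):
            value = (braced or quoted).strip()
            name = name.lower()
            if name == "author":
                # BibTeX separates authors with the word "and".
                record["author"] = [
                    {"name": part.strip()} for part in value.split(" and ")
                ]
            else:
                record[name] = value
        records.append(record)
    return records


# Illustrative sample entry, typed by hand for this sketch.
SAMPLE = """@article{sebastiani2010,
  title = {Genetic Signatures of Exceptional Longevity in Humans},
  author = {Sebastiani, P and Solovieff, N and Puca, A and Hartley, SW and Melista, E and others},
  journal = {Science},
  year = {2010},
  doi = {10.1126/science.1190532}
}"""

records = parse_bibtex(SAMPLE)
```

A real-world parser needs to cope with much messier input, which is where community contributions of parsers for other formats come in.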

PLoS ONE News and Blog Round-Up

This month in PLoS ONE:  Internet addiction, the world’s smallest vertebrate, zombie bees and more!

Chinese researchers scanned the brains of 17 young individuals with clinical internet addiction disorder (IAD) and found that these web addicts had diminished brain volume in certain areas, most notably white matter.  These brain changes are similar to those seen in people hooked on drugs such as heroin or alcohol. ABC News, BBC News, and Forbes covered this article.

One team of scientists working in New Guinea believes it has discovered the world’s smallest vertebrate, with an average body size of 7.7 mm.  These frogs, scientifically named Paedophryne amauensis, live in the moist leaf litter on floors of tropical wet-forests, and two of them can fit comfortably on your thumbnail or a dime.  This article was covered by FOX News, CNN, and Scientific American.

“Zombie” bees in the San Francisco bay area have been leaving their hives, walking around in circles with no apparent sense of direction, and collapsing dead to the ground.  These symptoms imitate colony collapse disorder (CCD), where honey bees inexplicably disappear from their colony.  For several years, the US honey bee population has been declining, and researchers from San Francisco State University found that a parasitic fly, Apocephalus borealis, may be responsible for CCD in Northern California.  The fly is a known parasite in bumble bees but the scientists used genetic analysis to confirm the parasite in the honey bees and bumble bees was the same species.  This article was covered by NPR, Nature, and USA TODAY. The image above is courtesy of Christopher Quock and can be found in the manuscript.

A new study finds that men and women have very different personality traits using personality measurements from more than 10,000 people, approximately half men and half women.  The authors of the article believe that the extent of sex differences in human personality has been underestimated because most previous researchers have focused on one trait at a time and because they failed to correct for measurement error.  MSNBC, Times of India, and FOX News covered this article.

Why do dung beetles dance?  Scientists reveal that dances are elicited when the dung beetles lose control of their ball or lose contact with it altogether.  However, for the most part, the beetles manage to roll their ball in a near perfect straight line using polarized light.  This article was covered by Scientific American, National Geographic, and Live Science.

For more in-depth coverage on news and blog articles about PLoS ONE papers, please visit our Media Tracking Project.

Open Update

I am snowed under with things that I have to do and want to do. I therefore cannot give as much attention to others, so I will comment briefly on them below

My main focus now is on these areas (all of which require several blog posts)

So some others which I would normally devote whole posts to:

  • Congratulations to Alma Swan on her appointment as Director of Open Advocacy. I have had the privilege to work with Alma for several years and can testify to her single-minded commitment to Open Access. She has made a major contribution in adding unchallengeable metrics that show that Open Access increases the value of scholarship through, for example, increased citations. I quote Alma: “I’m delighted to be taking on this new role,” said Swan, “Policymakers are increasingly interested in hearing the arguments. Presenting the evidence-based case to them will help to bring about the policy developments we all want to see.”

    Alma also created a very beautiful calligraphic calendar of Open Access – something that still brings inspiration when I look at it.

  • Springer announces change for its author-paid hybrid Open Access from CC-NC to CC-BY. I am VERY pleased by this and congratulate Springer. (I knew about this earlier when I wrote to Springer and agreed to abide by the embargo). This brings Springer’s hybrid model (Open Choice) into line with its Open Access offerings (all of which use CC-BY). Very few publishers use CC-BY, but Springer is showing the lead from the major publishers and there is every reason why the others should follow.


    CC-BY allows text-and-data mining and represents the full value that one can get from OA.


  • Richard Poynder for his frequent and very objective blogging of the current state of Open Access – see with a very comprehensive daily list of those who have distanced themselves from the Research Works Act, H.R.3699

Expect a daily post on Semantic Physical Science (may depend on my weekly Vimeo quota) although Charlotte is also mounting them at , the excellent Cambridge streaming video site. And expect slightly lower frequency for Open Biblio, and Panton discussions.


Nature Publishing Group – supports scholarship, not Research Works Act, SOPA or PIPA!

Awesome news from Nature Publishing Group – NPG does not support the anti-open access Research Works Act, SOPA or PIPA.

Among the traditional scholarly publishers, NPG has been an early leader in supporting open access – and standing up for scholarship against the inappropriate tactics of anti-openaccess lobbyists.  In 2007, it was Jim Giles’ article in Nature that exposed the hiring of PR pitbull Eric Dezenhall and his bizarre strategies such as linking open access with government censorship, and NPG was among the first to disavow support for the ludicrous, quickly doomed PRISM anti-OA coalition attempt.

NPG has also been an early leader in supporting NPG authors’ desires for open access, such as actively encouraging author self-archiving and being among the first to begin to compete in the open access environment. Following is a list of links to previous posts about NPG on The Imaginary Journal of Poetic Economics. Kudos and thanks to NPG for being a stellar example of how a long-time traditional publisher can approach the process of transitioning to open access.

Opposition to open access continues, while anti-OA coalition attempt implodes
We all owe a debt of thanks to Nature and Jim Giles (and to those who leaked the documents) for releasing the story on the American Association of Publishers’ hiring of PR pitbull Eric Dezenhall, who recommended bizarre strategies such as linking open access with government censorship and junk science, strategies which have been reflected in OA opposition efforts, including PRISM. The latest on this can be found on Open Access News.

Nature Publishing Group and Scientific Reports: getting serious about OA competition
Kudos: Nature self-archiving on behalf of authors
NEJM and Nature evolving toward open access
Thanks to NPG's Grace Baynes for the links to NPG statements on the Research Works Act, SOPA, and PIPA.

Genetic Signatures of Exceptional Longevity Revisited

Today we published a paper titled “Genetic Signatures of Exceptional Longevity in Humans,” by lead researchers Paola Sebastiani and Thomas Perls of Boston University, which identifies genetic variants associated with exceptional longevity.

This paper is based on work originally reported in the journal Science in July 2010. The authors voluntarily retracted the Science paper in July 2011 due to various technical concerns, as detailed in the retraction notice:

After online publication of our report ‘Genetic Signatures of Exceptional Longevity in Humans’ (1) we discovered that technical errors in the Illumina 610 array and an inadequate quality control protocol introduced false positive single nucleotide polymorphisms (SNPs) in our findings. An independent laboratory subsequently performed stringent quality control measures, ambiguous SNPs were then removed, and resultant genotype data were validated using an independent platform. We then reanalyzed the reduced data set using the same methodology as in the published paper. We feel the main scientific findings remain supported by the available data: (i) A model consisting of multiple specific SNPs accurately differentiates between centenarians and controls; (ii) genetic profiles cluster into specific signatures; and (iii) signatures are associated with ages of onset of specific age-related diseases and subjects with the oldest ages. However, the specific details of the new analysis change substantially from those originally published online to the point of becoming a new report. Therefore, we retract the original manuscript and will pursue alternative publication of the new findings.

The paper published today is the corrected and peer reviewed version of their findings, with additional authors who independently validated the data and methodology, as well as an additional sample of centenarians used for replication purposes. As stated in the retraction notice, the primary findings remain the same, but the SNPs incorrectly identified in the original study have been removed from the model for predicting longevity.

While we recognize that aspects of this study will attract attention owing to the history and the strong claims made in the paper, the handling editor, Greg Gibson, made the decision that publication is warranted, balancing the extensive peer review and the spirit of PLoS ONE to allow important new results and approaches to be available to the scientific community so long as scientific standards have been met.  We trust that publication will facilitate full evaluation of the study.

1. Sebastiani P, Solovieff N, Puca A, Hartley SW, Melista E, et al. Genetic Signatures of Exceptional Longevity in Humans. Science 10.1126/science.1190532 (2010).

Congratulations Michael Nielsen, SPARC innovator

I am delighted to congratulate Michael on his award as a SPARC innovator: Here are some bits of the citation (below) but my personal comments first.

Michael is a true 21st century scientist. There aren’t yet many around, and academia stifles their growth. They change the rules of the organization, the community, the values. So, almost by definition, they have to work outside the system. Michael was/is a quantum physicist with an enviable track record but he has given this up to explore the 21C. We met 2-3 times last year and he’s currently (I think) doing lecture tours having written his seminal bestseller Reinventing Discovery. If you get a chance, try to get to a lecture, though you can also see the TED lecture.

Here’s SPARC…

Michael Nielsen

Michael Nielsen, a 37-year-old Australian quantum physicist, just completed a 17-city tour in seven countries, doing a series of presentations to promote the open sharing of data and research to advance science. On top of that, he spent a month traveling to promote his book, Reinventing Discovery: The New Era of Networked Science (Princeton University Press, 2011). His talk of changing the culture of science has drawn audiences beyond typical academics. Nielsen’s passion, credibility as a scientist, and knack for storytelling have helped propel the issue of Open Science into the mainstream.

For being a thought leader and demonstrating how doing science in the open can promote change and bringing the discussion to a new level, SPARC honors Nielsen as the January 2012 SPARC Innovator.