Revealing Dialogue on “CHORUS” with David Wojick, OSTI Consultant

1.0 Tripping Point: Delayed Access is not Open Access;
“Chorus” is a Trojan Horse

Note: David Wojick works part time as the Senior Consultant for Innovation at OSTI, the Office of Scientific and Technical Information, in the Office of Science of the US Department of Energy. He has a PhD in logic and philosophy of science, an MA in mathematical logic, and a BS in civil engineering. In the exchanges below, he sounds [to me] very much like a publishing interest lobbyist, but judge for yourself. He also turns out to have a rather curious [and to me surprising] history in environmental matters?

1.1 On Sat, Jul 20, 2013 at 3:56 PM, David Wojick wrote:

WOJICK: “The US Government is developing a green OA system for all articles based even in part on Federal funding, with a default embargo period of 12 months. The publishers have responded with a proposal called CHORUS that meets that requirement by taking users to the publisher’s website. Many of the journals involved presently have no OA aspect so this will significantly increase the percentage of OA articles when it is implemented over the next few years.”

Let us fervently hope that the US Government/OSTP will not be taken in by this publisher Trojan Horse called “CHORUS.” It is a tripping point, not a tipping point.

If it is, we can all tip our hats goodbye to Open Access, which means free online access immediately upon publication, not access after a one-year embargo.

CHORUS is just the latest successor organisation for self-serving anti-Open Access (OA) lobbying by the publishing industry. Previous incarnations have been the “PRISM coalition” and the “Research Works Act.”

1. It is by now evident to everyone that OA is inevitable, because it is optimal for research, researchers, research institutions, the vast R&D industry, students, teachers, journalists and the tax-paying public that funds the research.

2. Research is funded by the public and conducted by researchers and their institutions for the sake of research progress, productivity and applications — not in order to guarantee publishers’ current revenue streams and modus operandi: Research publishing is a service industry and must adapt to the revolutionary new potential that the online era has opened up for research, not vice versa!

3. That is why both research funders (like NIH) and research institutions (like Harvard) — in the US as well as in the rest of the world — are increasingly mandating (requiring) OA: See ROARMAP.

4. Publishers are already trying to delay the potential benefits of OA to research progress by imposing embargoes of 6-12 months or more on research access that can and should be immediate in the online era.

5. The strategy of CHORUS is to try to take the power to provide OA out of the hands of researchers so that publishers gain control over both the timetable and the infrastructure for providing OA.

6. And, without any sense of the irony, the publisher lobby (which already consumes so much of the scarce funds available for research) is attempting to do this under the pretext of saving “precious research funds” for research!

7. It is for researchers to provide OA, and for their funders and institutions to mandate and monitor OA provision by requiring deposit in their institutional repositories — which already exist, for multiple purposes.

8. Depositing in repositories entails no extra expense for research, just a few extra keystrokes from researchers.

9. Institutional and subject repositories keep both the timetable and the infrastructure for providing OA where it belongs: in the hands of the research community, in whose interests it is to provide OA.

10. The publishing industry’s previous ploys — PRISM and the Research Works Act — were obviously self-serving Trojan Horses, promoting the publishing industry’s interests disguised as the interests of research.
Let the US Government not be taken in this time either.

[And why does the US Government not hire consultants who represent the interests of the research community rather than those of the publishing industry?]

Eisen, M. (2013) A CHORUS of boos: publishers offer their “solution” to public access

Giles, J. (2007) PR’s ‘pit bull’ takes on open access. Nature 5 January 2007.

Harnad, S. (2012) Research Works Act H.R.3699: The Private Publishing Tail Trying To Wag The Public Research Dog, Yet Again. Open Access Archivangelism 287, January 7, 2012

1.2 On Sat, Jul 20, 2013 at 9:46 PM, David Wojick wrote:

WOJICK: “NIH uses a 12 month embargo and that is what the other Federal agencies are required to do, unless they can justify a longer or shorter period for certain disciplines. This has nothing to do with the publishers or CHORUS. The publishers are building CHORUS so that the agencies will use the publisher’s websites and articles instead of a redundant repository like NIH uses. They are merely agreeing to the US Government’s requirements, while trying to keep their users, so there is no Trojan horse here, just common sense. Immediate access is not an option in this Federal OA program. The OA community should be happy to get green OA.”

1. The embargo length that the funding agencies allow is another matter, not the one I was discussing. (But of course the pressure for the embargoes comes from the publishers, not from the funding agencies.)

2. The Trojan Horse would be funding agencies foolishly accepting publishers’ “CHORUS” invitation to outsource author self-archiving (and hence compliance with the funder mandate) to publishers, instead of having fundees do it themselves, in their own institutional repositories.

3. To repeat: Delayed Access is not Open Access — any more than Paid Access is Open Access. Open Access is immediate, permanent online access, toll-free, for all.

4. Delayed (embargoed) Access is publishers’ attempt to hold research access hostage to their current revenue streams, forcibly co-bundled with obsolete products and services, and their costs, for as long as possible. All the research community needs from publishers in the OA era is peer review. Researchers can and will do access-provision and archiving for themselves, at next to no cost. And peer review alone costs only a fraction of what institutions are paying publishers now for subscriptions.

5. Green OA is author-provided OA; Gold OA is publisher-provided OA. But OA means immediate access, so Delayed Access is neither Green OA nor Gold OA. (Speaking loosely, one can call author-self-archiving after a publisher embargo “Delayed Green” and publisher provided free access on their website after an embargo “Delayed Gold,” but it’s not really OA at all if it’s not immediate. And that’s why it’s so important to upgrade all funder mandates to make them immediate-deposit mandates, even if they are not immediate-OA mandates.)

WOJICK: “if delayed access is not open access in your view then why did you post the tipping point study, since it includes delayed access of up to 5 years? Most people consider delayed (green) access to be a paradigm of open access. That is how the term is used. You are apparently making your own language.”

That is the way publishers would like to see the term OA used, paradigmatically. But that’s not what it means. And I was actually (mildly) criticizing the study in question for failing to distinguish Open Access from Delayed Access, and for declaring that Open Access had reached the “Tipping Point” when it certainly has not — specifically because of publisher embargoes. [Please re-read my summary, still attached below: I don’t think there is any ambiguity at all about what I said and meant.]

But OA advocates can live with the allowable funder mandate embargoes for the time being, as long as deposit is mandated to be done immediately upon acceptance for publication, by the author, in the author’s institutional repository, and not a year later, by the publisher, on the publisher’s own website. Access to the author’s deposit can be set as Closed Access instead of OA during the allowable embargo period, but meanwhile authors can provide Almost-OA via their repository’s facilitated Eprint Request Button.

The Immediate-Deposit/Optional-Access (ID/OA) Mandate: Rationale and Model

Public Access to Federally Funded Research (Response to US OSTP RFI)
Comments on Proposed HEFCE/REF Green Open Access Mandate

1.3 On Sun, Jul 21, 2013 at 7:57 AM, David Wojick wrote:

WOJICK: “I think what the US Government is actually doing is far more important as an OA tipping point.”

We are clearly not understanding one another:

Yes, the US funder mandates are extremely important, even if they still need a tweak (as noted).

Yes, OA has not yet reached a tipping point. (That was my point.)

But no, Delayed Access is definitely not OA, let alone Green OA, although that is how publishers would dearly love to define OA, and especially Green OA.

WOJICK: “As for your Trojan horse point (#2) there is no author archiving with CHORUS.”

Yes, that’s the point: CHORUS is trying to take author self-archiving out of the hands and off the sites of the research community, to put it in the hands and on the site of publishers. That is abundantly clear.

And my point was about how bad that was, and why: a Trojan Horse for the research community and the future of OA.

But the verb should be CHORUS “would be,” not CHORUS “is” — because, thankfully, it is not yet true that this 4th publishers’ Trojan Horse has been allowed in at all. 

(The 1st Trojan Horse was PRISM: routed at the gates. The 2nd was the “Research Works Act”: likewise routed at the gates. The 3rd was the Finch Report: It slipped in, but concerted resistance from OA Advocates and the research community has been steadily disarming it. The 4th publisher Trojan Horse is CHORUS, and, as noted, OA Advocates and the research community are working hard to keep it out!)

WOJICK: “The author merely specifies the funder from a menu during the journal submission process and the publisher does the rest. Thus there is no burden on the authors and no redundant repository. The article is openly available from the publisher after the Federally specified embargo period. This is extremely efficient compared to the old NIH repository model.”

Indeed it would be, and would put publishers back in full control of the future of OA.

Fortunately, the CHORUS deal is far from a fait accompli, and the hope (of OA advocates and the concerned research community) is that it never will be.

The only thing the “old NIH repository model” (PubMed Central, PMC) needs is an upgrade to immediate institutional deposit, followed by automatic harvesting and import (after the allowable embargo has elapsed) by PMC or any other institution-external subject based harvester. With that, the OSTP mandate model would be optimal (for the time being).

David, it is not clear why the very simple meaning of my first posting has since had to be explained to you twice. I regret that I will have to take any further failures to understand it as willful, and SIGMETRICS readers will be relieved to hear that I will make no further attempt to correct it.

1.4 On Sun, Jul 21, 2013 at 12:13 PM, David Wojick wrote:

WOJICK: “This is not about author self archiving, which is a separate issue, so I see no Trojan horse.”

1. The “This” is US federal funding agency Open Access mandates.

2. The “self” is the author, who is also the fundee, the one who is bound to comply with the conditions of the funder mandate.

3. The “archiving” is making the fundee’s paper accessible free for all on the Web.

4. The “Trojan Horse” is the attempt by publishers to take this out of the hands of the author/fundee/mandatee and put it into the hands of the publisher, who is not the fundee, not bound by the mandate, and indeed has a conflict of interest with making papers free for all on the Web.

5. On no account should the compliance with the funder mandate be outsourced and entrusted to a 3rd party that is not only not bound by the mandate, but in a conflict of interest with it.

WOJICK: “It is about the design of the Federal program, where I see no reason for redundant Federal archiving.”

The web is full of “redundant archiving”: the same document may be stored and hosted on multiple sites. That’s good for back-up and reliability and preservation, and part of the way the Web works. And it costs next to nothing — and certainly not to publishers. (If publishers wish to save federal research money, let them charge less for journal subscriptions; don’t fret about “redundant archiving.”)

PubMed Central (PMC) is a very valuable and widely used central search tool. Its usefulness is based on both its scope of coverage (thanks to mandates) and on its metadata quality. It borders on absurdity for publishers to criticize this highly useful and widely used resource as “redundant.” It provides access where publishers do not.

Nor does PMC’s usefulness reside in the fact that it hosts the full-texts of the papers it indexes. It’s the metadata and search capacity that makes PMC so useful. It would be equally useful if the URL for each full-text to which PMC pointed were in each fundee’s own institutional repository, and PMC hosted only the metadata and search tools. (Indeed, it would increase PMC’s coverage and make it even more economical; many of us are hoping PMC and other central repositories like Arxiv will evolve in that direction.)

WOJICK: “There is nothing in the CHORUS approach to the Federal program design that precludes author self archiving in institutional repositories as a separate activity.”

1. “This” is about US federal funding agency Open Access mandates.

2. The “self” is the author, who is also the fundee, the one who is bound to comply with the conditions of the funder mandate.

3. The “archiving” is making the fundee’s paper accessible free for all on the Web. If authors self-archived of their own accord, “as a separate activity,” there would have been no need for federal Open Access mandates.

4. The “Trojan Horse” is the attempt by publishers to take this out of the hands of the author/fundee/mandatee and put it into the hands of the publisher, who is not the fundee, not bound by the mandate, and indeed has a conflict of interest with making papers free for all on the Web.

5. On no account should the compliance with the funder mandate be outsourced and entrusted to a 3rd party that is not only not bound by the mandate, but in a conflict of interest with it.

The federal mandates do not require fundees to provide toll-free access only a year after publication: They require them to provide toll-free access within a year at the latest. Publishers have every incentive to make (and keep) this the latest, by taking self-archiving out of authors’ hands and doing it instead of them, as late as possible.

Moreover, funder OA mandates are increasingly being complemented by institutional OA mandates, which cover both funded and unfunded research. This is also why institutions have institutional repositories (archives), in which their researchers can deposit, and from which central repositories can harvest. This is also the way to tide over research needs during OA embargoes, with the help of institutional repositories’ immediate Almost-OA Button.

And again, no need here for advice from publishers, with their conflicts of interest, on how institutions can save money on their “redundant archives” by letting publishers provide the OA in place of their researchers (safely out of the reach of institutional repositories’ immediate Almost-OA Button).

WOJICK: “The journals are part of the research community and they have always been the principal archive.”

Journals consist of authors, referees, editors and publishers. Publishers are not part of the research community (not even university or learned-society publishers); they earn their revenues from it.

Until the online era, the “principal archive” has been the university library. In the online era it’s the web. The publisher’s sector of the web is proprietary and toll-based. The research community’s sector is Open Access.

And that’s another reason CHORUS is a Trojan Horse.

WOJICK: “With CHORUS they will be again.”

What on earth does this mean? That articles in the publishers’ proprietary sector will be opened up after a year?

That sounds like an excellent way to ensure that they won’t ever be opened up any earlier, and that mandates will be powerless to make them open up any earlier.

WOJICK: “After all the entire process is based on the article being published in the journal.”

Yes, but what is at issue now is not publishing but access: when, where and how?

WOJICK: “It is true that this is all future tense including the Federal program, but the design principles are here and now.”

And what is at issue here is the need to alert the Federal program that it should on no account be taken in by CHORUS’s offer to “let us do the self-archiving for you.”

WOJICK: “I repeat, immediate access is not a design alternative. The OSTP guidance is clear about that. So most of your points are simply irrelevant to the present situation.”

The federal mandates do not require fundees to provide toll-free access only a year after publication: They require them to provide toll-free access within a year at the latest.

Immediate OA (as well as immediate-deposit plus immediate Almost-OA via the Button) is definitely an alternative — as well as a design alternative.

But not if OSTP heeds the siren call of CHORUS.

1.5 On Sun, Jul 21, 2013 at 3:01 PM, David Wojick wrote:

WOJICK: “There is no funder mandate on authors at this point, so you are assuming a burdensome model that need not be implemented.”

Right now, there is a presidential (OSTP) directive to US federal funding agencies to mandate (Green) OA.

It is each funding agency that will accordingly design and implement its own Green OA mandate, as the NIH did several years ago.

The mandate (requirement) will, as always, be on the fundees: the authors of the articles that are to be made OA, as a condition of funding.

WOJICK: “The only mandate is on the Federal funding agencies to provide public access to funder-related articles 12 months after publication.”

The presidential (OSTP) directive is to the US federal funding agencies to mandate (Green) OA, meaning that all published articles resulting from the research funded by each agency must be made OA — within 12 months of publication at the latest.

The articles are by fundees. The ones bound by the mandates are the fundees. Fundees are the ones who must make their research OA, as a condition of funding.

WOJICK: “CHORUS does this in a highly efficient manner, rendering an author mandate unnecessary.”

CHORUS does nothing. It is simply a proposal by publishers to funding agencies. 

And to suggest that the reason funding agencies should welcome the CHORUS proposal is efficiency is patent nonsense.

To comply with their funder’s requirements, fundees must specify which articles result from the funding. The few fundee keystrokes for specifying that are exactly the same few fundee keystrokes for self-archiving the article in the OA repository.

No gain in efficiency for funders or fundees in allowing publishers to host and time the OA: just a ruse to allow publishers to retain control over the time and place of providing OA.

Because of the monumental conflict of interest (between publishers trying to protect their current revenue streams and the research community trying to make its findings accessible as soon and as widely as possible), control over the time and place of providing OA should on no account be surrendered by funders and fundees to publishers.

WOJICK: “Search is no problem as there are already many ways to search the journals.”

And there are also already many ways to search OA articles on the web or in repositories.

So, correct: Search is no problem, and not an issue. In fact, it’s a red herring.

What is really at issue is: in whose hands should control over the time and place of providing OA be?

Answer: Funders and their fundees, not publishers.

WOJICK: “DOE PAGES, described in the first article I listed in my original post, is a model of an agency portal that is being designed to use CHORUS. It will provide agency-based search as well. CHORUS as well will provide bibliographic search capability.”

To repeat: The same functionality (and potentially much more and better functionality) is available outside the control of publishers too, via the web, institutional repositories, harvesters, indexers and search engines.

The only thing still missing is the OA content. And that’s what publishers are trying to hold back as long as possible, and to keep in their own hands.

WOJICK: “We simply do not need a new bunch of expensive redundant repositories like PMC.”

And the research community simply does not need to cede control over the locus and timetable of providing OA to publishers. 

WOJICK: “I am also beginning to wonder about your Trojan horse metaphor. The Trojan horse is a form of deception, but there is no deception here, just a logical response to a Federal requirement, one that keeps a journal’s users using the journal. The publishers are highly motivated to make CHORUS work.”

CHORUS is all deception (and perhaps self-deception too, if publishers actually believe the nonsense about “efficiency” and “expense”), and the “logic” is that of serving publishers’ interests, not the interests of research and researchers.

The simple truth is that the research community (researchers and their institutions) are perfectly capable of providing Green OA for themselves, cheaply and efficiently, in their own institutional OA repositories and central harvesters — and that this is the best way for them to retain control over the time and place of providing OA, thereby ensuring that 100% immediate OA is reached as soon as possible.

Letting in the publishers’ latest Trojan Horse, CHORUS, under the guise of increasing efficiency and reducing expense, would in reality be letting publishers maximize Delayed Access and fend off universal Green OA in favor of over-priced, double-paid (and, if hybrid, double-dipped) Fools Gold OA, thereby locking in publishers’ current inflated revenue streams and inefficient modus operandi for a long time to come, and embargoing OA itself, instead of making publishing (a service industry) evolve and adapt naturally to what is optimal for research in the online era.

2.0 Research Community Interests
and the
Publishing Lobby’s Latest Trojan Horse (CHORUS)

2.1 On Mon, Jul 22, 2013 at 2:49 PM, David Wojick wrote:
WOJICK: “The Federal OA program is controlled by the Federal Government, so all your talk of ceding control is just a rhetorical device. Neither the fundees, the institutions nor the journals control it, except to the extent that the journals make the publication decisions. So no one is ceding control to anyone. And I repeat that the journals are part of the community, a central part. (You are doing your private language thing again. You do it a lot.)

“Under CHORUS the lead author merely has to check a box indicating the funder. The institutions have to do nothing more, nor does the fundee. The journal then gives the article link to the agency and makes the article publicly available at the agency controlled time. This is enormously simpler than creating repositories that fundees have to populate and funders have to work with (and someone has to build and maintain). In essence the article is published and the agency links to it. That is all and it cannot be any simpler than this. Creating a parallel universe of redundant repositories must be more complex, costly and burdensome.”

As far as I know, the publishers’ CHORUS deal that you describe (and that I have referred to in my not-so-private language argument as a Trojan Horse) has not yet been accepted by the Federal Government, nor by its funding agencies.

Maybe they will accept it, maybe they won’t. I and many others have been describing the many reasons they should not accept it.

You are repeating arguments about the redundancy and complexity and costliness of repositories to which I and many others have already replied. 

But I am not trying to persuade you that researchers using their keystrokes to deposit in OA repositories is better for research and for OA than letting publishers do it for them: The ones I and many others are trying to persuade of that are the same ones that you and the rest of the publisher lobby are trying to persuade of the opposite: the Federal government and its research funding agencies.

May the best outcome (for the research community) win.

I want to close by reminding inquiring readers of just one of the many points that David Wojick and the other CHORUS lobbyists keep passing over in silence:

The Government directive is not to make funded research freely accessible 12 months after publication but within 12 months of publication.

The publishers’ Trojan Horse would not only take mandate compliance out of the hands of fundees, making compliance depend on publishers rather than fundees, but it would also ensure that the research would not be made freely accessible one minute before the full 12 months had elapsed.

If I were a publisher, interested only in protecting my current income streams, come what may, I’d certainly lobby for that, just as I would lobby for untrammelled cigarette ads and smoking zones, if I were a tobacco company, interested only in protecting my current income streams, come what may; or for the untrammelled manufacture and use of plastic bags, if I were a plastic bag company with similar “community” interests.

CHORUS is a terrific way of locking in publisher embargoes and Delayed Access for years and years to come, thereby leaving payment for Fools Gold as the sole option for providing immediate OA. 

(Shades of Finch — and RWA, and PRISM… The publishing lobby is a “part” of the research “community” indeed, heroically defending “our” joint interests! I’m ready for the usual next piece of rhetoric, about how un-embargoed Green OA would destroy journal publishing, and with it peer review and research quality and reliability… We’ve heard it all, many times over, for close to 25 years now…)

2.2 On Tue, Jul 23, 2013 at 8:06 AM, David Wojick wrote:
WOJICK: “What Federal system design arguments have I not responded to?”

Here are the first few arguments you have not responded to. (I have no idea what you are attempting to sector off under the guise of responding only to “Federal system design” arguments):

1. that mandates are for public access within a year at the latest, whereas CHORUS would provide it only at the very end of that year.

2. that OA mandates are intended to require authors to provide OA, whereas CHORUS would take it out of authors’ hands entirely (thereby mooting mandate compliance altogether, let alone earlier or wider compliance).

3. that repository deposit facilitates providing eprints during any OA embargo with the repository’s eprint-request Button, whereas CHORUS prevents it.

4. that CHORUS locks in 1-year embargoes and puts and leaves publishers in control of both the hosting and the timetable for public access.

5. that repository costs are small and mostly already invested, and for multiple uses, hence CHORUS would not save money but rather waste repositories.

I have more, but that should be fine for a start…

WOJICK: “It is not an ad hominem to point out that the Federal policy is not anti-publisher, as many OA advocates are.”

I for one am not anti-publisher. But I’m very definitely against publisher anti-OA-mandate lobbying and I’m also against publisher embargoes on Green OA.

Apart from that, I have a long history of defending publishers against overzealous OA advocates or overpricing plainants — as long as they were on the “side of the angels,” by endorsing immediate, unembargoed Green, as Springer and Elsevier did for many years.

The gloves came off when publishers started trying to renege on their prior endorsements of immediate Green.

WOJICK: “It is an important fact about the policy. I have to be repetitive because Harnad is presenting the same non-design arguments over and over.”

I have no idea what you mean by “non-design” arguments. The points above are against CHORUS as a means of implementing the funding agencies’ Green OA mandate, that’s all.

WOJICK: “Arguments such as that publishers cannot be trusted…”

I have not said that. I said that compliance with funders’ mandates on fundees to provide OA to their funded research should on no account be entrusted to publishers because of the obvious conflict of interest: The interest of research and researchers is that research should be OA immediately; the interest of publishers is that access should be delayed for as long as possible (one year, within the “design” of the OSTP directive).

I fully trust that publishers would faithfully make articles publicly accessible: on the very last day of the maximal allowable OA embargo.

WOJICK: “[Arguments such as that] access should be immediate via institutional repositories?”

I don’t just repeat that over and over: I give the reasons why: Because Open Access means Open Access, and the reasons that make Open Access important at all make it important immediately upon publication, not 12 months later.

And it’s institutional repositories because institutions are the providers of all research, funded and unfunded, in every discipline. Institutions have already created OA repositories. They have many reasons for wanting to archive, manage and publicly showcase their own research output in their own repositories — over and above the reasons for OA itself (maximizing research uptake, usage, applications, impact and progress). 

And institutions themselves are also beginning to mandate Green OA. Hence funder and institutional mandates should be convergent and mutually reinforcing. All research should be deposited in the institutional repository immediately upon acceptance for publication. (Their metadata and URLs can then be harvested by whatever central access points, databases, indices and search engines disciplines wish to create.) 

And if the author wishes to comply with a publisher embargo, access to the deposit can be set as Closed Access instead of Open Access during the embargo, in which case the repository’s eprint-request Button can provide Almost-OA during the embargo (while embargoes last — which will not be long, one hopes, once mandatory Green OA has become universal).

All of these benefits are lost if publishers are in control of providing public access on their sites, a year after publication.

WOJICK: “[Arguments such as that] delayed access is not open access, etc. My response does not vary.”

Delayed access means losing a year of Open Access. Your response does not vary because the publisher lobby is interested in minimizing, not maximizing Open Access. If the maximal allowable delay is 12 months, publishers will happily make sure it is no less than 12 months, and on their site, with no Almost-OA Button to tide over the embargo, no integration with institutional mandates, and authors entirely out of the compliance loop for mandates that are intended to generate as much OA as possible, as soon as possible.

My own response varies as much as possible, in an effort (each time) to present from every angle the case for implementing OA mandates in such a way as to provide the maximum benefit to research and researchers, rather than just to protect the proprietary interests of publishers at the expense of research, researchers, and the public that funds them.

2.3 On Tue, Jul 23, 2013 at 9:47 AM, David Wojick wrote:

WOJICK: “I have already responded to these points. The publisher’s self interested motivation is to keep the web traffic to its journals.”

At the expense (to research and researchers) of impeding the growth of OA and OA mandates and ensuring that the allowable embargo length is always the maximum 12 months. (“For immediate-OA, please pay the Fools-Gold OA fee!”)

WOJICK: “Studies suggest [publishers] are losing 20% to PMC.”

And while publishers’ download sites have lost the traffic, research has gained a great deal of functionality, as well as OA.

WOJICK: “The publishers believe this, whether it is true or not, thus their motivation.”

Their motivation is in no doubt. But the issue is not what is best for publishers but what is best for research, researchers and the public that funds them.

WOJICK: “The mandate is that the articles be made publicly accessible and the articles are the publisher’s so they are not third party contractors, whatever that might mean.”

My articles are my publisher’s, not mine?

I think you might mean that the publishers are the holders of the copyright, or exclusive vending rights.

Well we’re talking about a mandate here — by the party of the second part, the author’s funder, requiring the party of the first part, the author, to make the research they’ve funded publicly accessible within a year of publication at the very latest.
That’s a condition of a contract the author must sign before ever doing the research, let alone signing any subsequent contract with any party of the third part regarding vending rights.

WOJICK: “The fundees need play no role.”

The fundees play no role? No role in what? The funder mandates bind the fundees, not some other party.

WOJICK: “The publishers are making a ground breaking concession by agreeing to the Federal embargo deadlines.”

Agreeing? It seems to me they don’t have much choice! Who are publishers conceding to? And conceding what?

If this is publisher largesse rather than federal government duress I would really like to know to what we owe their newfound magnanimity…

WOJICK: “This is great news for OA. I have no idea what you mean by letting them sit. They will be on view in their on-line journals, which is arguably where they belong.”

I think Cristóbal Palmer’s “let[ting] them sit” may have been an ill-chosen descriptor, but I can still make sense of it:

Ceding the provision of public access to the publisher’s site and the publisher’s timetable means that articles must sit for 12 months, accessible only to subscribers, even though the mandate states that they must be made publicly accessible within 12 months at the latest. Fundees could have deposited them in repositories immediately and made them publicly accessible earlier, or, if they wished to comply with a publisher’s embargo, made them immediately Almost-OA via the repository’s Button, instead of leaving them to sit inaccessibly for 12 months.

And before you reply “fundees can still do that if they want to,” let me remind you of the fundamental purpose of Green OA mandates: It’s to get authors to provide OA. Without mandates, they don’t. Not because they don’t want to. But because without a mandate from their funders or institutions, they dare not: because of fear of their publishers. The mandate releases authors from that fear.

And the CHORUS variant — in which “the fundee has no role” — would leave authors stuck in that fear, contractually unprotected by a funder mandate, and would render the funder policy empty and ineffectual beyond its absolutely minimum requirement, which is public access after 12 months (but not a moment before).

And that would of course suit publishers just fine. In fact, maybe that’s the reason for their newfound magnanimity: “Concede” on public access after a 12-month embargo, take control of hosting and providing it, and maybe that pesky global clamor for immediate OA will go away — or, better, redirect authors toward the Fools Gold counter where they pay hybrid publishers for immediate OA.

WOJICK: “The repository approach made sense when the publishers refused to provide access. That day has passed.”

Don’t bank on it. The clamor for access is growing and growing. And that’s immediate Open Access, not publisher-Delayed Access after 12 months.

Stevan Harnad

SPARC Webcast- Tools for Open Access Advocacy: Demystifying Hackathons

Another free SPARC online event

Tuesday, August 6th, 2013

12:00 – 1:00 PM EDT

Registration is free, but required. Please RSVP by August 1st.

This webcast requires both a phone dial-in and an Internet connection.

Does the very word “Hackathon” have you crawling to the nearest corner, with visions of computer code dancing in your head? Have you ever wondered what exactly they are, and why you should care? Hackathons can be a great tool, bringing together groups of people to complete a set goal using the combination of their skills – computer-based and otherwise. They are not just for the technologists – your individual expertise can be a vital part of a “hack.” Open Access Week is now just around the corner (October 21-27) and a Hackathon is a great way to stir interest, involvement, and possibly create finished projects using Open Access content.

Our guest speaker, Brian Glanz, is the founder of the Open Science Federation and co-founder of the American chapter of the Open Knowledge Foundation. With both organizations, Brian has lots of experience participating in and deploying Hackathons where Open Access content played a critical part. For two recent examples, he points to the over 100 events associated with in June, and July’s in the scholarly publishing community.

Brian will fully explain what Hackathons are, how you deploy them, and why we in the library community should be participating in and utilizing them.

To accommodate interest in every time zone, this 1-hour event will be recorded and available on our website shortly afterwards.

Please join us for a lively and interactive discussion. SPARC’s Executive Director, Heather Joseph, will be moderating questions during the webcast. Feel free to post preliminary comments and questions for Brian right here.

For additional information, contact SPARC’s Communications Manager, Andrea Higginbotham, at andrea [at] arl [dot] org.

#openscience in Oxford

I/we had a great evening on Wednesday at the Open Science meeting run by Jenny Molloy and colleagues. I was leading the meeting on “content mining” and we had about 12 attendees, including bioscientists, librarians, physicists and informaticians. It was very informal: we started by talking about our own interests, and then I gave some demos and an introduction to content mining.

Jojo Scobie @paraphyso took this picture of Chuff @okfn_okapi in the pub

I was delighted to see the interest and involvement of the group in phylogenetics. At least half could be described as having a significant interest or practice in the area. So we were able to look in depth at the sort of science published and to explore the issues, both technical and organizational. And they were forgiving of my ignorance and spent a long time educating me!

I’ve discussed much of the basis before, but in essence Ross Mounce and I will be extracting data from PDF publications and systematically publishing it. We looked at the things we would like to extract. There was an important discussion on whether extracting the single tree from a paper was valuable – the authors should publish a much fuller amount of data so 1 tree isn’t always a good representation of the result. But we generally agreed it was a lot better than zero.

We discussed the value of indexing the literature by species and here there was great agreement – if the scientific literature were indexed by species (and possibly geodata and dates as well) that would be really valuable – and it’s technically about the simplest example of high-quality content-mining.

Our species are in danger see which reports “The workshop highlighted that the okapi is faring worse than scientists previously thought”. And there are 100+ species in the highly critical list. Finding all the published information on species is an essential (but not sufficient) activity and it should be possible for anyone anywhere in the world to get all the peer-reviewed and grey literature on a species. Content-mining is a necessary approach.

There is obviously a critical mass of interest and expertise in Oxford – supported by Jenny’s tireless efforts. We proposed we should have a hackathon on “species” – we could make a lot of progress.

Search Engine Optimization and Your Journal Article: Do you want the bad news first?

Here’s the bad news about search engine optimization and your paper: it’s going to mean a bit more work.

Yep. It’s not enough that you hustled for funding, figured out who your co-authors would be, conducted the research, wrote the paper, decided where to submit it, hoped that it would be accepted, made necessary revisions, and waited anxiously for it to go up online.

Nope. Now, before having your article posted online you have to make sure that your article is prepared for the real world: the digital world. You need to ensure that your paper is search engine optimized. To quote Zhang and Dimitroff: “Search Engine Optimization (SEO) … is the process of identifying factors in a webpage which would impact search engine accessibility to it and fine-tuning the many elements of a website so it can achieve the highest possible visibility when a search engine responds to a relevant query. Search engine optimization aims at achieving good search engine accessibility for webpages, high visibility in search engine results, and improvement of the chances the webpages are retrieved.”(1)

Except, to put a filter on it, replace the word “webpage(s)” in the quote above with “journal article”.  A little daunting, no?

Now here’s the good news: it’s worth the effort. After all, why go to all the toil of authoring an article if your research is going to be buried on page 275 of Google or Google Scholar’s search results? Scholarly information is increasingly accessible online, but not inherently more discoverable. Employing SEO can give a paper better odds of appearing near the top of search results and, therefore, better odds of being read and even cited. Moreover, if you are publishing open access, you will also be getting the best value for your (or your funder’s) money if your research is easily found via search engines.

“Isn’t that the journal’s (or publisher’s) job?” you might ask. Well, yes and no. Journals and publishers need to make sure they do everything they can to optimize their online platforms so that search engines can easily crawl and index content. 58% of all traffic to our online platform, Wiley Online Library, comes from search engines (predominantly Google and Google Scholar). And publishers need to actively promote journals and featured content in a crowded online space. However, they do not have ultimate control over the discoverability of content at the article level. You do.

So what do you need to do?  We’ve created an SEO for Authors tips sheet to give authors an at-a-glance guide to optimizing their papers.  Here are some highlights:

  • Carefully select relevant keywords
  • Lead with keywords in the article title
  • Repeat keywords 3-4 times throughout the abstract
  • Use headings throughout the article
  • Include at least 5 keywords and synonyms in the keyword field
  • Link to the published article on social media, blogs and academic websites

A lot of this boils down to selecting appropriate keywords (i.e. search terms) and using them frequently and appropriately, because, “Generally speaking, the more often a search term occurs in the document, and the more important the document field is in which the term occurs, the more relevant the document is considered.”(2)
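As a rough illustration of the tips above, here is a minimal Python sketch that checks a draft title, abstract and keyword list against three of them. The function names (`count_phrase`, `check_seo`) and the `lead_words` cut-off for what counts as "leading" the title are assumptions made for this example; they do not come from the tips sheet, and real search engines weigh many more factors.

```python
import re


def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of phrase in text."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def check_seo(title, abstract, keywords, lead_words=8):
    """Return simple checks loosely based on the tips listed above.

    lead_words is a hypothetical threshold: a keyword "leads" the title
    if it appears within the first lead_words words.
    """
    title_lead = " ".join(title.split()[:lead_words])
    counts = {kw: count_phrase(abstract, kw) for kw in keywords}
    return {
        # Tip: lead with keywords in the article title
        "keyword_leads_title": any(count_phrase(title_lead, kw) > 0 for kw in keywords),
        # Tip: repeat keywords 3-4 times throughout the abstract
        "abstract_counts": counts,
        "keyword_repeated_in_abstract": any(3 <= c <= 4 for c in counts.values()),
        # Tip: include at least 5 keywords and synonyms in the keyword field
        "enough_keywords": len(keywords) >= 5,
    }


if __name__ == "__main__":
    report = check_seo(
        title="Open access mandates and repository deposit rates",
        abstract=(
            "Open access improves reach. We study open access policies. "
            "Open access deposits rose."
        ),
        keywords=["open access", "repository", "mandate", "embargo", "deposit"],
    )
    for name, value in report.items():
        print(name, value)
```

Matching is on whole words only, so "deposits" does not count toward the keyword "deposit"; a less strict matcher might stem words first.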

This shouldn’t be a completely daunting process or even that much additional work.  It is really about being more mindful, as you are writing the paper, of how users will search and find the published version online.

Happy Optimizing!

By Anne-Marie Green,
Marketing Manager


1. Zhang, Jin, and Alexandra Dimitroff. “The Impact of Metadata Implementation on Webpage Visibility in Search Engine Results (Part II).” Information Processing & Management 41.3 (2005): 697-715.

2. Beel, Jöran, Bela Gipp, and Erik Wilde. “Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar & Co.” Journal of Scholarly Publishing 41.2 (2010): 176-90.

HEFCE Open Access Mandate Not Narrower: Better Focused

“The UK funding councils have narrowed the scope of their proposed open access mandate for the post-2014 research excellence framework.”
(Paul Jump, “Open access mandate narrowed in formal proposals,” Times Higher Education)

1. Model. The HEFCE proposal to mandate immediate (not retrospective) deposit of journal articles in the author’s institutional repository in order to make them eligible for evaluation in the next Research Excellence Framework (REF) is wise and timely, and, if adopted, will serve as a model for the rest of the world. It will also complement the Green (self-archiving) component of the RCUK Open Access (OA) mandate, providing it with an all-important mechanism for monitoring and ensuring compliance.

2. Monographs. Exempting monographs for now was a good decision. The HEFCE mandate, like the RCUK mandate, applies only to peer-reviewed journal articles. These are all author giveaways, written solely for research impact, not royalty income. This is not true of all monographs. (But a simple compromise is possible: recommend — but don’t require — monograph deposit too, but with access set as Closed Access rather than Open Access, with no limit on the length of the OA embargo. Author choice.)

3. Data. Ditto for open data: It’s good judgment not to force it on researchers. Researchers must be allowed a fair period of first-exploitation rights on the data they have gathered. If it’s immediately open to all, why bother to gather data at all? Just analyze the data of others immediately after they take the time and trouble to gather it. (But here too, a simple compromise would be to recommend, but not require, Closed Access deposit. Eventually, fair embargo length limits can be decided, on a discipline by discipline and project by project basis.)

4. Exceptions. The required compliance rate has not been reduced from 100% to 60-75% (and should not be). HEFCE is merely asking in the consultation, whether the research community prefers a reduced target percentage or case-by-case consideration of exceptions. The latter is a far better way of making the policy realistic and successful. Most of the notional reasons for non-compliance (e.g., publisher embargoes) are based on misunderstandings anyway. (Articles can be deposited immediately, even if there is a publisher OA embargo: access to the immediate-deposit can be set as Closed Access instead of OA during the embargo.) Percentage-targets would simply ensure that compliance rates were no higher than the allowable percentages.

5. Embargoes. The HEFCE mandate moots OA embargoes because it requires immediate deposit, whether or not access is immediately OA. This is the core reason the HEFCE mandate is so very important and provides an optimal mandate model for the rest of the world: Publisher OA embargoes no longer determine whether and when an article is deposited. And the institutional repositories have an eprint request Button with which individual users worldwide can request a single copy of a Closed Access article for research purposes with one click; and the author can choose to comply or not comply with one click. This tides over research needs during any allowable OA embargo with “Almost-OA.”

6. Licenses. Once the allowable embargo (if any) elapses, any OA deposit can be accessed, read, searched, linked, downloaded, stored, printed off and locally data-mined by any user webwide. It will also be harvested and indexed for Boolean full text search by engines like Google. No further license is needed for any of this. Further re-use rights will come once effective Green OA mandates on the combined HEFCE/RCUK model are adopted globally by funders and institutions worldwide. Universal Green OA will also hasten the inevitable natural demise of all remaining OA embargoes.

7. Start-Date. The HEFCE consultation also inquires about when the mandate should start, and contemplates a grace period of two years, from 2014-2016. But there is really no reason why an immediate-deposit mandate should not start immediately after REF 2014 for authors at UK institutions, for any article accepted after that date: Everyone begins preparing for the new REF the day after the old REF anyway.

8. Date-Stamp. Only one of the consultation questions is critical for the success of the HEFCE mandate model, and that is whether the requirement that the deposit be “immediate” refers to the date of publication or the date of acceptance for publication. It is crucially important that the date should be acceptance, not publication. Acceptance date is marked by a determinate date-stamped acceptance letter and is a natural point for deposit in the author’s workflow. Authors usually don’t even know when their accepted article will appear, or has appeared; the lag may be months or even years from acceptance. Nor is the date on the journal issue a marker, because issues often appear long after their calendar dates. Publication lags can be even longer than OA embargoes! Meanwhile, precious access and impact are being lost. The HEFCE immediate-deposit mandate will only succeed if it is pegged to the determinate acceptance date rather than the indeterminate publication date.

Ecology and Evolution has received its first Impact Factor of 1.184, ranking 99/136 in Ecology!

Following on from the release of Ecology and Evolution’s first Impact Factor, the latest issue of Ecology and Evolution is now live! Over 30 excellent articles free to read, download and share. The cover image has been taken from the article Inbreeding reveals mode of past selection on male reproductive characters in Drosophila melanogaster by Outi Ala-Honkola et al. Below are some highlights from this issue:

An age–size reaction norm yields insight into environmental interactions affecting life-history traits: a factorial study of larval development in the malaria mosquito Anopheles gambiae sensu stricto by Conan Phelan and Bernard D. Roitberg
Summary: Environmental factors frequently act nonindependently to determine growth and development of insects. Because age and size at maturity strongly influence population dynamics, interaction effects among environmental variables complicate the task of predicting dynamics of insect populations under novel conditions. We reared larvae of the African malaria mosquito Anopheles gambiae sensu stricto (s.s.) under three factors relevant to changes in climate and land use: food level, water depth, and temperature. Each factor was held at two levels in a fully crossed design, for eight experimental treatments. Larval survival, larval development time, and adult size (wing length) were measured to indicate the importance of interaction effects upon population-level processes. For age and size at emergence, but not survival, significant interaction effects were detected for all three factors, in addition to sex. Some of these interaction effects can be understood as consequences of how the different factors influence energy usage in the context of a nonindependent relationship between age and size. Experimentally assessing interaction effects for all potential future sets of conditions is intractable. However, considering how different factors affect energy usage within the context of an insect’s evolved developmental program can provide insight into the causes of complex environmental effects on populations.

Foraging area fidelity for Kemp’s ridleys in the Gulf of Mexico by Donna J. Shaver, Kristen M. Hart, Ikuko Fujisaki, Cynthia Rubio, Autumn R. Sartain, Jaime Peña, Patrick M. Burchfield, Daniel Gomez Gamez and Jaime Ortiz
Summary: For many marine species, locations of key foraging areas are not well defined. We used satellite telemetry and switching state-space modeling (SSM) to identify distinct foraging areas used by Kemp’s ridley turtles (Lepidochelys kempii) tagged after nesting during 1998–2011 at Padre Island National Seashore, Texas, USA (PAIS; n = 22), and Rancho Nuevo, Tamaulipas, Mexico (RN; n = 9). Overall, turtles traveled a mean distance of 793.1 km (±347.8 SD) to foraging sites, where 24 of 31 turtles showed foraging area fidelity (FAF) over time (n = 22 in USA, n = 2 in Mexico). Multiple turtles foraged along their migratory route, prior to arrival at their “final” foraging sites. We identified new foraging “hotspots” where adult female Kemp’s ridley turtles spent 44% of their time during tracking (i.e., 2641/6009 tracking days in foraging mode). Nearshore Gulf of Mexico waters served as foraging habitat for all turtles tracked in this study; final foraging sites were located in water <68 m deep and a mean distance of 33.2 km (±25.3 SD) from the nearest mainland coast. Distance to release site, distance to mainland shore, annual mean sea surface temperature, bathymetry, and net primary production were significant predictors of sites where turtles spent large numbers of days in foraging mode. Spatial similarity of particular foraging sites selected by different turtles over the 13-year tracking period indicates that these areas represent critical foraging habitat, particularly in waters off Louisiana. Furthermore, the wide distribution of foraging sites indicates that a foraging corridor exists for Kemp’s ridleys in the Gulf. Our results highlight the need for further study of environmental and bathymetric components of foraging sites and prey resources contained therein, as well as international cooperation to protect essential at-sea foraging habitats for this imperiled species.

A new method for identifying rapid decline dynamics in wild vertebrate populations by Martina Di Fonzo, Ben Collen and Georgina M. Mace
Summary: Tracking trends in the abundance of wildlife populations is a sensitive method for assessing biodiversity change due to the short time-lag between human pressures and corresponding shifts in population trends. This study tests for proposed associations between different types of human pressures and wildlife population abundance decline-curves and introduces a method to distinguish decline trajectories from natural fluctuations in population time-series. First, we simulated typical mammalian population time-series under different human pressure types and intensities and identified significant distinctions in population dynamics. Based on the concavity of the smoothed population trend and the algebraic function which was the closest fit to the data, we determined those differences in decline dynamics that were consistently attributable to each pressure type. We examined the robustness of the attribution of pressure type to population decline dynamics under more realistic conditions by simulating populations under different levels of environmental stochasticity and time-series data quality. Finally, we applied our newly developed method to 124 wildlife population time-series and investigated how those threat types diagnosed by our method compare to the specific threatening processes reported for those populations. We show how wildlife population decline curves can be used to discern between broad categories of pressure or threat types, but do not work for detailed threat attributions. More usefully, we find that differences in population decline curves can reliably identify populations where pressure is increasing over time, even when data quality is poor, and propose this method as a cost-effective technique for prioritizing conservation actions between populations.

Estimating resource selection with count data by Ryan M. Nielson and Hall Sawyer
Summary: Resource selection functions (RSFs) are typically estimated by comparing covariates at a discrete set of “used” locations to those from an “available” set of locations. This RSF approach treats the response as binary and does not account for intensity of use among habitat units where locations were recorded. Advances in global positioning system (GPS) technology allow animal location data to be collected at fine spatiotemporal scales and have increased the size and correlation of data used in RSF analyses. We suggest that a more contemporary approach to analyzing such data is to model intensity of use, which can be estimated for one or more animals by relating the relative frequency of locations in a set of sampling units to the habitat characteristics of those units with count-based regression and, in particular, negative binomial (NB) regression. We demonstrate this NB RSF approach with location data collected from 10 GPS-collared Rocky Mountain elk (Cervus elaphus) in the Starkey Experimental Forest and Range enclosure. We discuss modeling assumptions and show how RSF estimation with NB regression can easily accommodate contemporary research needs, including: analysis of large GPS data sets, computational ease, accounting for among-animal variation, and interpretation of model covariates. We recommend the NB approach because of its conceptual and computational simplicity, and the fact that estimates of intensity of use are unbiased in the face of temporally correlated animal location data.

Harnad Response to HEFCE REF OA Policy Consultation

[Please post your own response to the HEFCE REF OA Policy Consultation HERE]

Executive Summary:

I. The HEFCE proposal to mandate immediate repository deposit of articles as a condition for eligibility for REF is excellent. If adopted and effectively implemented, it will serve as a model for OA mandates worldwide. It will also reinforce and complement the RCUK OA mandate, providing it with a uniform compliance monitoring and verification mechanism.

II. The immediate-deposit mandate should apply to the refereed, accepted version of peer-reviewed research articles (or refereed conference articles).

III. The deposit should be in the author’s institutional repository, immediately upon acceptance for publication. Acceptance date is determinate; publication date is variable and indeterminate and may lag acceptance by as much as two years.

IV. Access to the deposit should be immediately OA where possible, or, where deemed necessary, it can be made Closed Access if the publisher requires an OA embargo.

V. Repositories should implement the eprint request Button that allows individual users to request – and others to provide – one copy for research purposes with one click each.

VI. Once any allowable embargo period elapses, OA deposits can be accessed, read, searched, linked, downloaded, printed out, stored, and locally data-mined by individual users, as well as harvested and indexed for Boolean search by harvesters like Google. This makes license policy less urgent. Further re-use rights will come when OA mandates have made OA universal.

VII. What is crucial is that the deposit should be made at time of acceptance, time-stamped as such, with a copy of the acceptance letter to serve as the date marker.

VIII. Unlike articles, monographs are not all author give-aways, published solely for research impact rather than royalty income; and researchers need to have exclusive first data-mining rights on the data they collect. So monograph and data deposit should only be recommended for the time being, not mandated; access to the deposits can be set as Closed Access.

IX. The start date for 2020 REF eligibility should be immediately after the 2014 REF, not two years afterward.

X. The target should be 100% compliance. Exceptions can be dealt with on a case by case basis: It would be a great mistake to stipulate a percentage compliance figure instead.

Question 1: Do you agree that the criteria for open access are appropriate (subject to clarification on whether accessibility should follow immediately on acceptance or on publication)?


1.1 The HEFCE REF OA Policy should apply to the refereed, accepted version of peer-reviewed research articles or refereed conference articles.

1.2 It should be deposited in the author’s HEI repository, immediately upon acceptance for publication.

1.3 Access to the deposit should be immediately Open Access where possible, or, where deemed necessary, it can be made Closed Access if the publisher requires an OA embargo.

1.4 The crucial thing is that the deposit should be made at time of acceptance, time-stamped as such, with a copy of the acceptance letter to serve as the date marker.

The proposal is excellent. And if adopted and effectively implemented, it will serve as a model for OA policies worldwide.

Question 2a: Do you agree with the role outlined for institutional repositories, subject to further work on technical feasibility?


Fortunately, most UK HEIs already have institutional repositories (IRs) that are already configured, or readily configurable, to be compliant with HEFCE’s proposed policy for REF. They also already have a date-of-deposit tag. The dated acceptance letter can be uploaded as a supplementary document. The full text can be uploaded with access set as either Open Access or Closed Access (during an embargo, in which case the repositories also have a facilitated eprint request Button that can tide over the usage needs of UK and worldwide researchers for the deposited research during the allowable embargo).
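To make the access rule concrete, here is a minimal Python sketch of how a repository might decide what a visitor sees for a deposit. It is an illustration only, not how any particular IR software works: the `Deposit` record, the `add_months` helper and the `access_status` function are hypothetical names, and the sketch assumes that an embargo runs from the publication date and that a deposit with an embargo stays Closed Access (served via the request Button) until that date is known.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Deposit:
    """Hypothetical repository record for one article."""
    accepted: date                    # date on the acceptance letter
    embargo_months: int = 0           # 0 means no publisher OA embargo
    published: Optional[date] = None  # often unknown at deposit time


def add_months(d, months):
    """Return d shifted forward by whole months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def access_status(deposit, today):
    """Return "open" or "closed" (closed = eprint request Button only)."""
    if deposit.embargo_months == 0:
        return "open"
    if deposit.published is None:
        # Assumption: the embargo clock has not started yet.
        return "closed"
    embargo_end = add_months(deposit.published, deposit.embargo_months)
    return "open" if today >= embargo_end else "closed"


if __name__ == "__main__":
    d = Deposit(accepted=date(2013, 7, 1), embargo_months=12,
                published=date(2013, 9, 1))
    print(access_status(d, date(2014, 8, 31)))
    print(access_status(d, date(2014, 9, 1)))
```

Note that the deposit itself happens at acceptance in every case; only the access setting depends on the embargo.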

Many HEIs already use their IRs for submission to REF. The only change required by the HEFCE policy will be to require the deposit to be made immediately upon acceptance, rather than in batch, at the end of the year, or the end of the REF cycle. But this is the crucial core of the policy (and what will also make it an effective compliance mechanism for the RCUK Mandate as well).

The IR software is also easily configurable so researchers can keep updating their REF choices as they publish further articles, substituting a later one for an earlier one, if they judge it more suitable for REF. What is brilliant about the HEFCE proposal is that it ensures that all potentially suitable articles are deposited immediately, in order to ensure that they are eligible, even if they might later be superseded by a more suitable article.

Question 2b: Should the criteria require outputs to be made accessible through institutional repositories at the point of acceptance or the point of publication?

Deposit should definitely be required at point of acceptance rather than at point of publication, for the following reasons:

1. The point of acceptance has a definite date, with the editor’s dated letter of acceptance serving as the time marker.

2. The point of acceptance is also the natural point in the author’s workflow to do the deposit, again marked by a clear, unambiguous, dated event: the letter of acceptance for publication.

3. The date of publication is extremely vague and uncertain for journals.

4. The author does not know, at point of acceptance, when the article will be published.

5. The publication date of the article often has no calendar date.

6. The publication date usually does not correspond to the date at which an article actually appears: the article may appear earlier than the publication date, but more often it appears later, sometimes very much later.

7. The author often only finds out the date of publication after the fact – sometimes long after the fact.

8. All these possibilities are vague and uncertain, and the span of uncertainty can be from several months to two years or even more, which is even longer than most publishers’ OA embargo lengths.

9. Hence publication date is no basis for reliably and systematically complying with a HEFCE immediate-deposit requirement by the author, nor for monitoring and ensuring fulfilment by the author’s HEI or by HEFCE.

10. A further advantage of the acceptance date is that it is earlier, and hence allows more and earlier access and usage of the funded research.

IR deposit, at point of acceptance, is a simple, clear, natural, readily implementable and verifiable procedure for the author, the HEI and HEFCE, as well as an excellent compliance verification mechanism for the RCUK OA mandate. It is also an optimal model for the rest of the research world to adopt globally. With it, HEFCE will be performing a great service not only for UK and worldwide access to UK research output, but also for UK access to the rest of the world’s research output, with an exemplary policy, suited for use by all.

Question 3a: Do you agree that the proposed embargo periods should apply by REF main panel?


The length of the embargo is far less important than the requirement to deposit in the author’s institutional repository, and to deposit immediately upon acceptance.

Embargoes should be as short as possible, but they can, if desired, be allowed to vary by discipline. The IRs have the facilitated eprint request Button to help tide over the usage needs of UK and worldwide researchers for the deposited research during the allowable embargo.

Question 3b: Do you agree with the proposed requirements for appropriate licences?


It is not clear from the documentation what these license/re-use requirements will be. I strongly urge HEFCE not to get bogged down in them. We are talking here about UK research output. Once it is deposited and any embargo elapses, deposits will be OA and hence can be searched, linked, downloaded, printed, stored and text-mined by individual researchers and research groups. They will also be harvested and full-text inverted for Boolean search by Google and other harvesters. All of this comes with the territory in making them Open Access, and does not require any further license.

What would require further license permissions would be the right for databases to harvest, data-mine and republish the texts. Do not get bogged down in this now, if it creates any obstacles. We are only talking about UK research output: 6% of worldwide research output. If the rest of the world adopts the HEFCE immediate-deposit requirement too, OA will become 100% globally, and all re-use rights authors wish to provide and users need will follow soon after. But it would be a needless risk to let licensing requirements hold back adoption of, or compliance with, the HEFCE OA policy at this point. And there are discipline differences here too, potentially even bigger ones than differences in embargo length.

Go easy on licensing: It will all come after the HEFCE policy succeeds and is adopted worldwide. Don’t let licenses and re-use rights become a sticking point even before the HEFCE mandate is adopted. Access is infinitely more urgent than re-use/license needs; access needs are universal across disciplines; re-use/license needs are not. And access is a prerequisite for re-use rights, not vice versa. First things first.

Be flexible and pragmatic on licensing. Immediate IR deposit is the crucial thing.

Question 4: Do you agree that the criteria for open access should apply only to journal articles and conference proceedings for the post-2014 REF?


Refereed journal articles and refereed conference articles have been the primary targets of the worldwide Open Access movement from its inception, because they are the only form of research output that is, without exception, author giveaway content, written only for research uptake and impact, not for royalty revenue.

It is for this reason that all authors of articles will readily comply with an OA mandate: They all want their findings to be accessible to all their potential users worldwide, not just to those at institutions that can afford subscription access to the journal in which it happens to be published.

For researchers, loss of access to their work means loss of uptake, usage, applications and impact for their work. And the progress and funding of their research, as well as their careers, depend on the uptake, usage, applications and impact of their work.

Books. But all of this becomes much more complicated and exception-ridden when we move to monographs and books. Some books may fall in the same motivational framework, but many are written in hope of royalty income, so authors are not eager to give them away free for all. Also the economics of book publication entail a much bigger investment in each book by the publisher, who would likewise be reluctant to make the investment if the book was made available as an online give-away.

But there is a simple solution for books: Don’t require them to be deposited, just recommend it. And authors have the option of depositing books as Closed Access rather than Open Access, with no limit on how long they can embargo OA. (Meanwhile, if they wish, they can provide individual copies via the Button as and when they choose.)

Data. Data are complicated in another way. The problem is not potential royalties but first-exploitation rights. Researchers are not just data-gatherers. They gather data because they want to do something with it. To analyze and process it. They must be given a fair allotment of time to do this. Otherwise, if they must make their data open to all immediately, so anyone can analyze it, then they may as well not bother gathering it at all, and simply wait to analyze the data that others have taken the time and trouble to gather, and were then obliged to make open immediately.

The moral is that if article embargo lengths and licensing needs vary from discipline to discipline, then the fair length of the period of exclusive first-exploitation rights for data varies even more, not just from discipline to discipline, but from research project to research project.

And again the solution is to encourage (but not require) depositing the data and making it open as soon as possible. But no fixed embargo lengths.

A successful HEFCE immediate-deposit policy for refereed journal and conference articles will be an enormous positive contribution, and more than enough as a first step. All the rest (re-use rights, the gradual disappearance of article OA embargoes, and the extension of OA to other kinds of content) will follow as a natural matter of course. It should not be allowed to complicate what is otherwise an extremely timely and powerful means of making UK research articles OA.

Question 5: Do you agree that a notice period of two years from the date of the policy announcement is appropriate to allow for the publication cycle of journal articles and conference proceedings?


I think two years is needlessly and unjustifiably long.

We are still now in the phase of REF 2014. As soon as that ends, researchers and HEIs begin to prepare for REF 2020.

There is no reason at all why immediate-deposit upon acceptance for articles accepted for publication starting 2014 should not begin in 2014 rather than in 2016, as a condition for REF 2020 eligibility.

Not even those HEIs that don’t yet have IRs should be exceptions: Their authors can start depositing at once in OpenDepot, the UK back-up repository designed for that purpose.

That said, there is no reason why HEFCE cannot show some flexibility in the first two years, for inadvertent failures to comply immediately. But this potential flexibility should not be publicized, for it will only encourage lax compliance during the two designated years.

Question 6: Do you agree that criteria for open access should apply only to those outputs listing a UK HEI in the output’s ‘address’ field for the post-2014 REF?


Every UK researcher who is submitting an article for REF should have to deposit it in their IR immediately upon acceptance (except if they came to the institution after the acceptance date).

Better to be as inclusive as possible and handle would-be exceptions on a case by case basis rather than declare explicit exceptions.

Question 7: Which approach to allowing exceptions is preferable?

I support Option a: Full compliance; exceptions considered on case by case basis, first by the HEI, and if not resolved, by the REF panel.

There will be no basis for objections by publishers to immediate-deposit in Closed Access. The embargo length for Open Access is less important (because of the Button) and will not (and should not) constrain authors’ choice of journals.

External collaborators will certainly not object to Closed Access immediate-deposit, and are very unlikely to object to OA either, and certainly not to post-embargo OA.

Percentage compliance criteria would be a very bad idea, and would virtually be inviting institutions not to strive for 100%. Case-by-case handling is an infinitely better way to exercise flexibility.

Save the Date: Oct. 21 Open Access Week 2013 Kick Off Event at the World Bank and online: Redefining Impact

SPARC (The Scholarly Publishing and Academic Resources Coalition) and the World Bank have announced they will co-sponsor the kickoff event for Open Access Week 2013 on Monday October 21st in Washington, DC. The live event will take place at the state-of-the-art World Bank facilities and will host a Liveblog and Webcast for those who cannot attend in person. The event will also be recorded, and be available to the community for use during and after local Open Access events.

The event will begin at 3:00 p.m. EST and consist of a 90-minute panel discussion with Open Access experts from a variety of stakeholder groups as well as representatives from the World Bank and SPARC. As this year’s theme is “Open Access: Redefining Impact,” the panelists will discuss Article Level Metrics and changing the way scholarly communication is measured. Speakers will be announced in early September. 

During the event, the winners of the Accelerating Science Award Program (ASAP) will be announced. The ASAP Program, sponsored by 27 global organizations including Google, PLOS, and the Wellcome Trust, recognizes those who have built upon Open Access scientific research for new innovations shaping our society. For more information on the ASAP Program, visit


Making images Open can and should be routine

One of the many serious problems in re-use of scientific data is that it often occurs in diagrams. Here’s a simple example taken from (BMC is an Open Access, CC-BY publisher so there’s no problem with anything in this blog post …)

These are diagrams which simply record x-y data pairs and an estimate of the relation between them (lines and curves). A reader might very well want to extract the data and reanalyse them. Or simply post them, perhaps to applaud them or to criticize them. After all that’s science.

But images are copyright, aren’t they? A list of numbers is data, but an image? Might I get sued if I try to re-use the diagram without permission? Well, that’s what happened to Shelley Batts, when she did just that – she got a legal letter from Wiley. After huge blogosphere reaction, Wiley retracted the threat – to Shelley – but they have never said they won’t do it to someone else.

And there’s good reason not to. “The publishers own the copyright”. And they can make money by reselling images. Many publishers will charge to include an image in a publication (over 50 USD – see Springergate, where Springer claimed copyright on all images in their journals). The immediate effect of this is that authors can’t afford to re-use images and so science suffers drastically.

Note, of course, that the publisher normally makes ZERO contribution to the creation of the image. (They may reject it for technical commercial reasons – too “difficult”/ doesn’t fit their publishing workflow). But they “own” it.

Do scientists want this? If you are an author do you think:

“I want the publisher to make money from my images and prevent other people reusing them for legitimate purposes”.

If you do, stop reading…

But if you would like your images to be free for reuse, here’s a very simple thing. I built it in a 45-min train journey to hack4ac – it can be improved easily. You just run a 1-page Java program, that merges your image with a small icon indicating the image is free to re-use. It takes less than a second. It could be run as SoftwareAsAService (e.g. hosted in the cloud). Here are two examples of adding a simple tag (the first is deliberately large so you can see it).

Here the legend includes the authorship (this is trivial to customize)

The importance is that it is immediately clear that the image is free for re-use. (We could use CC-BY, but CC0 is probably more suitable). Note that nothing the publisher does, nothing you sign, takes away this right. The image carries its own permanent copyright.
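For anyone who wants to try the idea, here is a minimal sketch of what such a merge step might look like, using only Java’s standard `java.awt` and `javax.imageio` classes. It is not the hack4ac program itself: the class name `ReuseStamp`, the `merge` method, the margin parameter and the bottom-right placement are all illustrative assumptions.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical sketch: ReuseStamp and merge() are illustrative names,
// not the program described in the post.
class ReuseStamp {

    /**
     * Returns a copy of the figure with the licence icon drawn in the
     * bottom-right corner, inset by the given margin in pixels.
     * The original figure is left untouched.
     */
    static BufferedImage merge(BufferedImage figure, BufferedImage icon, int margin) {
        BufferedImage out = new BufferedImage(
                figure.getWidth(), figure.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        try {
            // Copy the figure first, then paint the icon on top of it.
            g.drawImage(figure, 0, 0, null);
            int x = Math.max(0, figure.getWidth() - icon.getWidth() - margin);
            int y = Math.max(0, figure.getHeight() - icon.getHeight() - margin);
            g.drawImage(icon, x, y, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java ReuseStamp.java <figure.png> <icon.png> <out.png>");
            return;
        }
        BufferedImage figure = ImageIO.read(new File(args[0]));
        BufferedImage icon = ImageIO.read(new File(args[1]));
        // ImageIO.read returns null when no reader understands the file.
        if (figure == null || icon == null) {
            throw new IOException("could not decode one of the input images");
        }
        ImageIO.write(merge(figure, icon, 4), "png", new File(args[2]));
    }
}
```

The icon file (for example a CC0 or CC-BY badge) is supplied by the user; the sketch only pastes it in, so a legend with the author’s name would have to be baked into the icon or drawn as an extra step.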

And moreover it’s trivially obvious to all readers. It spreads the word.

It will save millions (literally) in saved time and effortless re-use.

A simple and effective way for it to be encouraged and implemented would be by all image-producing software (e.g. ImageJ, phylogenetic tree s/w) to offer this as default. If you WANT to give the publisher exclusive rights to resell and restrict your work you can switch the default off.

And, of course, some legacy publishers might even welcome it. (Stop fantasizing, PMR!)






Hack4ac, Content-mining and open-science in Oxford

Jenny Molloy has invited me to introduce a session on Content Mining in Oxford this evening (monthly meetings held at Oxford e-Research Centre, 7 Keble Road, 19:00-20:30). It will be very informal – anyone can come and play.

The basic theme will be that:

  • Content mining is now routinely possible
  • YOU can do it
  • Your involvement will be a massive help

The idea is to build a community and collect/create community tools. There’s been a massive step forward with Jailbreaking the PDF and I’ve blogged some of this. We also had a tremendous hackday in London (see Ross Mounce’s blog). This was run by new-generation publishers (Ian Mulvany (PLoS), Jason Hoyt (PeerJ)) and looked at how hacking can change scholarly publishing. We came up with several ideas (I’ll blog my own soon) and Ross proposed figures2Data – what can we extract from the *figures* in the literature (not just the text).

We got a critical mass of 4-5 people and a great reservoir of knowledge and ideas. We made fantastic progress. This is a very difficult subject and I assumed we wouldn’t manage much. However we found communal resources showing it can be done relatively simply using classical Computer Vision (Image analysis) such as Hough transforms and character recognition. I’d known that I’d have to hack the latter some time and was dreading it, but we found that Tesseract would provide a huge amount out-of-the-box. This is a great example of stepping back from the problem and letting in fresh light. I’m now pretty confident that we can manage to hack a wide range of scientific diagrams. Here’s a diagram

And here is what Thomas managed to extract (in an hour or two):


The OpenCV suite is tremendous and so powerful that navigation is a problem but Thomas has found all the key components. I am confident that Tesseract will recognize the characters and so we are well on the way to extracting all the information from this.
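For readers who have not met a Hough transform, here is a minimal from-scratch sketch of the line-finding idea in plain Java. It is not the hackday code and it does not use OpenCV: the class name `HoughSketch`, the one-degree angle step and the boolean-grid input are all assumptions made for illustration. Every “ink” pixel votes for all the lines that could pass through it, and the line with the most votes wins.

```java
// Hypothetical sketch of a classical Hough line transform; the names and
// the 1-degree angular resolution are illustrative assumptions.
class HoughSketch {

    /**
     * Votes every set pixel into a (theta, rho) accumulator, where
     * rho = x*cos(theta) + y*sin(theta), and returns {thetaDegrees, rho}
     * for the accumulator cell that collected the most votes.
     * pixels[y][x] is true where the diagram has ink.
     */
    static int[] strongestLine(boolean[][] pixels) {
        int height = pixels.length;
        int width = pixels[0].length;
        // rho can be negative, so shift indices by maxRho.
        int maxRho = (int) Math.ceil(Math.hypot(width, height));
        int[][] votes = new int[180][2 * maxRho + 1];

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!pixels[y][x]) {
                    continue;
                }
                for (int t = 0; t < 180; t++) {
                    double theta = Math.toRadians(t);
                    int rho = (int) Math.round(x * Math.cos(theta) + y * Math.sin(theta));
                    votes[t][rho + maxRho]++;
                }
            }
        }

        // Pick the first cell with the highest vote count.
        int bestTheta = 0;
        int bestRhoIndex = 0;
        int bestVotes = -1;
        for (int t = 0; t < 180; t++) {
            for (int r = 0; r < votes[t].length; r++) {
                if (votes[t][r] > bestVotes) {
                    bestVotes = votes[t][r];
                    bestTheta = t;
                    bestRhoIndex = r;
                }
            }
        }
        return new int[] {bestTheta, bestRhoIndex - maxRho};
    }

    public static void main(String[] args) {
        // Synthetic "axis": a horizontal line across a small blank image.
        boolean[][] image = new boolean[20][100];
        java.util.Arrays.fill(image[5], true);
        int[] line = strongestLine(image);
        System.out.println("theta=" + line[0] + " degrees, rho=" + line[1]);
    }
}
```

A real diagram would first need thresholding into ink and background, and would have many lines rather than one, which is the part that a library such as OpenCV handles for you.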

As readers know, Ross and I are working on phylogenetic trees (hypothesising the formation of species). The great thing about this subject is that anyone interested in science should be able to understand the basic concepts. It’s particularly useful for conservationists, biodiversity etc. So we are seeing what can be extracted from papers on this subject. I’m told there may be some people tonight specifically interested in this. This is a great area in which to start practical open-science.

The main problem, alas, is that most publishers have put in place legal restrictions to stop us doing this. There is no scientific reason for this. It’s to “protect their commercial interests”. So Jenny, Diane Cabell and I reviewed this for the Open Fellowship Academy p57: There we put forward a manifesto for content-mining under the mantra

The right to read is the right to mine.

Simply – if you have paid to be able to read science your machines should also have the right to read it.

But no, the publishers are fighting this and trying to licence the “privilege”. It’s being debated in Europe (Licences 4 Europe) and Ross presented a superb set of slides, which will give you a good indication of the technology and the restrictive practices.

So I’m hoping that tonight will see an expansion of our content-mining community!


A Bechdel test for scientific workshops

After attending two recent scientific conferences, one which was gender balanced, and one which was so gender-imbalanced that it engendered snarky out-of-band twitter comments, it struck me that we might need a Bechdel Test for scientific workshops.  The Bechdel test is a simple test for movies.  To pass the test, a movie has to have:

  1. at least two [named] women in it,
  2. who talk to each other,
  3. about something besides a man.

Seems simple, right?  You’d be amazed at just how few popular movies pass the test, including some set in universes that were originally designed for equality. (I’m talking about you, Star Trek reboot.)

Here’s an analogous test for scientific workshops or conference symposia.  Does the workshop have:

  1. at least two female invited speakers,
  2. who are asked questions by female audience members,
  3. about their research.

Again, this seems simple, right?  But you’d be shocked how few scientific conference symposia or workshops can live up to this standard.  I suspect this depends strongly on specific research fields. 

Rigoberto Hernandez has been talking about advancing science through diversity for quite a while.  I finally got to hear him speak about the OXIDE project on this latest trip, and he’s got a lot of great things to say about how diversity can strengthen science. I think one great way to help is to point out the good conferences we attend which live up to this standard.

Rigoberto also happened to be one of the organizers of the gender-balanced conference, which was also one of the best meetings I’ve ever attended.

Potential CHORUS catastrophe for OA: How to fend it off

Richard Poynder has elicited a splendid summary of OA by the person who has done more to bring about OA than anyone else on the planet: Peter Suber

Here are a few supplements that I know Peter will agree with:

1. Potential CHORUS Catastrophe for OA: Peter’s summary of OA setbacks mentions only Finch. Finch was indeed a fiasco, with the publishing lobby convincing the UK to mandate, pay for, and prefer Gold OA (including hybrid Gold OA), and to downgrade and ignore Green OA.

Peter notes the damage that the publisher lobby has successfully inflicted on worldwide (but especially UK) OA progress with the Finch/RCUK policy, but I’m sure he will agree that if the Trojan Horse of CHORUS were to be accepted by the US federal government and its funding agencies, the damage would be even greater and longer lasting:

CHORUS is an attempt by the publishing lobby to take compliance with Green OA mandates out of the hands of the fundees whom OA mandates are designed to require to provide OA, and instead transfer control over the execution, the locus and the timetable for mandate compliance into the hands of publishers.

Adopting CHORUS would mean that President Obama’s OSTP directive — requiring that federally funded research must be made freely accessible online within 12 months of publication — would instead ensure that it was made freely accessible after 12 months, and not one minute earlier.

And CHORUS would ensure also that the authors whom all Green OA mandates worldwide are designed to require to provide OA — because they want OA yet dare not provide it without a mandate from their institutions or funders, for fear of their publishers — would no longer be affected by any mandate:

With CHORUS, publishers would have succeeded in locking in 12-month-embargoed Delayed Access instead of immediate Green OA for years to come, in the US and, inevitably, also worldwide.

So, as I am sure Peter will agree, CHORUS must be rejected at all costs, just as the previous Trojan Horses of the publishing lobby — PRISM and the Research Works Act — were rejected. It’s bad enough that Finch slipped through.

2. Hybrid Gold OA has a few additional negative features, apart from the ones Peter already mentions:

Even if the publisher gives subscribing institutions a rebate to offset double-dipping, Hybrid Gold locks in current total publisher revenue — from institutional subscription fees plus author hybrid Gold OA fees — come what may. Hybrid Gold immunizes publishers from any pressure to cut costs by phasing out obsolete products and services in the online era.

Only globally mandated Green OA self-archiving in repositories by authors can force publishers to downsize to the post-Green essentials alone.

And if a hybrid Gold journal also imposes an embargo on Green, that is tantamount to legally sanctioned extortion, even without double-dipping: “If you want to provide immediate OA, you must pay me even more than I am already being paid by your institution for subscriptions — and your institution only gets back a tiny fraction of the rebate from your surcharge.”

(This is also the option to which CHORUS, in tandem with Finch, would hold immediate OA hostage for many years more. Since immediate OA is optimal for research, hence inevitable, publishers, if funders take in their Trojan Horse, will have succeeded in delaying OA for as long as they possibly could, in defence of their current revenue streams. This is also the publishers’ self-serving scenario in which COPE institutions would unwittingly collude, if they funded Gold OA without first mandating immediate-deposit Green OA.)

3. Pre-Green Fools Gold vs. Post-Green Fair Gold: The only thing that can bring the cost of peer-reviewed journal publishing down to a fair, affordable, sustainable price is globally mandated Green OA. Only Green OA will allow institutions to cancel their journal subscriptions, thereby forcing journals to adapt naturally to the online era by cutting obsolete costs, downsizing and converting to Fair Gold. Once Green OA mandates fill them, it is the global network of Green OA repositories that will allow publishers to phase out all the products and services associated with access-provision and archiving. CHORUS and Finch are designed to allow publishers to keep co-bundling (and charging) for their obsolete products and services as long as possible.

Stevan Harnad

The Poop Will Tell Us: Do Elephants and Rhinos Compete for Food?


A recent study of the two animals in Addo Elephant National Park, called “Shift in Black Rhinoceros Diet in the Presence of Elephant: Evidence for Competition?” suggests the answer is yes.

Scientists interested in helping endangered species like the African elephant and the black rhinoceros would like to know whether these animals compete for resources in the wild, as such food contests could impact the population and health of both species. Unfortunately, our favorite rough-skinned big guys have IUCN statuses of vulnerable and critically endangered, respectively, so competition for food between them may present a bit of an ecological puzzle.

To gain evidence of food competition, researchers from Australia and the Centre for African Conservation Ecology took a close look at elephant and rhino poop (no, seriously) across different seasons to identify the types of plants each herbivore was eating. Poop collecting was performed at times of the year when rhinos and elephants ate in the same region, and then again when only rhinos grazed in the area (in the absence of elephants). Variations in the plant types found in the feces were counted as indicators of dietary differences.

While it’s been shown that the presence of elephants can help some herbivores with habitat and food access, limited studies have been conducted on how the elephants’ foraging behavior may affect that of other megaherbivores specifically. The authors state that there is clear evidence that elephants hog and monopolize food, a behavior that they suspected would affect the diets of other large herbivores. Indeed, the results of this study revealed that resource use was clearly separated by season, and rhinos munched on different plants depending on whether or not the elephants were present. Without elephants around, rhinos ate more diverse plants, like woody shrubs and succulents, but in their presence, rhinos restrained themselves and consumed more grasses. This may not seem like a big deal, but rhinos are known to be strict browsers (read: picky about their food choices), so this dietary difference discovery was surprising to researchers.


The authors go on to suggest that elephants living at high population densities in certain regions may significantly affect the foraging opportunities of other grazers, and these close living quarters may have long-term effects on the overall fitness of the other animals. These behaviors may have particularly important consequences in smaller or fenced-in wildlife parks, where populations tend to grow at the same time that food availability goes down.

Citation: Landman M, Schoeman DS, Kerley GIH (2013) Shift in Black Rhinoceros Diet in the Presence of Elephant: Evidence for Competition? PLoS ONE 8(7): e69771. doi:10.1371/journal.pone.0069771

Image Credits: African wildlife photo by Chris Eason (Mister E); plot from article