arXiv business model development

arXiv, the physics preprint server, is in the process of moving to a new business model, one in which voluntary contributions are based on usage. What follows is a suggestion for a slight tweak that I think will make the transition smoother; but don’t wait, please sign up right away, as the financial aspects likely need no change.

The cost of maintaining arXiv, including further developing the service to fully meet the needs of the physics community, is less than $7 per submission or 1.3 cents per download. The current approach is to ask heavy users (readers) of the service to contribute to the funding of arXiv, and 22 institutions have made this commitment so far.

My suggestion, one which would not be that different financially but might be a model that is easier to sustain into the future, is to follow a similar institution-based approach, but base the charges on approximate submissions rather than usage. Regular contributors could pay a flat fee, perhaps calculated to reflect an average of about $10 per submission (to allow for subsidy for authors from developing countries and/or to build an endowment for arXiv’s future). Institutions that choose not to contribute on a regular basis could then pay one-off fees reflecting the higher administration cost associated with the one-off payment system, e.g. perhaps about $50 per submission.
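As a rough illustration of the arithmetic behind this suggestion, here is a minimal sketch in Python. The $7 cost, $10 flat rate and $50 one-off rate are the approximate figures mentioned above; the function names and the example institution with 200 submissions a year are hypothetical, chosen only to show how the numbers might combine.

```python
# Approximate figures taken from the post; all are rough estimates.
COST_PER_SUBMISSION = 7.00   # stated upper bound on arXiv's cost per submission
FLAT_RATE = 10.00            # suggested average rate for regular contributors
ONE_OFF_RATE = 50.00         # suggested rate for one-off payments


def annual_flat_fee(expected_submissions: int, rate: float = FLAT_RATE) -> float:
    """Flat annual fee for a regular contributor, based on approximate submissions."""
    return expected_submissions * rate


def one_off_fees(submissions: int, rate: float = ONE_OFF_RATE) -> float:
    """Total paid by an institution that pays per submission instead."""
    return submissions * rate


def surplus(expected_submissions: int,
            rate: float = FLAT_RATE,
            cost: float = COST_PER_SUBMISSION) -> float:
    """Amount left over for subsidies or an endowment after covering costs."""
    return expected_submissions * (rate - cost)


# Hypothetical institution submitting about 200 papers a year.
print(annual_flat_fee(200))  # flat fee as a regular contributor
print(one_off_fees(200))     # same volume paid one submission at a time
print(surplus(200))          # margin available for subsidy or endowment
```

Under these assumed figures, the gap between the flat fee and the one-off total is what would give an institution an incentive to contribute on a regular basis.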

To transition from the current plan, likely all that is needed is a little wordsmithing in the current document, i.e. change: “The top 100 institutions based on the previous year’s download activity” to “The top 100 institutions based on the previous year’s submission activity”. The list of institutions in each category is likely to be about the same, as it is the active physics researchers who are likely to be doing the most reading and the most contributing.

This approach is more likely to succeed in the longer term, as this way the institution is clearly purchasing a needed service; and, if necessary, the cost of paying per submission is easily within the reach of the individual author.

Noting a bit of inspiration from Joe Esposito on Scholarly Kitchen.

Note also that $7 per submission appears to be covering preservation costs rather than this year’s submissions. Clarification on this point from arXiv would be appreciated.

On Open Access: “Gratis” and “Libre”

Matthew Cockerill [MC] (BioMed Central) wrote:

MC: “Agreement on terminology can really only ever be pragmatic”


MC: “Many of us use “open access” to mean what Stevan refers to as ‘libre open access’, and have distinguished this from “free access” which Stevan refers to as ‘Gratis open access’.”

This is alas all true too.

It is also true that “many of us” (not me!) use “open access” to mean “gold open access” (publishing) only.

And the progress of open access is likewise much the worse off, pragmatically, because of this other widespread conflation (sometimes willful, mostly just ignorant) too.

[(1) Self-Archiving FAQ #31 “Waiting for Gold” (2) Stevan Harnad, “Opening Access by Overcoming Zeno’s Paralysis,” #15 and (3) Peter Suber “Field Guide to misunderstandings about open access” Misunderstanding #1: “All OA is gold OA” and its flip-side: Misunderstanding #22 “All OA is gratis OA.” “The Budapest, Bethesda, and Berlin definitions of OA all describe forms of libre OA. However, there are good reasons to recognize gratis OA as a kind of OA… The current misunderstanding accepts that gratis OA is a kind of OA, but goes one step too far and assumes that gratis OA is the only kind of OA…” ]

It is also true that what Stevan (and Peter, let’s not forget) — co-coiners of the original (nonbinding, nonlegal) BOAI definition of “open access” — refer to as “libre open access” was coined specifically to distinguish it from “gratis open access,” which means free online access (whereas libre OA means free online access plus some re-use rights, not all yet specified).

But from the very outset, there has been some (understandable) motivation on the part of gold open access publishers to co-opt the term “open access” to fit their product, and only their product. See the long, sad “Free Access vs. Open Access” debate, started by BioMed Central’s first editorial, “Free Access is not Open Access,” in “Open Access Now” on 28 July 2003.

What is one to say, except that some of it sounds a lot like a battle over a trademark — which you need, if you are conducting a trade…

But not just a battle over trademark. Also ideology vs. pragmatics. (I don’t, by the way, think Matt’s motivation, in particular, is primarily commercial: I am certain that he believes, very sincerely, in (libre) OA.)

My own motivation is exclusively to get all of the refereed literature freely accessible online, at long last, as soon as possible (it’s already more than a decade and a half overdue), in whatever way works, is within reach, works surely, and works fast.

Hence the only thing at stake for me when it comes to the trademark “OA” is the fate of free online access itself, which will certainly come much later if — now that the term “OA” and the “OA Movement” are launched in public consciousness — it is now declared, for either commercial or ideological reasons, that OA mandates are no longer OA mandates but “FA” mandates, the OA impact advantage is no longer the OA advantage but the FA advantage, and those who have been fighting for OA since long before it got a name have not, in fact, been fighting for OA but “FA.” Moreover, it means that precious little of the (already precious little) OA we have to date (about 15% green plus about 15% gold) is in reality OA at all: It’s just “FA.”

I find all this doubly foolish, not only because (1) gratis OA (free online access) is a necessary condition, though not a sufficient condition, for libre OA (free online access plus some re-use rights, not all yet specified) and will (as is evident to anyone who gives it a few minutes of serious thought) almost certainly lead to libre OA soon after it becomes universal (if and when we do what we need to do to make gratis OA universal) but also because (2) over-reaching and insisting on libre OA first, and deprecating gratis OA as not really being OA at all, merely FA, is merely serving to delay the onset of libre OA too (just as insisting that only Gold OA publishing is OA is delaying the era of Gold OA publishing).

So, yes, as Matt says, use of the terminology is just a matter of pragmatics, but not linguistic pragmatics: strategic pragmatics. And needlessly, counterproductively over-reaching for libre OA (or Gold OA) now, when Green gratis OA is fully within our grasp is just about as unpragmatic and short-sighted as one can possibly be, in the short (but already far too long) history of OA. And the attempt to co-opt the term exclusively is simply making the “best” the enemy of the better.

(I can already sense that there are those who are straining to chime in that their insistence on libre OA, too, is driven neither by commercial considerations nor ideology but pragmatics: they need the re-use rights, now, and their research progress is hurting for the lack of them. Let me suggest that if you look more closely at this “pragmatic” case for libre OA it almost always turns out to be about open data, not OA (which is about journal articles). Yet those who are in a hurry for open data are apparently happy to conflate their case with OA’s, even if it’s at the expense of again gratuitously handicapping our reach — for the green gratis OA to journal articles that is within our grasp — with the independent extra burden of data re-use rights. And what is invariably forgotten in all this special-case over-reaching is the completely correctable general case that has been staring us in the face, uncorrected, lo these 15+ years, which is that every day countless would-be users are being denied access and usage for the 85% of journal articles that are accessible only to those with subscription access. That is the paramount problem that the online era has empowered us to solve, and instead we are fussing about extra perks that will surely come soon after we solve it, but not if we continue to make those extra perks a precondition for a solution — or even for naming the problem!)

MC: “I believe the reason that many, including BioMed Central, reserve the term open access for the ‘libre’ sense is not simply the historical precedent of BOAI and Bethesda, but also the wider related usage of the term open (as in open source, open courseware, open wetware, open government). In all cases, these imply the availability, reusability and redistributability of the material, not the fact that it doesn’t cost anything.”

And in all cases, as soon as one takes the trouble of looking closely at the apparent similarities, the profound differences reveal that this conflation of senses is specious and superficial: article texts are not program code that needs to be re-used and re-written; article texts are to be read and then the ideas and findings in them are to be re-used in new research and writings. Same for the disanalogy with open data, which of course includes “open wetware.” Inasmuch as open courseware is just text, free online access for all is all that’s needed. (Put the URL in the coursepack instead of the text.) Inasmuch as courseware is programs, it’s the same disanalogy between text code and software code. Ditto for “open multimedia” and rip/remix/mashup: not for scholarly/scientific text — though fine for the scholarly/scientific ideas and findings described in the text (modulo plagiarism). And “open government” is about combatting secrecy, which is moot for published scientific research (whether or not access carries a price tag).

On the Deep Disanalogy Between Text and Software and Between Text and Data Insofar as Free/Open Access is Concerned

Making Ends Meet in the Creative Commons

In other words, I don’t know about Peter, but it’s certainly true that for my own part it was not because of all of these superficial and in the end specious commonalities supposedly shared by this panoply of “open” X’s that I favored the term “open access” as the descriptor for what the online era had made possible for refereed scholarly/scientific journal articles.

On the contrary. If I had known in 2002 what confusion and conflation it would make “OA” heir to, I would have avoided the term “open” like the plague. (There was one commonality, though, that both Peter and I did intentionally try to capitalize on in our choice of that term: the “open” in the “open archives initiative” protocol for metadata harvesting. That harks back to an even earlier decision point, this time in an email exchange with Herb van de Sompel in 1999 about how to rename the “Universal Preprint Service” and its “Santa Fe Convention,” which had been the original names for the OAI and OAI protocol. It was Herb who opted for “open” rather than “free” (which I seem to recall that I preferred), so OAI became OAI, and OA/BOAI followed soon afterward (though OAI’s “archive” was soon jettisoned, again for no good reason whatsoever, just arbitrariness and pedantry, in favor of “repository”)… Lexicalization is notoriously capricious, and unintended metaphors and other affinities can come back to haunt you…)

MC: “On which basis, one might refer to Gratis open access, as being ‘non-open open access’.  Which is why it seems to me a problematic form of terminology, however well-intentioned.”

On the contrary, Matt. You are being so seduced by your incoming biases here that you don’t realize that you are making them into self-fulfilling prophecies: Gratis OA is only “non-OA OA” to those who wish to argue that free online access is not open access!

Let me close with an abstract of the keynote I will be giving at the e-Democracy Conference in Austria in May.

In that talk I also will be discussing the commonalities and differences among the various “open” movements, but note only that “The problem [of Green Gratis OA] is not particularly an instance of “eDemocracy” one way or the other…”:


ABSTRACT: The primary target of the worldwide Open Access initiative is the 2.5 million articles published every year in the planet’s 25,000 peer-reviewed research journals in all scholarly and scientific fields. Without exception, every one of these articles is an author give-away, written solely to be used, applied and built upon by other researchers, not for royalty income. The optimal and inevitable solution for this give-away research is that it should be made freely accessible to all its would-be users online and not only to those whose institutions can afford subscription access to the journal in which it happens to be published. Yet this optimal and inevitable solution, already within reach of the global research community for at least two decades now, has been taking a remarkably long time to be grasped. The problem is not particularly an instance of “eDemocracy” one way or the other; it is an instance of inaction because of widespread misconceptions (reminiscent of Zeno’s Paradox). The solution is for the world’s research institutions and funders to extend their existing “publish or perish” mandates to require their employees and fundees to maximize access and impact for the research they are employed and funded to conduct by depositing it in their Open Access Institutional Repositories immediately upon acceptance for publication to make it freely accessible to all its potential users webwide. Open Access metrics can then be used to measure and reward research progress and impact.

Stevan Harnad
American Scientist Open Access Forum