OASIS Topics

Testing Jan Velterop’s Hunch About Green and Gold Open Access

Mandated and Unmandated Open Access:
Comparing Green and Gold

Yassine Gargouri
Stevan Harnad

Cognition/Communication Laboratory
Cognitive Sciences Institute
Université du Québec à Montréal

SUMMARY: Velterop (2010) has conjectured that more articles are being made Open Access (OA) by publishing them in an OA journal (“Gold OA”) than by publishing them in a conventional journal and self-archiving them (“Green OA”), even where self-archiving is mandatory. Of our sample of 11,801 articles published 2002-2008 by authors at four institutions that mandate self-archiving, 65.6% were self-archived, as required (63.2% Green only, 2.4% both Green and Gold). For 42,395 keyword-matched, non-mandated control articles, the percentage OA was 21.9% Green and 1.5% Gold. Velterop’s conjecture is the wrongest of all precisely where OA is mandated.

Jan Velterop has posted his hunch that, of the overall percentage of articles published annually that are OA today, most will prove to be Gold OA journal articles, once the articles that also happen to be published in Gold OA journals are separated from those classified as self-archived Green OA:

    JV: “Is anyone… aware of credible research that shows how many articles (in the last 5 years, say), outside physics and the Arxiv preprint servers, have been made available with OA exclusively via ‘green’ archiving in repositories, and how many were made available with OA directly (‘gold’) by the publishers (author-side paid or not)?
    “The ‘gold’ OA ones may of course also be available in repositories, but shouldn’t be counted for this purpose, as their OA status is not due to them being ‘green’ OA.
    “It is my hunch (to be verified or falsified) that publishers (the ‘gold’ road) have actually done more to bring OA about than repositories, even where mandated (the ‘green’ road).”

— J. Velterop, American Scientist Open Access Forum, 25 August 2010

The results turn out to go strongly contrary to Velterop’s hunch.

Our ongoing project is comparing citation counts for mandated Green OA articles with those for non-mandated Green OA articles, all published in journals indexed by the Thomson Reuters ISI database (science and social-science/humanities). (We use only the ISI-indexed sample because the citation counts for our comparisons between OA and non-OA are all derived from ISI.)

The four mandated institutions were Southampton University (ECS), Minho, Queensland University of Technology and CERN.

Out of our total set of 11,801 mandated, self-archived OA articles, we first set aside all those (279) articles that had been published in Gold OA journals (i.e., the journals in the DOAJ-indexed subset of ISI-indexed journals) because we were primarily interested in testing the OA citation advantage, which is based on comparing the citation counts of OA articles versus non-OA articles published in the same journal and year. (This can only be done in non-OA journals, because OA journals have no non-OA articles.) This left only the Green OA articles published in non-Gold journals.

We then extracted, as control articles for each article in this purely Green OA subset, 10 keyword-matched articles published in the same journal and year. The total number of articles in this control sample for the years 2002-2008 was 41,755. (Our preprint for PLoS, Gargouri et al. 2010, covers a somewhat smaller, earlier period: 2002-2006, with 20,982 control articles.)

Next we used a robot to check what percentage of these unmandated control articles was OA (freely accessible on the web).

Of our total set of 11,801 mandated, self-archived articles, 279 articles (2.4%) had been published in the 63 Gold OA journals (2.6%) among the 2,391 ISI-indexed journals in which the authors from our four mandated institutions had published in 2002-2008. Both these estimates of percent Gold OA are about half as big as the total 5% proportion for Gold OA journals among all ISI-indexed journals (active in the past 10 years). To be conservative, we can use the higher figure of 5% as a first estimate of the Gold OA contribution to total OA among all ISI-indexed journals.

Now, in our sample, we find that out of the total number of articles published in ISI-indexed journals by authors from our four mandated institutions between 2002 and 2008 (11,801 articles), about 65.6% (7,736 articles) had indeed been made Green OA through self-archiving by their authors, as mandated (7,457 or 63.2% Green only, and 279 or 2.4% both Green and Gold).
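The percentages quoted for the mandated sample can be rechecked from the raw counts. The following is only a minimal sketch of that arithmetic: the counts are the ones reported in the text, while the variable names and the `pct` helper are illustrative assumptions, not part of the original analysis.

```python
# Counts as reported in the text (no new data).
total_mandated = 11_801   # articles by authors at the four mandated institutions
green_only = 7_457        # self-archived, published in non-Gold journals
green_and_gold = 279      # self-archived and published in Gold OA journals
gold_journals = 63        # Gold OA journals among those the authors published in
all_journals = 2_391      # all ISI-indexed journals the authors published in


def pct(part, whole):
    """Hypothetical helper: percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


self_archived = green_only + green_and_gold

shares = {
    "self_archived": pct(self_archived, total_mandated),
    "green_only": pct(green_only, total_mandated),
    "green_and_gold": pct(green_and_gold, total_mandated),
    "gold_journal_share": pct(gold_journals, all_journals),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

The two Green components add up to the 7,736 self-archived articles, and each share is taken against the same 11,801-article total, which is why the Green-only and Green-and-Gold percentages sum to the overall figure.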

In contrast, for our 42,395 keyword-matched, non-mandated control articles, the percentage OA was 23.4% (21.9% Green and 1.5% Gold).

Björk et al.’s (2010) corresponding finding [Table 3] for their ISI sample (1,282 articles for 2008 alone, calculated in 2009) was 20.6% total OA (14% Green plus 6.6% Gold). (For an extended sample that also included non-ISI journals it was 11.9% Green plus 8.5% Gold.)

The variance is probably due to different discipline blends in the samples (see Björk et al.’s Figure 4, where Gold exceeds Green in bio-medicine). But whichever overall results one chooses, whether our 21.9% Green and 1.5% Gold, Björk et al.’s 14% Green and 6.6% Gold, or even their extended 11.9% Green and 8.5% Gold, the figures fail to bear out Velterop’s hunch that:

“publishers (the ‘gold’ road) have actually done more to bring OA about than repositories, even where mandated (the ‘green’ road).”

Moreover (and this is really the most important point of all), Velterop’s hunch is the wrongest of all precisely where OA is mandated, for there the percent Green is over 60%, and headed toward 100%. That is the real power of Green OA mandates.
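The comparison can be laid out side by side. This is a rough sketch only: the percentages are those quoted in the text (ours and Björk et al.’s), and the sample labels and variable names are illustrative assumptions. For the mandated sample, the 2.4% Gold articles were also self-archived, so they are listed under Gold here only to give Velterop’s hunch its best case.

```python
# (Green %, Gold %) per sample, as quoted in the text.
samples = {
    "mandated (this study)": (63.2, 2.4),
    "non-mandated controls (this study)": (21.9, 1.5),
    "Bjork et al. 2010, ISI sample": (14.0, 6.6),
    "Bjork et al. 2010, extended sample": (11.9, 8.5),
}

# Green-to-Gold ratio for each sample.
ratios = {name: green / gold for name, (green, gold) in samples.items()}

# Samples (if any) in which Gold contributes more OA than Green.
gold_exceeds = [name for name, (green, gold) in samples.items() if gold > green]

for name, ratio in ratios.items():
    print(f"{name}: Green is {ratio:.1f}x Gold")
print("Samples where Gold exceeds Green:", gold_exceeds)
```

In none of the four overall samples does Gold exceed Green, and the gap is widest in the mandated sample.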


Gargouri, Y., Hajjem, C., Larivière, V., Gingras, Y., Brody, T., Carr, L. and Harnad, S. (2010) Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLOS ONE 5(10): e13636.

Björk B-C, Welling P, Laakso M, Majlender P, Hedlund T, et al. (2010) Open Access to the Scientific Journal Literature: Situation 2009. PLOS ONE 5(6): e11273.

Of downloads and lemmings

Following is my contribution to a recent discussion on the American Scientist Open Access Forum on the topic of downloads:

What the institutional repository does with respect to download counts is to direct traffic to the university’s site. This will increase the university’s web presence and hence its standing in emerging institutional web impact assessments, which are likely to be of increasing importance in years to come.

Having said that, I would also like to point out that there are serious dangers to scholarship that come with over-emphasis on the numbers. What is popular from a scholarly perspective at one point in time is not necessarily what is important.

One way to think of this: imagine that we humans are like a group of lemmings rushing madly towards a cliff (given the climate crisis and our limited attention to this, I would argue that this is a reasonable comparison). Any lemming that talks (or writes) about how to get to the cliff even faster is likely to be well-heeded (and cited, if it is an academic lemming). On the other hand, the scholar who looks ahead, sees the cliff, and shouts (or writes up for a peer-reviewed lemming paper): “Hey! Cliff ahead! Should we change direction?” may not get much attention immediately. (Later on, after the early birds have gone over the cliff, it could be a different story.)

Another important point: one real danger of usage statistics is the potential for usage-based pricing. If we go this route, it is just a matter of time before some of us impose limits on reading. If the undergraduate research project becomes a cost item, there will be a strong incentive to limit research and/or eliminate research projects. When one copy of an article can easily serve anyone, anywhere, it would be a shame to go this route. (Thanks to Andrew Odlyzko for pointing to this danger).

For a more in-depth look at this topic, please see my book chapter, “The implications of usage statistics as an economic factor in scholarly communications”, OA copy in the SFU IR at:

Comment: if there is one single possibility that tipped the balance for me to become an open access advocate, it is the spectre of usage-based pricing and the evil that this entails, at least for scholarship.

Wanted: a don’t-be-evil online music store

Update August 23: http://www.emusic.com/ has been suggested. Interesting: DRM-free music, but on the other hand an ongoing monthly financial commitment, which I don’t like, with eMusic reserving the right to change the terms and conditions at any time. Not sure about this, but definitely much better than iTunes.

Much as I like the idea of the convenience of buying music online (whether for download or CDs), I can’t bring myself to sign the iTunes agreement. This agreement says that I can only use what I buy in Canada, and Apple reserves the right to use spy technology to enforce this. Nor am I keen on purchasing through Amazon, since I don’t see an easy way to filter out music produced by companies that employ evil sue-their-customers-even-kids techniques and/or that advocate for evil copyright protection. Perhaps an Amazon “no-RIAA members” button would suffice?

Given the popularity of the international Pirate Parties, maybe it’s not just me?

On the other hand, maybe I should just create my own music.