What We Talk About When We Talk About… Book Usage Data

“Over the last two-and-a-half years, we have been working as part of the EU-funded HIRMEOS (High Integration of Research Monographs in the European Open Science Infrastructure) project to create open source software and databases to collectively gather and host usage data from various platforms for multiple publishers. As part of this work, we have been thinking deeply about what the data we collect actually means. Open Access books are read on, and downloaded from, many different platforms – this availability is one of the benefits of making work available Open Access, after all – but each platform has a different way of counting up the number of times a book has been viewed or downloaded.

Some platforms count a group of visits made to a book by the same user within a continuous time frame (known as a session) as one ‘view’ – we measure usage in this way ourselves on our own website – but the length of a session might vary from platform to platform. For example, on our website we use Google Analytics, according to which one session (or ‘view’) lasts until there have been thirty minutes of inactivity. But platforms that use COUNTER-compliant figures (the standard that libraries prefer) have a much shorter time frame for a single session, so such a platform would record more ‘views’ than a platform that uses Google Analytics, even if it were measuring the exact same pattern of use.[2]

Other platforms simply count each time a book is accessed (known as a visit) as one ‘view’. Multiple visits by the same user within a short time frame would count as one session, or one ‘view’, on our site, but a platform counting visits rather than sessions would record them as multiple ‘views’.
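To make the difference concrete, here is a minimal sketch of the two counting rules applied to the same pattern of use. It is an illustration only, not the code any platform actually runs: the function names and the sample timestamps are invented, the thirty-minute timeout is the Google Analytics figure mentioned above, and the ten-minute timeout is simply an arbitrary shorter window chosen for comparison. Real analytics tools apply further rules that are ignored here.

```python
from datetime import datetime, timedelta


def count_visits(timestamps):
    """Visit-based counting: every access is one 'view'."""
    return len(timestamps)


def count_sessions(timestamps, timeout):
    """Session-based counting: accesses separated by no more than
    `timeout` of inactivity are grouped into a single 'view'."""
    sessions = 0
    previous = None
    for ts in sorted(timestamps):
        # A new session starts on the first access, or once the gap
        # since the previous access exceeds the inactivity timeout.
        if previous is None or ts - previous > timeout:
            sessions += 1
        previous = ts
    return sessions


# Hypothetical accesses by one user to one book, in minutes after 09:00.
start = datetime(2020, 1, 1, 9, 0)
accesses = [start + timedelta(minutes=m) for m in (0, 5, 20, 70, 75)]

views_by_visit = count_visits(accesses)
views_long_session = count_sessions(accesses, timedelta(minutes=30))
views_short_session = count_sessions(accesses, timedelta(minutes=10))

print(views_by_visit, views_long_session, views_short_session)
```

The same five accesses come out as five, two or three ‘views’ depending on the rule, which is exactly why figures from different platforms cannot simply be added together.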

Downloads (which we used to include in the number of ‘views’) present problems too. For example, many sites allow only chapter downloads (e.g. JSTOR), others only whole-book downloads (e.g. OAPEN), and some allow both (e.g. our own website). How do you combine these different types of data? Somebody who wants to read the whole book would need only one download from OAPEN, but as many downloads as there are chapters from JSTOR – thus inflating the number of downloads for a book that has many chapters.
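The arithmetic behind that inflation can be sketched in a few lines. The function name, the reader count and the chapter count below are all made up for illustration, and the sketch assumes every reader downloads the entire book.

```python
def downloads_recorded(readers, chapters, granularity):
    """Downloads a platform would log if every reader took the whole book.

    granularity is "book" for a platform offering whole-book downloads
    and "chapter" for one offering chapter downloads only.
    """
    if granularity == "book":
        return readers  # one file per reader
    if granularity == "chapter":
        return readers * chapters  # one file per chapter per reader
    raise ValueError(f"unknown granularity: {granularity!r}")


# Hypothetical figures: 100 readers, a 12-chapter book.
whole_book_platform = downloads_recorded(100, 12, "book")
chapter_platform = downloads_recorded(100, 12, "chapter")

print(whole_book_platform, chapter_platform)
```

Identical readership yields 100 downloads on one platform and 1,200 on the other, and the gap widens with every extra chapter.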

So aggregating this data into a single figure for ‘views’ isn’t only comparing apples with oranges – it’s mixing apples, oranges, grapes, kiwi fruit and pears. It’s a fruit salad….”