AI in Academic Libraries, Part 1: Areas of Activity, Big Players and the Automation of Indexing

Interview with Frank Seeliger (TH Wildau) and Anna Kasprzik (ZBW)

We recently had an intense discussion with Anna Kasprzik (ZBW) and Frank Seeliger (Technical University of Applied Sciences Wildau) on the use of artificial intelligence (AI) in academic libraries. Both of them were also recently involved in two wide-ranging articles: “On the Promising Use of AI in Libraries: Discussion Stage of a White Paper in Progress – Part 1” (German) and “Part 2” (German). This slightly shortened, three-part series has been drawn up from our spoken interview. The following two articles are also part of the series:

  • Part 2: Interesting Projects, the Future of Chatbots and Discrimination Through AI
  • Part 3: Prerequisites and Conditions for Successful Use

We will link them here as soon as the texts are online.

An interview with Dr Anna Kasprzik (ZBW – Leibniz Information Centre for Economics) and Dr Frank Seeliger (University Library of the Technical University of Applied Sciences Wildau).

What are the most promising areas of activity for the use of AI in academic libraries?

Frank Seeliger: Time and again, reports crop up about how great the automation potential of different job profiles is. This also applies to libraries: for the management of an institution, the potential for automation using AI is minimal, but for the specialists for media and information services (FaMI in German), it could be up to 50%.

In the course of automation and digitalisation, it’s largely about changing process chains and automating so that users can borrow or return media autonomously in the libraries – outside opening hours or during rush hour – essentially as an interaction between human and machine.

Even the display of availabilities in the catalogue is a consequence of the use of automation and digitalisation of services in libraries. Users can check at home whether a medium is available. Services in this area – those dealing with how to access a service outside the immediate vicinity and opening hours – are certainly increasing, for example in the context of asking a question or using something during the evening, including via remote access. This process continues and also includes internal procedures such as leave requests or budget planning. These processes run completely differently in comparison to 15 years ago.

One of the first areas of activity for libraries is automatic letter and number recognition, including for older works, cimelia and early printed books, and more generally in all digitalisation projects. This is one area of expertise of libraries: layout, identification and recognition. The other is the question of indexing. Many years ago, libraries worked almost exclusively with printed works, keywording them and indexing their content. Nowadays discovery systems include tables of contents and work with what are known as “component parts of a bibliographically independent work”, i.e. articles that are co-documented in discovery tools or search engines. The question is always: “How should we prepare this knowledge so that it can be found using completely different approaches?” Competitors such as Wikipedia and Google set the pace to some extent. We try to keep up or go into niche fields where we have different expertise, another perspective. These are definitely the first areas of activity in the field of operations, search activities or indexing and digitalisation, where AI is helping us to go further than before.

It has thereby been possible for many libraries to offer services at lower personnel cost even beyond the opening hours of public libraries (Open Level concept). Not round the clock, but for several more hours – even if no-one is in the building.

We need to make sure that we provide students with relatively high-quality information at different places and different times in their various locations. This is why chatbots for example (there’s more to come about this in part 2 of this article series) are such an exciting development, because students do not necessarily work when libraries are open or when our service times are available, but rather during the evenings, at the weekend or on public holidays. Libraries have the urgent task of providing them with sufficient and quality-checked information. We need to position ourselves where the modern technologies are.

Anna Kasprzik: Perhaps I’m biased because I’m working in the field but for me it’s very important to differentiate: I am specialised in the field of automation of subject indexing in academic libraries; the core task is to process and provide information intelligently. For me, this is the most interesting field. However, I sometimes get the impression that some libraries are falling into a trap: they want to do “something with AI” because it’s cool at the moment and then just end up dabbling in it.

But it’s really important to tackle the core tasks and thus prove that libraries can stay relevant. These days, core tasks such as subject indexing are impossible to imagine without automation. Previously this work was done intellectually by people, often even by people with doctorates. But because the tasks are changing and the quantity of digital publications is growing so rapidly, humans can only achieve a fraction of what is required. This is why we need to automate and successively find ways to combine humans and machines more intelligently. In machine learning, we speak of the “human in the loop”. By this, we mean the various ways in which humans and machines can work together to solve problems. We really need to focus on the core tasks. And we need to apply methods of artificial intelligence and not just do explorative projects that might be interesting in the short term but are not thought through at a sustainable level.

Frank Seeliger: The challenge is that, even when you have a very narrow field that you are trying to research and describe, it’s difficult to stay up to date with all relevant articles. You need tools such as the Open Research Knowledge Graph (ORKG). With its help, content can be compared with the same methods and similar facts, without reading the entire article. Because this naturally requires time and energy. It’s impossible to read 20 scientific articles a day. But that’s how many are produced in some fields. That’s why you need to develop intelligent tools that help scientists to get a fast overview of which articles to prioritise for reading, absorbing and reflecting on.

But it goes even further. In the authors’ group of the “White Papers in progress” (German), which we ran for one year, we asked ourselves what search of the future would be like: Will we still search for keywords? We’re familiar with a different approach from plagiarism detection software, into which entire documents are entered. The software checks whether there is a match with other publications and whether non-cited text is used without permission. But you can also turn the whole thing around by saying: I have written something; have I forgotten a significant, current contribution in science? As a result, you get a semantic ontological hint that there is already an article on the topic you have explored which you should reflect on and incorporate. This is a perspective for us, because we assume that today one can hardly stay on top of the situation, even with an interdisciplinary focus or when exploring a new field. It would also be exciting to find a way in via a graphic analysis that ensures that you have not forgotten anything important.

(How) can libraries keep up with big players such as Google, Amazon or Facebook? Do they even have to?

Frank Seeliger: We’ve had some very intensive disagreements about this and come to the conclusion that libraries will never have the staff capacity that these corporations have, even if we were able to have just one single world library. Even then it would be questionable whether we would be able to establish a parallel world (and if we would even want this). After all, others cater for other target groups. But even in the case of Google Scholar, the target group is quite clearly defined.

Our expertise lies in the respective field that we have licensed, for which we have access. Every higher education institution has different points of focus for its own teaching and research. For this, it ensures very privileged, exclusive access which is used to reflect precisely on what is in the full text or is licensed and what can be accessed by going to the shelves. This is and remains the task.

Although it is also changing. How will things develop, for example, if a very high percentage of publications are published in Open Access and the data becomes freely accessible? There are semantic search engines that are experimenting with this. Examples are YEWNO at the Bayerische Staatsbibliothek (Bavarian State Library) or a company that has headquarters in Prague, among other places. They work a lot with Open Access literature and try to process it differently on a scientific level than before. So in this respect, tasks also change.

Libraries need to reposition themselves if they want to stay in the race. But it’s clear that our core task is first of all to process the material that we have licensed, and for which we pay a lot of money, in the best possible way. The aim must be that our users, i.e. students or researchers, find the information they need relatively quickly and not after the 30th hit.

One of the ways in which libraries are intrinsically different to the big players lies in how they deal with personal data. The relationship to personal data when using services is diametrically opposed to the offers of the big players, because values such as trustworthiness, transparency etc. play an enormously important role for the services of libraries.

Do students even start their search in library catalogues? Don’t they go directly to the general internet search engines?

Anna Kasprzik: They use Google relatively often. At the ZBW, we are actually currently analysing the routes via which users enter our research portal. It’s often Google hits. But I don’t see that as a problem because the research portal of a library is only one reuse scenario of metadata that libraries create. You can also make it available for reuse as Linked Open Data. And what’s more: Google uses a lot of this data, so it is already integrated into Google.

And to respond to the other question, we have also discussed this in the paper, at least in the early draft. The fact that libraries are publicly funded means that they have a very different set of ethics when dealing with the personal data of users. And this has many advantages because they don’t constantly try to milk the users according to their needs or requirements. Libraries simply want to provide the best-prepared information possible. This is a strong moral advantage, which we can utilise to our benefit. But libraries do not sell this advantage, at least not very much.

There is also an age-old disagreement about this (which has nothing to do with AI, however) – many students and PhD candidates do not realise that in their everyday lives, they are using data that a library has prepared and made available for them. They call up a paper in the university and do not notice that its link has been made available via their library and that the library has paid for this. And then, there are two factions: some people say that the users shouldn’t notice, and that access must occur as smoothly as possible. The others believe that, actually, there should be a big fat notice stating “provided by your library” so that people can’t miss it.

Frank Seeliger: Making visible the library work that is reused by third parties is a great challenge and must be properly championed because otherwise, if it is no longer visible, people will start asking why they are giving money to libraries at all. The results are visible but not who has financed them, and/or people don’t notice that they are actually commercial products.

Another aspect that we discussed was the issue of transparency and freedom from advertising. We organised a virtual Open Access Week (German) from November 2021 to March 2022. We made video recordings of each ninety-minute session. Then we asked ourselves: Should we use YouTube for publication or the non-commercial video portal of the TIB Leibniz Information Centre for Science and Technology and University Library (TIB AV Portal)? We made a clear-cut decision to use the TIB AV portal and they have accepted us there. We decided in favour of the portal precisely because there are no advertisements, no overlays and no pop-up windows. If we work with discovery tools, we try to advertise the fact that you really don’t get any advertising and reach your goal with your very first hit. Therefore, several aspects differentiate us significantly from commercial providers. We are having that discussion right now; it’s an important difference.

Will the intellectual creation of metadata soon become superfluous because intelligent search engines will take over this task?

Anna Kasprzik: This is a fundamental issue for me. I say: “no”, or perhaps “yes and no”. What we are doing at the moment via our automation of subject indexing with machine learning methods is an attempt to imitate the intellectual subject indexing one-to-one, just the same way it has always been done. But for me this is only a way for us to get our foot in the door technologically. In the next few years, we will address this and start designing the interplay between human knowledge organisation expertise and machines in a more intelligent way – reorganise it completely. I can imagine that we will not necessarily need to do the intellectual subject indexing in advance in the same way that we are currently doing it. Instead, intelligent search engines can try to index content resources taking the context into account.

But even if they are able to do this from the context ad hoc, those engines require a certain amount of underlying semantic structuring. And this structuring needs to exist in advance. It will therefore always be necessary to prepare information so that the pattern recognition algorithms can access them in the first place. If you merely dive into the raw data, the result is chaos, because the available metadata is fuzzy. You need structuring that pulls the whole thing more sharply into focus, even if it only accommodates the machine to a partial extent and not completely. There exist completely different ways of interconnecting search queries and retrieval results. But intelligent search engines still have to have something up their sleeve, and that something is organised knowledge. This knowledge organisation requires human expertise as input at certain points. The question is: at which points?

Frank Seeliger: There is also the opposing view of TIB director Prof. Dr Sören Auer, who says that data collection is overvalued. Certainly also meant as a provocation or simply to test how far one can go. In the future, it may not be necessary to have as many colleagues working in the field of intellectual indexing.

For example, we have 16,000 graduate theses held in the library of the TH Wildau; their entire tables of contents are being scanned and made OCR-compatible. The question is, can you systematise them according to the Regensburger Verbundklassifikation (RVK, Regensburger Association Classification; a classification scheme for academic libraries), perhaps with the Annif tool? This means that I don’t have to look at each dissertation and say, this one belongs in the field of engineering, etc., independently of the study courses in which they were written. Instead, here is the RVK graph, there are the tables of contents, and then they are matched according to certain algorithms. This is a different approach to when I, as a specialist, take a look at every work and index it correspondingly with keywords, the Integrated Authority File (GND; a service facilitating the collaborative use and administration of authority data) and so on, running through all the procedures. I see this as a new way of mastering the masses, because a great deal is published and because we have taken over responsibilities that did not use to be covered by libraries, such as the indexing of articles, i.e. component parts of a bibliographically independent work, besides bibliographically independent works. It’s definitely a great help.
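To make the idea of matching tables of contents against a classification more concrete, here is a minimal sketch in Python. It is not how Annif works (Annif relies on trained backends) and it does not use real RVK notations: the class names, the term lists and the function `suggest_classes` are all hypothetical, and simple word overlap stands in for the "certain algorithms" mentioned above.

```python
import re

# Hypothetical classes with descriptive terms; real RVK notations and labels differ.
CLASSES = {
    "engineering": {"engine", "turbine", "mechanical", "materials", "design"},
    "logistics": {"supply", "chain", "warehouse", "transport", "routing"},
    "computer science": {"software", "algorithm", "database", "network", "data"},
}


def tokens(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_classes(toc, classes=CLASSES, top_n=2):
    """Rank classes by the share of their terms that occur in the table of contents."""
    words = tokens(toc)
    scored = []
    for name, terms in classes.items():
        overlap = len(words & terms)
        if overlap:
            scored.append((overlap / len(terms), name))
    # Highest score first; ties are broken alphabetically by class name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 2)) for score, name in scored[:top_n]]


toc = """1 Introduction
2 Warehouse layout and transport routing
3 Database design for supply chain data
4 Conclusion"""

print(suggest_classes(toc))
```

In practice the suggestions would still go to a specialist for checking, which is the point taken up in the following paragraphs.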

However, I cannot imagine that humans will no longer intervene at all in such algorithms or offer the pre-structuring according to which they must act. Up to now, we have required a lot of human intervention to trim and optimise these systems, so that the results are indexed 99% correctly. That’s one objective. This requires control, pre-structuring and looking at training data. For example in calligraphy, when you check whether a letter has been recognised correctly. Checking and handling by human beings is still necessary.

Anna Kasprzik: Exactly – I mentioned the concept earlier: the “human in the loop”, i.e. that people can be involved at various levels. These can start out very trivially: with the fact that training data or our knowledge organisation systems are generated by humans. Or the fact that you can use automatically generated keywords as suggestions – machine-assisted subject indexing.

There are also concepts such as online learning and active learning. Online learning means that the machine receives feedback relatively consistently from the indexer as to how good its output was, and retraining takes place based on that. Active learning is where the machine can interactively decide at certain points: I now need a person as an oracle for a partial decision. The machine initiates this, saying: “Human, I am pushing a few part-decisions that I need into the queue here – please work through them.” People and machines tend to toss the ball back and forth here, rather than doing it separately in two blocks.
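The two concepts can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `Suggestion` records, the thresholds in `route` and the `KeywordBias` class are hypothetical and do not describe the system used at the ZBW. The `route` function plays the active-learning part (the machine decides which part-decisions to push into the human's queue), and `KeywordBias` plays a very simplified online-learning part (each verdict from the indexer immediately adjusts later scores).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    doc_id: str
    keyword: str
    score: float  # hypothetical model confidence between 0 and 1


def route(suggestions, accept_at=0.8, reject_at=0.2):
    """Active-learning step: accept confident suggestions, queue uncertain ones."""
    accepted = [s for s in suggestions if s.score >= accept_at]
    queue = [s for s in suggestions if reject_at < s.score < accept_at]
    # The most uncertain decisions (score closest to 0.5) go to the human first.
    queue.sort(key=lambda s: abs(s.score - 0.5))
    return accepted, queue


class KeywordBias:
    """Online-learning step: each human verdict nudges a per-keyword correction."""

    def __init__(self, step=0.1):
        self.step = step
        self.bias = {}

    def feedback(self, suggestion, is_correct):
        delta = self.step if is_correct else -self.step
        self.bias[suggestion.keyword] = self.bias.get(suggestion.keyword, 0.0) + delta

    def adjusted(self, suggestion):
        raw = suggestion.score + self.bias.get(suggestion.keyword, 0.0)
        return min(1.0, max(0.0, raw))


suggestions = [
    Suggestion("doc1", "monetary policy", 0.93),
    Suggestion("doc1", "inflation", 0.55),
    Suggestion("doc2", "labour market", 0.35),
    Suggestion("doc2", "tourism", 0.05),
]
accepted, queue = route(suggestions)

model = KeywordBias()
model.feedback(queue[0], is_correct=True)  # the indexer confirms the first queued keyword
```

A real system would retrain the underlying model rather than keep a per-keyword offset; the offset only shows the direction of the feedback loop.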

Thank you for the interview, Anna and Frank.

In part 2 of the interview on “AI in Academic Libraries” we explore exciting projects regarding the future of chatbots and discrimination through AI.
Part 3 of the interview on “AI in Academic Libraries” focuses on prerequisites and conditions for successful use.
We’ll share the link here as soon as the post is published.

This text has been translated from German.

We were talking to:

Dr Anna Kasprzik, coordinator of the automation of subject indexing (AutoSE) at the ZBW – Leibniz Information Centre for Economics. Anna’s main focus lies on the transfer of current research results from the areas of machine learning, semantic technologies, semantic web and knowledge graphs into productive operations of subject indexing of the ZBW. You can also find Anna on Twitter and Mastodon.
Portrait: Photographer: Carola Gruebner, ZBW©

Dr Frank Seeliger (German) has been the director of the university library at the Technical University of Applied Sciences Wildau since 2006 and has been jointly responsible for the part-time programme Master of Science in Library Computer Sciences (M.Sc.) at the Wildau Institute of Technology since 2015. One module explores AI. You can find Frank on ORCID.
Portrait: TH Wildau

Featured Image: Alina Constantin / Better Images of AI / Handmade A.I / Licensed by CC-BY 4.0

The post AI in Academic Libraries, Part 1: Areas of Activity, Big Players and the Automation of Indexing first appeared on ZBW MediaTalk.

User Experience in Libraries: Insights from the SLU University Library Sweden

An Interview with Kitte Dahrén

The Swedish SLU University Library has about 50 employees. They are spread over several locations throughout the country; the main locations are Uppsala, Umeå and Alnarp. Kitte Dahrén is one of them. Her mission: to improve library services through user experience methods together with her colleagues.

For Kitte, it all started with a course on Design Thinking back in 2014: a pure epiphany for her. Since then, her potpourri of UX methods has grown steadily – usability tests, interviews, observations, cognitive mapping, card sorting…

In the interview, she tells us what her secret weapon is for motivating users, what her three most important learnings from seven years of User Experience are, why she considers it essential to bring all colleagues along, and what an onion has to do with it. Finally, Kitte also reveals who inspired her and gives book tips for UX beginners.

The interview is part of our series on User Experience in libraries. All interviews from the series can be found under the keyword “User Experience”.

Kitte, you are working in the field of User Experience (UX) at the library of the Swedish University of Agricultural Sciences (SLU). When and why did you start? What does that mean practically?

Back in 2014 I participated in a Design Thinking course which was kind of an epiphany for me. Before that, I often felt frustrated that librarians seemed to think that they focused on users’ needs, when they in fact just created services from their own point of view. During the course I learned how to research user problems and needs, prototype possible solutions and further iterate these. I felt empowered and finally had the tools needed to take action. This is where it all began for me on a personal level, but officially I got my position as UX Coordinator in 2017. At that point, User Experience was a goal in our library’s strategic plan, and today the intention to work with user-centred methods like UX methods is more established among staff, as well as in our management. UX work no longer depends on individuals being interested.

Illustration of the UX Button by Börje Dahrén©

My role is to coordinate the library’s internal method support called “The UX Button”, where I, together with my brilliant colleagues Ingela Wahlgren and Sarah Meier (who have graciously helped me with the answers to this interview) provide support to colleagues wanting to work with User Experience in order to improve services. The support is scalable, from just brainstorming potential UX methods to one of us being project leader. It all depends on the priority of the project and on how much time we can spare at that moment.

The SLU University Library has around 50 employees, spread over different campuses all over the country. Just like the university itself, we work together as one library, and as a consequence the UX method support needed to rely on digital tools long before the pandemic.

What are your goals with UX? Did you achieve them?

Perhaps it goes without saying, but our main goal with UX at our library is of course to provide relevant and usable services and systems to our users. The work bears fruit slowly but steadily, and perhaps one explanation for the slowness is our way of embedding UX. We don’t want an expert team doing all UX work; we want everyone on board. In order to understand why we choose to embed User Experience in this way, you need to know that our organisational structure and culture are not hierarchical, and our library has a strong internal culture of co-creation. Our professional roles and job descriptions are not set in stone and there is a lot of room for self-leadership.

Illustration of the onion is by Kitte Dahrén, adapted from a model by Malin Jenslin©.

The model, originally made by Malin Jenslin, explains our concept for embedding UX on an organisational level. It is like an onion, with all its layers.

  1. The innermost circle, called the core, is the library’s internal UX support – “The UX Button”. Our job is to both deepen and broaden the organisation’s knowledge on UX methods, and it is our responsibility to make sure that our library continues to move forward towards our strategic goals.
  2. In the second circle, you will find colleagues who are actively working with User Experience methods in order to make sure that our users’ needs of our services and systems are met. It is our management’s responsibility to create the best possible conditions and organisational structures for us to be able to work like this.
  3. In the third circle, you’ll find the people who are aware of UX and how they might contribute to the goal, but they are not actively engaged in any UX activity from day to day.
  4. In the outermost circle, we have the people who are still unaware of what UX is all about.

The long-term goal is that the outer circle no longer exists. And when it is no longer there, the innermost circle is not needed at all. When all our colleagues are either actively working with UX or are aware of its importance, our work is done.

Which UX methods do you apply at the SLU University Library?

We always choose methods depending on what we want to uncover. Through the years we’ve done usability testing, interviews, observations, cognitive mapping, card sorting and much more. We like to try out new methods by applying them to an actual project – learning by doing. At the moment, we are for example interested in finding out how students are using the library when it is unstaffed (they can access it with their key cards). Is it easy to understand how to use the self-service machine or find a book on the hold shelf? To survey this, we plan to experiment with letting users themselves document how they perform basic library tasks using an action camera and this is a method completely new to us.

Can you give us a practical example that worked, where you applied UX to solve a problem?

I think it’s important to point out that many small changes to our services as a result of findings from UX research lead to improvements for our users. User Experience work doesn’t have to result in cutting-edge innovation to be considered a success story. One example from our library is a project where my colleagues were creating a new search tool for the databases that the library offers. With support from “The UX Button”, they did usability testing as described by Steve Krug in Rocket Surgery Made Easy – The Do-It-Yourself Guide to Finding and Fixing Usability Problems. Observations of test participants trying to use the tool revealed what problems needed to be solved before launching and resulted in a more useful product.

I really recommend Krug’s method for usability testing – it’s easy to set up, can be done remotely, and always leads to actionable insights. To observe a student or researcher using a service is a quite powerful (and sometimes even a bit painful) experience because it makes you realise that it’s perhaps not as self-explanatory as you might think. We will present our work with remote usability testing during the pandemic at the excellent conference International Conference in Performance Measurement in Libraries (LibPMC) in November.

To apply User Experience methods, you need library users who are willing to participate. How do you manage to find and motivate them?

For a couple of years now, our number one solution has been a library user panel. Everyone can join the panel; it does not matter whether they are students, staff or not affiliated with SLU at all. We strive to work against discrimination in our services, so we wish to create a panel that is as diverse as possible.

When we want to recruit for a user study, we simply send out an e-mail to selected members of the panel asking them to participate and in most cases a few people volunteer. Students will receive a small gift as a thank you for their time, usually a movie ticket. But our experience is that users see the gift as a bonus and that they are happy to contribute to the improvement of services and systems that they rely on in their work or studies.

Our user panel mainly consists of students; we’ve had a harder time finding researchers and other employees willing to sign up (but this might also be because we’ve mainly marketed the panel towards students). When we recruit university employees, we often need to rely on personal contacts, but usually we find people willing to participate in the end.

What are the – let’s say – three most important lessons you have learned from applying User Experience methods at the SLU University Library?

  1. Design is harder than research. It’s easy to just gather a lot of data on user behaviours and needs, but you must properly analyse this data and design solutions to test and further iterate if you want to improve your services. Make sure to solve the right problem and not just the lowest hanging fruit, and don’t fall in love with your solution.
  2. You need to have a great deal of patience to embed UX in your organisation. I want to point out once again that the UX Button team is not employed to conduct all user research. If we did, perhaps that would make the quality of the actual research better because we’ve got experience. It would speed up the process for sure. But I think that the fact that our colleagues have ownership of their own UX research and design process makes it easier to get approval in the long run.
  3. Sometimes colleagues initially find UX methods scary because they might push them outside their comfort zone, for example when asking a student to draw a cognitive map or interviewing a researcher about their publication process. Give them time to articulate their fears and doubts, but at the same time don’t be afraid to challenge them. Colleagues who previously claimed they were useless at, for example, interviewing or ideating new solutions often overcome their fears and excel when they are allowed to practice their skills without being judged.

Have you also used methods that did not work at all? What have been your biggest or funniest fails?

I don’t see it as methods that don’t work; it’s things like suboptimal circumstances, bureaucracy or just rushing to conclusions when you analyse your data that make your project fail. And even then, I wouldn’t call it a failure because you always learn something valuable during the process, either about your users or about yourself and your organisation.

A couple of years ago, we did a touchstone tour (PDF) with a student and she showed us a wall in one of the campus buildings covered with gold framed portraits of prominent figures from the history of the university. These portraits happened to be all male, and she told us how this “wall of shame” made her “blood boil”. We prototyped a wall of photographs of female honorary doctors at SLU to show that times are changing, and when we tested it, students and employees welcomed it.

The wall of shame and the prototype female honorary doctors

Since our prototype was just temporary, we eventually took it down. I know that as a result of the study, the university management planned a project aiming to create a more modern and inclusive environment in the public spaces of this particular building, but so far nothing has materialised. I really dislike it when you borrow your users’ precious time to help you, and then fail to deliver solutions to the problems they express.

What are your tips for libraries that would like to start with UX? What is a good starting point?

Don’t try to move mountains as the first thing you do. Start small, and preferably with something where you control the whole process and can act on what you learn. If you and your colleagues argue about some detail, solve it by simply asking or observing your users. In order to make UX truly embedded you need your management on board, but with time and patience, this way of working in your team can create a ripple effect in your organisation.

UX and Libraries – Recommendations from Kitte Dahrén

About the author
Kitte Dahrén works as UX coordinator and librarian at the SLU University Library, Sweden. She also coordinates the library’s strategic communication and is a part of the website editorial team.

Portrait: Kitte Dahrén©
Featured Image: Victor Wrange©

The post User Experience in Libraries: Insights from the SLU University Library Sweden first appeared on ZBW MediaTalk.