Abstract: Open access libraries operate in a continuum between two distinct organisation models: online retailers versus ‘traditional’ libraries. Online retailers such as Amazon.com are successful in recommending additional items that match the specific needs of their customers. The success rate of the recommendation depends on knowledge of the individual customer: more knowledge about a person leads to better suggestions. Thus, to profit optimally from the retailers’ offerings, the customer must be prepared to share personal information, which raises the question of privacy.
In contrast, protection of privacy is a core value for libraries. The question is how open access libraries can offer comparable services while retaining the readers’ privacy. A possible solution can be found in analysing the preferences of groups of like-minded people: communities. According to Lynch (2002), digital libraries are bad at identifying or predicting the communities that will use their collections. It is, however, our intention to explore the possibility of uncovering sets of documents with a meaningful connection for groups of readers – the communities. The solution depends on examining patterns of usage, instead of storing information about individual readers.
This paper will investigate the possibility of uncovering the preferences of user groups within an open access digital library using social network analysis techniques.
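The privacy-preserving approach sketched in the abstract – deriving document communities from anonymised usage patterns rather than reader profiles – can be illustrated with a minimal example. The session data, the co-usage graph construction, and the use of connected components as a crude stand-in for a full community-detection algorithm are all illustrative assumptions, not the paper’s actual method:

```python
from collections import defaultdict

# Hypothetical anonymised usage log: each session records only which
# documents were accessed together, with no reader identity stored.
sessions = [
    {"doc_a", "doc_b"},
    {"doc_b", "doc_c"},
    {"doc_x", "doc_y"},
    {"doc_a", "doc_c"},
]

# Build a co-usage graph: documents are nodes; an edge means the two
# documents appeared together in at least one session.
graph = defaultdict(set)
for session in sessions:
    for doc in session:
        graph[doc] |= session - {doc}

def communities(graph):
    """Connected components of the co-usage graph -- a deliberately
    simple stand-in for proper community-detection techniques."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result

print(sorted(sorted(c) for c in communities(graph)))
```

The point of the sketch is that the grouping emerges purely from co-occurrence patterns; no record links any document set back to an individual reader.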
Abstract: Explicit semantic enrichments make digital scholarly publications potentially easy to find, to navigate, to organize and to understand. But whereas the generation of explicit semantic information is common in fields like biomedical research, comparable approaches are rare for domains in the humanities. Apart from a lack of authoritative structured knowledge bases formalizing the respective conceptualizations and terminologies, many experts from specialized fields of research seem reluctant to employ the technologies and methods that are currently available for the generation of structured knowledge representations. However, human involvement is indispensable in the organization and application of the domain-specific knowledge representations necessary for the contextualization of structured semantic data extracted from textual and scholarly resources. Over the past decade, various efforts have been made towards openly accessible online knowledge graphs containing collaboratively edited, structured and cross-linked data. Such public knowledge bases might be suitable as a starting point for defining formalized domain knowledge representations with which the subjects and findings of a research domain can be described. Extensive re-use of the widely adopted shared conceptualizations from a large collaborative knowledge base could be beneficial in more than one way to processes of semantic enrichment, especially those involving domain experts with less technical backgrounds. In this work, we discuss ways of enabling domain experts to semantically enrich their research resources by generating semantic annotations in text documents using the scholarly reading and annotation software neonion. We introduce features to the web-based software which improve various aspects of the semantic annotation process by connecting it to the collaboratively edited public knowledge base Wikidata.
Furthermore, we argue that the re-use of external structured knowledge from Wikidata both fuels an enhanced workflow for assisted, subject-matter-sensitive semantic annotation and allows the knowledge base to benefit in return from the structured data generated within neonion. Our prototype implementation extracts schematic terminological information from Wikidata objects linked by local annotations and feeds it into the new recommender system, where candidate descriptors for vocabulary amendment are determined, most notably by the association rule mining recommender engine Snoopy. This paper is a follow-up to the bachelor’s thesis “Assisting in semantic development of knowledge domains by recommending terminology”, submitted by Jakob Höper under the supervision of Prof. Dr. Claudia Müller-Birn. It elaborates in further detail on aspects of the presented implementation that have not been exhaustively covered in said thesis.
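To give a flavour of association-rule-based descriptor recommendation, the following sketch mines simple pairwise rules from co-occurring annotation descriptors. The transaction data and the confidence-threshold logic are illustrative assumptions; Snoopy’s actual rule-mining engine and neonion’s data model are not reproduced here:

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotation sets: descriptors that co-occurred within
# annotated documents (illustrative data, not neonion's).
transactions = [
    {"painter", "portrait", "renaissance"},
    {"painter", "portrait"},
    {"painter", "renaissance"},
    {"sculptor", "baroque"},
]

# Count single-term and pairwise support across all transactions.
support = Counter()
pair_support = Counter()
for t in transactions:
    support.update(t)
    for a, b in combinations(sorted(t), 2):
        pair_support[(a, b)] += 1

def recommend(present, min_conf=0.5):
    """Suggest descriptors B where confidence(present -> B), i.e.
    support(present, B) / support(present), meets the threshold."""
    scores = {}
    for (a, b), n in pair_support.items():
        if a == present:
            scores[b] = n / support[a]
        elif b == present:
            scores[a] = n / support[b]
    return {term: conf for term, conf in scores.items() if conf >= min_conf}

print(recommend("painter"))
```

Given the toy data, annotating a document with “painter” would surface “portrait” and “renaissance” as candidate descriptors, each with confidence 2/3. A production engine would additionally prune rules by minimum support and handle rules with multi-term antecedents.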
“With the amount of published research, patents, white papers and other written knowledge out there, it’s hard to be even reasonably sure you’re aware of the goings-on around a certain topic or field. Omnity is a search engine made to make it easier by extracting the gist of documents you give it and finding related ones from a library of millions — and now supports more than a hundred languages.
The process is simple and free, at least for the public-facing databases Omnity has assembled, comprising U.S. patents, SEC filings, PubMed papers, clinical trials, Library of Congress collections and more.
You upload a document or text snippet and the system scans it, looking for the least common words and phrases — which generally indicate things like topic, experiment type, equipment used, that sort of thing. It then looks through its own libraries to find documents with similar or related phrases that appear in a manner that suggests relevance….”
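The matching process the excerpt describes – extracting a document’s least common terms and using them to rank related documents – can be approximated in a few lines. The toy corpus, the document-frequency scoring, and the overlap ranking are illustrative assumptions; Omnity’s actual pipeline is proprietary and certainly far more sophisticated:

```python
from collections import Counter

# Toy corpus standing in for Omnity's document libraries.
corpus = {
    "pat1": "laser diode cooling method for semiconductor wafers",
    "pat2": "semiconductor wafer etching with plasma chamber",
    "doc3": "recipe for sourdough bread with rye flour",
}

def tokens(text):
    return text.lower().split()

# Document frequency: in how many corpus documents each term appears.
df = Counter()
for text in corpus.values():
    df.update(set(tokens(text)))

def rare_terms(text, top_n=5):
    """Keep the least common terms of a text -- a rough proxy for the
    topic-bearing words and phrases the article mentions."""
    return sorted(set(tokens(text)), key=lambda t: (df[t], t))[:top_n]

def most_related(query):
    """Rank corpus documents by how many of the query's rare terms
    they contain, and return the best match."""
    rare = set(rare_terms(query, top_n=10))
    scores = {name: len(rare & set(tokens(text)))
              for name, text in corpus.items()}
    return max(scores, key=scores.get)

print(most_related("etching a semiconductor wafer"))
```

On this data, a query about semiconductor wafer etching matches the plasma-etching patent rather than the bread recipe, because the overlap is computed on distinctive terms rather than common function words.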