Researchers today face a proliferation of AI tools and utilities, many of them from small startup companies, that assist in the scholarly research process, for example by facilitating the writing and submission process or the discovery of new content. Paradoxically, the sheer number of tools available makes it harder to evaluate them and identify which ones to recommend to researchers: in the US alone there were 1,393 AI startup companies in 2018 (Statista). Information scientists in libraries, with their expertise in bibliometrics and their understanding of the research process, are perfectly placed to provide guidance and evaluation. This talk argues that evaluating these tools does not require knowledge of Python, or indeed of any computer programming. Instead, the talk outlines an assessment framework for new AI-based tools, from evaluating the underlying corpus to measuring accuracy and effectiveness, and identifying bias and inequalities.
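By way of illustration only (the talk itself assumes no programming), the accuracy measurement the framework calls for can be as simple as comparing a tool's output against a librarian-curated gold-standard set of relevant records. The sketch below uses entirely hypothetical identifiers and is not drawn from the case study data.

```python
# Minimal sketch with invented data: scoring a discovery tool's results
# against a hand-curated gold-standard set of known-relevant records.
gold_standard = {"PMID:111", "PMID:222", "PMID:333", "PMID:444"}  # records judged relevant by a librarian
tool_results = {"PMID:111", "PMID:333", "PMID:555", "PMID:666"}   # records returned by the tool

true_positives = gold_standard & tool_results
precision = len(true_positives) / len(tool_results)  # share of the tool's output that is relevant
recall = len(true_positives) / len(gold_standard)    # share of the relevant set the tool found

print(f"precision={precision:.2f}, recall={recall:.2f}")
```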
The evaluation framework is exemplified in a case study: a measured trial of some of these tools, comparing human and machine-facilitated approaches to the research process in terms of time taken and quality of results, using data from a health library. Until now, much of the help provided by libraries in the form of fact sheets and how-to guides has been evaluated only subjectively; this case study aims to show a better way to evaluate these tools, using statistically valid samples together with user feedback to establish how widely used and how successful they are.
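As an illustrative sketch of the kind of quantitative comparison the case study describes, time taken on matched queries could be compared with a simple paired test. The figures below are invented, and SciPy is assumed to be available; this is not the case study's actual analysis.

```python
# Minimal sketch with invented data: paired comparison of search time (minutes)
# for the same queries answered with and without an AI-assisted tool.
from statistics import mean
from scipy.stats import ttest_rel  # assumes SciPy is installed

human_minutes = [42, 55, 38, 61, 47, 50, 44, 58]  # librarian-only searches
tool_minutes = [30, 41, 35, 44, 39, 37, 33, 46]   # AI-assisted searches, same queries

result = ttest_rel(human_minutes, tool_minutes)   # paired t-test on matched queries
saving = mean(human_minutes) - mean(tool_minutes)
print(f"mean saving: {saving:.1f} min, p={result.pvalue:.3f}")
```

Quality of results can be scored in the same paired fashion, for example by rating each result set against the gold standard described above.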
Anyone attending this presentation should be better equipped to evaluate new AI utilities for research and submission.