Session Recording

Data Mining & Delivery via API: Challenges & Strategies for Future Collection Development Recording

Developing collections that anticipate future needs of researchers can be challenging when content comes in an ever-increasing variety of formats. Faculty and students with data science skill sets, such as coding and text mining, are increasingly looking for research sources. This demand is felt across disciplines, as new tools empower researchers to create meaning from digitized collections and datasets.

It can be challenging to find, fund, acquire, and provide access to data for researchers. New data sources can be purchased or rented via subscription. A library collection may already have the content, but no physical, legal, or cost-efficient way to download it or mine it for data. Vendors may charge substantial additional fees for users to download greater quantities of content, export content in a different format, or transfer content into a third-party platform for further analysis. These costs strain budgets and impact an academic institution’s ability to support new methods of research and scholarship.

The presenters will highlight examples of these challenges in business and the humanities, specifically focusing on modes of access and pricing models. ‘Unlimited downloads,’ data feeds, APIs, and workbench models will be discussed. The speakers will also examine pricing models, including pricing by seat, pricing for ownership of digitized content, and pricing for subscriptions to data feeds and tools.

The presenters will share strategies for negotiating with vendors, as well as approaches for explaining the complicated nature of these agreements to stakeholders. Presenters will highlight collaborations between faculty researchers, academic departments, and libraries to create cost-sharing agreements that expand access to data on campus. Attendees will leave the session with strategies for working with commercial vendors to develop solutions that let researchers use data collections at prices that are sustainable for academic budgets.

Daniel is the Head of the Business and Real Estate Libraries at New York University. His research focuses on the intersection of information-seeking behavior and needs of business students, with a particular emphasis on career information literacy and MBA candidates. He did his graduate work at the University of Pittsburgh's School of Information Science and has previously held positions at The Pennsylvania State University and Cornell University.