What is OAIster?

OAIster is a union catalog of digital resources. We provide access to these digital resources by "harvesting" their descriptive metadata (records) using OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting). The Open Archives Initiative is not the same thing as the Open Access movement.
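As a rough illustration of what harvesting involves, the Python sketch below builds an OAI-PMH ListRecords request and extracts identifiers and Dublin Core titles from a response. It is a minimal sketch, not OAIster's actual harvester: the endpoint https://example.org/oai, the function names, and the sample response are hypothetical. Only the verb, the argument names, and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, trimmed-down ListRecords response used for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [t.text or "" for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    # An empty or missing token means the repository has no further pages.
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and keep requesting pages until the repository stops returning a resumption token. The sketch leaves out the network step so that it can run on its own.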


Digital resources can range from an old-time advertisement for electric refrigerators (from the Library of Congress American Memory project) to the memoirs of Harriet Beecher Stowe (from the Making of America collection of the University of Michigan Digital Library Production Service).

Digital resources include items such as:

  • digitized (i.e., scanned) books and articles
  • born-digital texts
  • audio files (e.g., wav, mp3)
  • images (e.g., tiff, gif)
  • movies (e.g., mp4, quicktime)
  • datasets (e.g., downloadable statistics files)

These resources, often hidden from search engine users behind web scripts, are known as the "deep web." The owners of these resources share them with the world using OAI-PMH.

Digital resources are often hidden from the public because a web search using a search engine like Google or Yahoo! does not pick up information about them. The robots used by such search services don't delve into the CGI scripts that send this resource information to the web. Consequently, these resources are generally accessible only to those who know to look in particular repositories, often at the universities that are developing the collections in these repositories.

OAIster reveals these digital resources in an easy-to-use, searchable interface. In addition, we aim to:

  • Provide one-stop "shopping" for users interested in useful, academically oriented digital resources. We gather records for as many digital resources as we can find in an effort to build a comprehensive digital union catalog.
  • Eliminate dead ends. Users retrieve not only descriptions (metadata) of resources but also gain access to the digital resources themselves. For instance, instead of just the catalog records of a slide collection of Van Gogh's works, users are able to view images of the actual works.

How the Service Got Started

The service was originally funded through an Andrew W. Mellon Foundation grant. The original proposal was to establish a broad, generic retrieval service for information about publicly available digital library resources provided by the research library community.

The service was built through a collaboration with the University of Illinois at Urbana-Champaign (UIUC). Their metadata harvester was used for the first two years of the project. In our partnership with UIUC, we were an early release site for the harvester they developed. We developed mechanisms to regularly export and transform the harvested data. The open-source middleware that we use for the access system has been made available to other institutions for implementation as they see fit.

For more detail about the project, please read the original Mellon grant proposal. (Please note that the timetable for the phases of work in this proposal was moved back by 5 months.) You can also read our final project report to the Mellon Foundation, our progress reports, and the results of a survey we conducted early in the project.