About Digital Collections

Mission

Digital Collections provides access to the National Library of Medicine's distinctive digital content in the areas of biomedicine, health care and the history of medicine. Our unique digital collections are freely available for download worldwide and in the public domain unless otherwise indicated.


About the Collections

Collection Development Policy
The policies and guidelines for building NLM’s Digital Collections are described in the Collection Development Guidelines of the National Library of Medicine. The Guidelines define the range of subjects to be acquired and the extent of the Library's collecting effort within these subjects. They also address selection issues presented by a range of formats and literature types. The Guidelines are reviewed and updated periodically to reflect emerging changes in health care and advances in medical research.

Texts
Most of the texts within Digital Collections that are held in NLM's physical collection were digitized at NLM using docWorks (dW) image processing software, which produces several files per page and per book. After cropping, deskewing and reviewing the source images, NLM-defined scripts are used to create additional image derivatives and metadata. A small number of texts were digitized from originals or microfilm by a vendor off-site.
The texts comprising the Medicine in the Americas collection were digitized for a multi-institutional digital library project, the Medical Heritage Library. Although the project ceased in 2024, the collection continues to be hosted on the Internet Archive. NLM routinely deposits copies of its digitized books in the Internet Archive and the Medical Heritage Library collection.

Films & Videos
Most of the films and video recordings available in Digital Collections originate from NLM's motion picture reel and videotape holdings. Streaming versions of these titles are derived from a range of master formats. The titles address numerous medical and scientific topics, including public health, surgical procedures, mental health, child development, cancer, infectious disease, and substance abuse. Each film or video with sound is transcribed, and time-coded captions are created to satisfy Section 508 accessibility requirements and provide enhanced search functionality.

Still Images
The majority of still images in Digital Collections are part of the Images from the History of Medicine collection, which includes fine art, photographs, engravings, and posters that illustrate the social and historical aspects of medicine from the 15th to the 21st century. Still images are available as a master TIFF or a standard JPG file.

Archival Materials
The more than 40 digitized Archives and Personal Papers Collections feature thousands of archival materials covering public health and health policy, mental health, child development, and molecular biology in the 19th and 20th centuries. Also included are individual oral histories and oral history collections consisting of interviews with physicians, scientists, and government administrators. Digital Collections also provides access to the more than 30,000 archival items that comprise the curated Profiles in Science collections documenting the modern trailblazers of science, medicine, and public health.

Software
The software available in the collection includes historical software developed by NLM, such as How To Grateful Med, the interactive tutorial for Grateful Med.

Digitization
The specifications for NLM Digital Repository objects can be found here.


Repository Technical Overview

Software

The Digital Collections repository is primarily composed of open source technologies. The Fedora Repository Software provides an underlying XML-based framework for structuring, managing, preserving and disseminating digital content. Apache Solr and Lucene are used to index our content and drive the full-text and faceted metadata searching within Digital Collections.

The website's homepage, search functionality and resource summary pages are provided using the Blacklight open-source discovery interface.

Digitized texts are presented via Universal Viewer, a community open source project developed by Digirati. The repository's stored JPEG2000 page images are converted on the fly to standard JPEG for display via the IIIF AWS Serverless Application.

Images are provided by the IIIF AWS Serverless Application and presented using the Universal Viewer.
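
As an illustration of how a viewer can request a JPEG derivative from an IIIF image service, the following is a minimal sketch of building an IIIF Image API request URL in Python. The path layout (identifier/region/size/rotation/quality.format) follows the IIIF Image API; the base URL and identifier shown in the usage note are hypothetical placeholders, not NLM's actual endpoint, and the default size keyword "max" assumes version 3 of the API (version 2 services typically use "full").

```python
from urllib.parse import quote


def iiif_image_url(
    base_url: str,
    identifier: str,
    region: str = "full",
    size: str = "max",
    rotation: str = "0",
    quality: str = "default",
    fmt: str = "jpg",
) -> str:
    """Build an IIIF Image API URL for one image.

    base_url and identifier are supplied by the caller; nothing here is
    specific to NLM's deployment.
    """
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"
```

For example, iiif_image_url("https://iiif.example.org/iiif/3", "book/page 1.jp2") asks the hypothetical service for the full image at its maximum size as a JPEG, regardless of how the source image is stored.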

Digitized films are presented using VideoJS embedded on the page.

Preservation
Digital Collections uses the following strategies to help ensure the durability of the managed content:

  • Every master file (source images and videos) is stored with an MD5 checksum, a fixed-length value derived from the file's contents, which can be recomputed to confirm that the file has not been altered. Checksums are verified periodically to ensure the integrity of the content.
  • All repository content and services are replicated at a secondary data center capable of taking over all repository functions if NLM's main data center is unavailable. A third copy of master content is stored off-site at a third-party location.
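
The checksum verification described in the first strategy can be sketched as follows. This is a minimal Python illustration of recomputing an MD5 digest and comparing it with a stored value; the function names are hypothetical, and it is not a description of NLM's actual tooling.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large master files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded_md5: str) -> bool:
    """Compare the recomputed digest with the checksum stored for the file."""
    return md5_of_file(path) == recorded_md5.strip().lower()
```

A periodic check would call verify_fixity for each master file and flag any file for which it returns False, since a mismatch means the file's contents no longer match what was originally stored.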

Web Service
Digital Collections offers a Web Service that facilitates programmatic search of the Dublin Core metadata and full-text OCR in the repository, with search requests and responses in XML format. More information, including the specifications of the service request and output, is available here.
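
To give a sense of what consuming an XML response containing Dublin Core metadata might look like, here is a minimal Python sketch. Only the Dublin Core namespace URI is standard; the results and record element names and the sample values are invented for illustration and are assumptions, not the service's actual schema, which is defined in the specification referenced above.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core elements namespace, in ElementTree's "{uri}" form.
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical response layout with placeholder values (not real service output).
SAMPLE_RESPONSE = """
<results xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>Example title one</dc:title>
    <dc:date>1850</dc:date>
  </record>
  <record>
    <dc:title>Example title two</dc:title>
  </record>
</results>
"""


def extract_dc_fields(xml_text: str) -> list[dict[str, list[str]]]:
    """Collect the Dublin Core child elements of each <record> (assumed name)."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iter("record"):
        fields: dict[str, list[str]] = {}
        for element in record:
            if element.tag.startswith(DC_PREFIX):
                name = element.tag[len(DC_PREFIX):]
                # Dublin Core elements are repeatable, so keep a list per name.
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append(fields)
    return records
```

Keeping a list of values per field reflects the fact that Dublin Core elements may repeat within a single record.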


History

For more information on the history of NLM's repository development, including the initial functional requirements and software evaluations, see the digital repository project history page.

For information about Digital Collections, contact us.

Page last updated: August 2024