Digitize your home collections like a pro
By Sidney Gao, digital collections manager, and James Van Mil, digital projects and preservation librarian
In celebration of the upcoming World Digital Preservation Day on November 2, UC Libraries’ Digital Collections Team is here with some tips and tricks to help everyone preserve and protect their personal archives. Digital preservation combines policies, strategies and actions that ensure access to digital content over time[1]. These strategies can be used on both library digital collections and personal archives at home so that photographs, memories and history are preserved well into the future. In this article we’ll discuss how the UC Libraries Digital Collections Team works to preserve library digital collections, and how you can do the same for your own collections of photos, documents and more at home.
[1] https://www.ala.org/alcts/resources/preserv/defdigpres0408
File naming
Consistent and clear file naming is one of the most important steps in managing the contents of a digital collection over time. Because the Digital Collections Team usually works with digitized library collections, our filenames typically draw on identifiers associated with the collection. In our UC Protest Posters and Campus Unrest Collection, for example, filenames are drawn from the finding aids for the two archival collections that make up this digital collection.
Example:
- protestPosters_UA-21_021.pdf – for a resource from our Protest and Political Posters Collection
- protestPosters_UA-04-12_006 – for a resource from our Campus Unrest Collection
When naming files, consider the most important information about the file; this will often be the first element of a filename. For a large collection of photos, a timestamp that lets you sort the items by the creation date may be first. For a collection of student papers from a course, an identifier for the student is probably more important than the name of the course (especially if you’re able to use a folder structure to provide more organization).
Bad filenames:
- Scan 1.tif
- Minutes December 2011.docx
- DCIM_00074.jpg
Better alternatives:
- studio-notepads-pg-001.tif
- party-planning-committee-minutes-2011-12.docx
- 2019-02-14-keynote-speaker01.jpg
In general, it is also good to avoid the use of spaces, mixed case or special characters in filenames, as different software interprets these elements differently. Consistent punctuation also helps make files more readable!
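The formatting advice above (no spaces, no mixed case, no special characters, consistent punctuation) can be sketched in a few lines of Python. This is only an illustration, not a tool the Digital Collections Team uses: the function name clean_filename and the choice of hyphens as the separator are assumptions made for the example. It can tidy the form of a name, but it cannot add the descriptive information that makes a filename meaningful; that part still has to come from you.

```python
import re
import unicodedata


def clean_filename(name: str) -> str:
    """Return a lowercase, hyphen-separated version of a filename.

    Hypothetical helper for illustration; hyphens as separators are an
    assumption, not a rule from the article.
    """
    # Split off the extension at the last dot, if there is one.
    stem, dot, ext = name.rpartition(".")
    if not dot:
        stem, ext = name, ""

    # Replace accented letters with their plain ASCII equivalents.
    stem = unicodedata.normalize("NFKD", stem)
    stem = stem.encode("ascii", "ignore").decode("ascii")

    # Collapse every run of spaces, underscores, or special characters
    # into a single hyphen, then lowercase the result.
    stem = re.sub(r"[^A-Za-z0-9]+", "-", stem).strip("-").lower()

    return f"{stem}.{ext.lower()}" if ext else stem


print(clean_filename("Scan 1.tif"))                  # scan-1.tif
print(clean_filename("Minutes December 2011.docx"))  # minutes-december-2011.docx
```

Running a helper like this over a folder gives every file the same shape, which makes it easier to then rename files by hand with more descriptive information.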
File storage
An organized file system is also key to maintaining a long-term, accessible collection; it can save a lot of time when searching for specific files down the line. The Digital Collections Team organizes files using a tiered folder structure. Files are nested within a root folder in descending order based on collection and file type (e.g., TIF or JPG).
When organizing files at home, such as a collection of family photos, consider which broad themes run across the files and can be used to divide them. For example, a root folder of “Vacation Photos” might have subfolders named “Cancun” or “Paris” for each vacation.
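As a rough sketch of the tiered structure described above (root folder, then collection, then file type), the following Python copies loose files into that layout. The function name organize_by_type and the folder names in the usage note are hypothetical, and the choice to copy rather than move files, and to skip anything that already exists at the destination, is a cautious assumption rather than a description of the team's actual workflow.

```python
import shutil
from pathlib import Path


def organize_by_type(source: Path, root: Path, collection: str) -> list[Path]:
    """Copy files from `source` into root/collection/<file type>/.

    Hypothetical helper for illustration. Files are copied, not moved,
    so the originals stay where they are until you have checked the result.
    """
    copied = []
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue  # leave subfolders alone

        # Use the lowercase extension as the file-type folder name.
        file_type = item.suffix.lstrip(".").lower() or "other"
        target_dir = root / collection / file_type
        target_dir.mkdir(parents=True, exist_ok=True)

        target = target_dir / item.name
        if target.exists():
            continue  # never overwrite a file that is already in place

        shutil.copy2(item, target)  # copy2 also keeps timestamps
        copied.append(target)
    return copied
```

For example, organize_by_type(Path("unsorted"), Path("vacation-photos"), "cancun") would place a file named beach-001.tif at vacation-photos/cancun/tif/beach-001.tif.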
It is also crucial to maintain a minimum of two copies of important files in case of corruption or loss. The Digital Collections Team maintains two to three copies of our “master,” or high-quality, files spread across a combination of local and cloud servers. You can do this at home by saving files on two different computers, or by uploading a second copy of your files to cloud-based storage.
Image Caption: The Digital Collections Team organizes files by collection to maintain a searchable and human-readable database.
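A second copy only protects you if it actually matches the original. One common way to check this is to compare checksums of the two files. The sketch below is an illustration under assumptions, not the team's own process: the function names sha256_of and backup_and_verify are hypothetical, and SHA-256 is simply one widely available checksum in Python's standard library.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(original: Path, backup_dir: Path) -> Path:
    """Copy `original` into `backup_dir` and confirm the copy is identical.

    Hypothetical helper for illustration. `backup_dir` could be a folder
    on an external drive or one that a cloud service keeps in sync.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy = backup_dir / original.name
    shutil.copy2(original, copy)

    # If the checksums differ, the copy was damaged or incomplete.
    if sha256_of(original) != sha256_of(copy):
        raise OSError(f"Backup of {original} does not match the original")
    return copy
```

Re-running sha256_of on both copies from time to time, and comparing the results, is a simple way to notice if one of them has become corrupted.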
Digitization and file format standards
When digitizing collections, our team carefully considers the best file format for both sharing and storing an item. For example, some of our high-quality images can be a gigabyte or larger, which is far more than a typical patron needs. Because of this, we often share compressed JPEGs in our repositories, but keep the high-resolution images in storage for when special cases arise.
We also consider the future of the file formats in our custody, and aim to use formats that will remain accessible as time passes and technology advances.
Helpful resources:
- Recommended Formats Statement from the Library of Congress
- Global Bit List of Endangered Digital Species
When digitizing analog or paper photographs at home, consider scanning at a higher quality setting and saving in the TIF file format. This will help you maintain clearer, higher-quality images. If you need to share photos with friends and family, you can use your high-quality TIFs to create smaller JPGs for easy transfer.
In conclusion
When thinking about digital preservation, it is imperative to consider the legacy of the files being preserved. How do you want them to be used in the future? Who do you want to use them? At UC Libraries, the Digital Collections Team applies digital preservation principles to preserve and protect the history contained in our special and digital collections for future students, researchers and the community. At home, you can apply the same simple principles to ensure that generations of family will cherish memories captured in photographs, or that important documents are always accessible when needed.