by Anissa Malady, Gina Storelli, Josh Roselle, Kimberly Hayden
This assignment gave us a hands-on approach to creating a digitization project using a professional commercial software system. This paper details the outcome.
Online Collection Assignment Using OCLC Contentdm by Anissa Malady, Gina Storelli, Josh Roselle, Kimberly Hayden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
LIBR 284-02, Scott, Spring 2011
Hidden California, a collection of outdoor photography, consists of 28 digital photographs that display the hidden, natural beauty of the state of California, and makes available to users a broad range of images from across the wide expanse of Northern California.
Planning and Selection
In planning for this digitization project, we decided to go through our personal collections and select images that would contribute to creating a cohesive collection. Our goal was to create a collection that would be usable and relevant to a wide range of users. We discussed what types of images we had in common and narrowed it down to nature photography taken in Northern California. We predict our collection could be useful to a wide range of users, including researchers, students, hikers, tourists, and nature enthusiasts.
While evaluating our selection criteria, we focused on the concepts of access, research activities, and users’ needs. With our goals set, and our mission and materials selected, we moved forward to create a concrete plan for implementation. Vital to our planning was establishing deadlines and keeping an open line of communication throughout the process of developing the Hidden California collection. We established several Google Documents where we could post questions and comments and share ideas. We also met in an initial Elluminate meeting, where we came up with a plan of action and timeline, and divided up paper-writing and editing duties. We then each uploaded our images and added our metadata. We regularly kept in touch throughout the process via email and we checked each other’s work often.
Technical Production Information
Our group used born-digital images, and our cameras varied in quality, which contributed to varying digital properties. However, we discussed optimal specifications for our collection as if we were to digitize from analog.
As we learned in class, every digitization project is different, and we need to take into consideration the unique needs of a project rather than blindly follow a given standard. Unfortunately, because our images were born-digital, we could not test any of our selected specifications to ensure that our choices were acceptable. After looking over many guides and considering our project, we decided to base our specifications on the California Digital Library’s (CDL) Guidelines for Digital Images (2011).
We chose CDL’s guidelines for several reasons. First, CDL is a trusted source. As the digital library for the University of California system, they possess vast experience in digitization and digital collections. Second, as our fictional institution is based in California and is related to California nature, we decided it would be best to follow CDL’s guidelines in case we collaborate with them in the future. Finally, we liked the fact that these guidelines are up to date, as CDL revises and updates their guidelines each year.
Following are the specifications we decided to use:
- Pixel Array: 4,000 pixels across long dimension of image area
- Bit depth: 24-bit RGB mode (8-bit, 3-channel)
- PPI: 600 (for 4”x5” or 3.5”x5” originals)
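To make the relationship between these figures concrete, the following Python sketch works through the arithmetic for a 4"x5" original. The function names are illustrative and are not taken from the CDL guidelines. The file size is a rough lower bound that ignores TIFF header overhead.

```python
def pixels_at_ppi(inches: float, ppi: int) -> int:
    """Pixels captured along one edge when scanning at the given PPI."""
    return round(inches * ppi)


def ppi_for_pixels(inches: float, target_pixels: int) -> float:
    """PPI needed for an edge of the given length to reach target_pixels."""
    return target_pixels / inches


def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int = 24) -> int:
    """Approximate size of the raw pixel data (24-bit RGB = 3 bytes per pixel)."""
    return width_px * height_px * bits_per_pixel // 8


# A 4" x 5" original scanned at 600 PPI.
long_edge = pixels_at_ppi(5, 600)
short_edge = pixels_at_ppi(4, 600)
size_mb = uncompressed_bytes(long_edge, short_edge) / 1_000_000

# PPI that a 5" long edge would need to reach a 4,000-pixel array.
needed_ppi = ppi_for_pixels(5, 4000)

print(long_edge, short_edge, size_mb, needed_ppi)
```

At 600 PPI, a 5-inch edge yields 3,000 pixels and roughly 21.6 MB of uncompressed pixel data. Reaching 4,000 pixels on an original of that size would take about 800 PPI, so the pixel array and the PPI figure are best read as separate targets to be checked against each original.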
Our group did not digitize analog images, and all of our images were in JPEG format; however, if we were digitizing our images from analog form, we would have created the following:
(1) Digital archival master. This would be in uncompressed TIFF format, scanned using the specifications above. We would not use these master files in our public database and they would only be used for archival purposes.
(2) Access version. This version would be used in our user-accessible database. It would be in JPEG format, with a lower resolution and smaller file size than our TIFF masters, but would still allow users to view the images clearly on a computer screen and would load fairly quickly.
(3) Thumbnail version. Although CONTENTdm creates this automatically, our thumbnail version would be in GIF format and would be intended for users to quickly scroll through our image collection and view the metadata simultaneously.
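The three versions above can be sketched as a small derivative plan. The formats come from the descriptions above. The 1,024-pixel and 160-pixel long-edge sizes for the access and thumbnail versions are hypothetical values chosen only for illustration, and the code computes target dimensions without performing any image conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivative:
    name: str
    file_format: str
    long_edge_px: int


# Formats follow the text; access and thumbnail sizes are assumptions.
DERIVATIVES = [
    Derivative("archival master", "TIFF", 4000),
    Derivative("access", "JPEG", 1024),
    Derivative("thumbnail", "GIF", 160),
]


def scaled_size(width: int, height: int, long_edge: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals long_edge, keeping the aspect ratio."""
    scale = long_edge / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def derivative_plan(width: int, height: int) -> dict:
    """Map each derivative name to its format and target pixel dimensions."""
    return {
        d.name: (d.file_format, scaled_size(width, height, d.long_edge_px))
        for d in DERIVATIVES
    }


# Plan for a master scanned at 4,000 x 3,200 pixels.
for name, (fmt, size) in derivative_plan(4000, 3200).items():
    print(f"{name}: {fmt} {size[0]}x{size[1]}")
```

Keeping the plan in one table makes it easy to adjust the access or thumbnail size later without touching the archival master specification.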
Metadata Fields and Formatting
In our group discussions, we ultimately decided to follow the content format set forth in the Illinois Digital Archives: Metadata Guidelines. We also consulted the University of Washington’s metadata guidelines and the Dublin Core website in order to determine which metadata fields would be most appropriate for our collection. Following the UW guidelines, we chose consistent formatting of our data in order to ensure reliable retrieval of records, to allow the records to be compatible with multiple databases for cross-searching, and to ease the maintenance and migration of data (Metadata guidelines, 2009).
Following are the 14 fields that we decided would be essential to the description of our collection:
- Digital Collection
- Subject (Library of Congress Subject Headings)
- Date Original
- Digitization Specifications
- Coverage (geographical)
We found that several fields were not relevant to our collection and decided to either remove them or leave them blank, including local subject, contributor, audience, relation, language, archival file, and OCLC#. We did not include the contributor field because the creators were the sole authors of each photo. Likewise, we did not use the language field because our collection is completely made up of images, and no language was necessary. We did not think it necessary to specify an audience, since we anticipate our collection would be of interest and use to a wide range of people. Similarly, there was simply no information to fill in the relation, archival file, and OCLC# fields.
Much of our discussion centered on the proper use and consistent formatting of terms for the subject headings and coverage fields. In the coverage fields, the records follow the hierarchical format of the Getty Thesaurus of Geographic Names: country–state–county–city. We used Library of Congress Subject Headings and deemed the local subject heading field unnecessary because the LCSH headings were sufficient in breadth and depth to describe our images.
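As a minimal sketch of how the coverage hierarchy could be kept consistent across records, the helper below joins the place levels in a fixed order. The function name, the double-hyphen separator, and the sample place names are assumptions made for illustration and do not come from the collection's records.

```python
def format_coverage(country: str, state: str, county: str = "", city: str = "",
                    sep: str = "--") -> str:
    """Join place levels from broadest to narrowest, skipping any that are empty."""
    levels = [country, state, county, city]
    cleaned = [level.strip() for level in levels if level and level.strip()]
    return sep.join(cleaned)


# Sample values only; a real record would use the form of name found in the thesaurus.
print(format_coverage("United States", "California", "Marin County", "Stinson Beach"))
print(format_coverage("United States", "California", "Humboldt County"))
```

Building the string from separate levels, rather than typing it by hand, keeps the order and punctuation identical from one record to the next, which is what makes the field reliable for searching.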
We determined that the description field should be as detailed as possible, accepting that it would result in some repetition of descriptive elements from other fields, like location. However, we restricted the scope of the description field to describing the content and story within the image, and we avoided information regarding the image’s creation or technical specifications.
We created a new field called “digitization specifications,” which is where we included technical information about the images, including the type of camera used to capture the images, the image dimensions, bit depth, and more. We decided not to specify whether images are born-digital, because all of our images were taken on digital cameras. For the identifier field, we chose to use the identification number assigned to each photo by our digital cameras. We also added a field that stated the name of the collection the images belong to.
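A record built from these fields might be checked for completeness before upload, as in the sketch below. The field names are those mentioned in this section. Every value is a made-up sample, and the choice of which fields to treat as required is an assumption rather than a CONTENTdm rule.

```python
# Fields treated as required here; this selection is an assumption for illustration.
REQUIRED_FIELDS = (
    "Digital Collection",
    "Subject",
    "Date Original",
    "Digitization Specifications",
    "Coverage",
    "Description",
    "Identifier",
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


# Sample record with hypothetical values.
sample_record = {
    "Digital Collection": "Hidden California",
    "Subject": "Waterfalls",
    "Date Original": "2010-07-04",
    "Digitization Specifications": "Digital camera (model not specified); 24-bit RGB; JPEG",
    "Coverage": "United States--California--Marin County",
    "Description": "A waterfall seen from a forest trail.",
    "Identifier": "IMG_0001",
}

print(missing_fields(sample_record))
print(missing_fields({"Digital Collection": "Hidden California", "Subject": " "}))
```

A check like this is one way a group could catch blank fields while reviewing each other's records.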
Working on team projects can often be a frightening and uncertain experience; however, we were fortunate to be part of a great team that worked well together and communicated frequently. We are all pleased with the outcome of our collection and the time and effort we put into it.
Overall, this project was a great experience and provided our team with the opportunity to put into practice concepts that we learned over the course of the term. We also appreciated having the chance to use an actual, professional database for our collection, even if it did take some time to figure out. We hope to be ready to use these skills in the future when we have the need to create digital collections.
CDL guidelines for digital images: version 2.0. (2011). Retrieved from www.cdlib.org/services/dsc/tools/docs/cdl_gdi_v2.pdf.
Getty Thesaurus of Geographic Names website. (n.d.). Retrieved from http://www.getty.edu/research/tools/vocabularies/tgn/index.html.
Hillmann, D. (2005). Using Dublin Core – the elements. Retrieved from http://dublincore.org/documents/usageguide/elements.shtml.
Humanities Advanced Technology and Information Institute, University of Glasgow, National Initiative for a Networked Cultural Heritage. (2002). The NINCH guide to good practice in the digital representation and management of cultural heritage materials: version 1.0. Retrieved from http://www.nyu.edu/its/humanities/ninchguide/.
Illinois Digital Archives: metadata guidelines. (2010). Retrieved from http://www.idaillinois.org/idaDataDictionary.pdf.
Library of Congress Authorities website. (2011). Retrieved from http://authorities.loc.gov/.
Metadata guidelines for collections using CONTENTdm. (2009). Retrieved from http://www.lib.washington.edu/msd/mig/advice/default.html.