MacMullen, W. John. A Research Design for Measuring Variation in Database Curators' Annotations Through Prospective Randomized Controlled Studies. Poster for the 3rd International Digital Curation Conference, Washington DC, December 2007.
A Research Design for Measuring Variation in Database Curators’ Annotations Through Prospective Randomized Controlled Studies.
W. John MacMullen
Graduate School of Library & Information Science, University of Illinois at Urbana-Champaign, 501 E. Daniel St., MC-493, Champaign, IL, 61820, USA


This project addresses the need for standardized research methodologies for investigating variation in the workflows used, and the outcomes produced, by curators of digital repositories when they perform standardized tasks such as indexing, metadata generation, and ontology term assignment. Research on variation in curators’ outcomes (called here ‘annotations’) is important for several reasons: to understand the nature and extent of that variation; to measure the internal consistency of curators’ work and develop related quality metrics; and to learn from best practices to assist in the education and training of new curators. Standardized evaluation methodologies may also allow for the creation of benchmarking metrics, enabling cross-resource comparisons of such quality facets as consistency, reliability, specificity, completeness, and validity [1].

The experimental design was previously used to investigate variation in human curators’ Gene Ontology (GO) annotations in model organism databases [2], but is described here so that it is generally applicable to the many contexts in which multiple curators perform curation or annotation tasks on documents or other forms of structured data and information. The research design consists of prospective randomized controlled studies, and its description covers document corpus construction, task formulation, document and group assignment, resulting data and analysis, and contextual considerations.
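
As an illustration of the document and group assignment step, the following is a minimal sketch of one way randomization could be carried out. It is not the procedure used in [2]: the function name, the group labels ("control", "treatment"), the number of documents per curator, and the seed are all hypothetical choices made for the example.

```python
import random

def randomize_study(curators, documents,
                    group_names=("control", "treatment"),
                    docs_per_curator=2, seed=2007):
    """Randomly allocate curators to groups and documents to curators.

    All parameter names and defaults are assumptions for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced

    # Shuffle the curators, then deal them out to groups in turn so that
    # group sizes differ by at most one.
    order = list(curators)
    rng.shuffle(order)
    groups = {name: [] for name in group_names}
    for i, curator in enumerate(order):
        groups[group_names[i % len(group_names)]].append(curator)

    # Give each curator a random sample (without replacement) from the corpus.
    corpus = list(documents)
    assignments = {c: rng.sample(corpus, docs_per_curator) for c in order}
    return groups, assignments

curators = ["c1", "c2", "c3", "c4"]
documents = ["doc1", "doc2", "doc3", "doc4", "doc5"]
groups, assignments = randomize_study(curators, documents)
```

Because several curators may draw the same document, overlapping assignments of this kind are what would make it possible to compare annotations of the same item across curators.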

