EXPECTING THE UNEXPECTED: EFFECTS OF DATA COLLECTION DESIGN CHOICES ON THE QUALITY OF CROWDSOURCED USER-GENERATED CONTENT
Publication type:
Article
Authors:
Lukyanenko, Roman; Parsons, Jeffrey; Wiersma, Yolanda F.; Maddah, Mahed
Affiliations:
Universite de Montreal; HEC Montreal; Memorial University of Newfoundland; Suffolk University
Journal:
MIS QUARTERLY
ISSN/ISBN:
0276-7783
DOI:
10.25300/MISQ/2019/14439
Publication date:
2019
Pages:
623+
Keywords:
citizen science
social media
information systems
knowledge
technologies
perceptions
foundation
challenges
instances
Abstract:
As crowdsourced user-generated content becomes an important source of data for organizations, a pressing question is how to ensure that data contributed by ordinary people outside of traditional organizational boundaries is of suitable quality to be useful for both known and unanticipated purposes. This research examines the impact of different information quality management strategies, and corresponding data collection design choices, on key dimensions of information quality in crowdsourced user-generated content. We conceptualize a contributor-centric information quality management approach focusing on instance-based data collection. We contrast it with the traditional consumer-centric fitness-for-use conceptualization of information quality that emphasizes class-based data collection. We present laboratory and field experiments conducted in a citizen science domain that demonstrate trade-offs among the quality dimensions of accuracy, completeness (including discoveries), and precision between the two information management approaches and their corresponding data collection designs. Specifically, we show that instance-based data collection results in higher accuracy, dataset completeness, and number of discoveries, but this comes at the expense of lower precision. We further validate the practical value of the instance-based approach by conducting an applicability check with potential data consumers (scientists, in our context of citizen science). In a follow-up study, we show, using human experts and supervised machine learning techniques, that substantial precision gains on instance-based data can be achieved with post-processing. We conclude by discussing the benefits and limitations of different information quality and data collection design choices for information quality in crowdsourced user-generated content.
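To make the instance-based idea concrete, the sketch below shows one hypothetical way such post-processing might look: contributors report free-form attributes of observed instances, and a supervised classifier trained on expert-labeled examples assigns class (species) labels after collection. The observation texts, species labels, and choice of a bag-of-words Naive Bayes model are illustrative assumptions, not the models or data used in the paper.

```python
# Hypothetical illustration (not the paper's implementation): post-processing
# instance-based contributions with a supervised classifier to recover class
# labels, improving precision without constraining contributors at entry time.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Free-form attribute descriptions reported by contributors (instance-based records).
train_observations = [
    "small bird, red chest, grey head, seen at feeder",
    "large black bird, croaking call, glossy feathers",
    "small bird, red breast, hopping on lawn",
    "big black bird, harsh call, perched on fence",
]
# Expert-assigned species classes used as training labels (assumed examples).
train_labels = ["American Robin", "Common Raven", "American Robin", "Common Raven"]

# Bag-of-words features feeding a simple Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_observations, train_labels)

# New instance-based reports are mapped to classes after data collection.
new_reports = ["bird with a red chest near the feeder"]
print(model.predict(new_reports))
```

In practice, such a classifier would be evaluated against expert judgments to quantify the precision gained relative to the raw instance-based records.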