Automated Speech Recognition Bias in Personnel Selection: The Case of Automatically Scored Job Interviews
Type:
Article
Authors:
Hickman, Louis; Langer, Markus; Saef, Rachel M.; Tay, Louis
Affiliations:
Virginia Polytechnic Institute & State University; University of Freiburg; Northern Illinois University; Purdue University System; Purdue University
Journal:
JOURNAL OF APPLIED PSYCHOLOGY
ISSN:
0021-9010
DOI:
10.1037/apl0001247
Publication date:
2025
Pages:
846-858
Keywords:
justice
artificial intelligence
Whisper
adverse impact
experiment
Abstract:
Organizations, researchers, and software increasingly use automatic speech recognition (ASR) to transcribe speech to text. However, ASR can be less accurate for (i.e., biased against) certain demographic subgroups. This is concerning, given that the machine-learning (ML) models used to automatically score video interviews use ASR transcriptions of interviewee responses as inputs. To address these concerns, we investigate the extent of ASR bias and its effects in automatically scored interviews. Specifically, we compare the accuracy of ASR transcription for English as a second language (ESL) versus non-ESL interviewees, people of color (and Black interviewees separately) versus White interviewees, and male versus female interviewees. Then, we test whether ASR bias causes bias in ML model scores, both in terms of differential convergent correlations (i.e., subgroup differences in correlations between observed and ML scores) and differential means (i.e., shifts in subgroup differences from observed to ML scores). To do so, we apply one human and four ASR transcription methods to two samples of mock video interviews (Ns = 1,014 and 414), and then we train and test models using these different transcripts to score multiple constructs. We observed significant bias in the commercial ASR services across nearly all comparisons, with the magnitude of bias differing across the ASR services. However, the transcription bias did not translate into meaningful measurement bias for the ML interview scores, whether in terms of differential convergent correlations or means. We discuss what these results mean for the nature of bias, fairness, and validity of ML models for scoring verbal open-ended responses.
Source URL: