Take caution in using LLMs as human surrogates

Publication type:
Article
Authors:
Gao, Yuan; Lee, Dokyun; Burtch, Gordon; Fazelpour, Sina
Affiliations:
Boston University; Boston University; Northeastern University; Northeastern University
Journal:
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
ISSN/ISBN:
0027-8424
DOI:
10.1073/pnas.2501660122
Publication date:
2025-06-17
Keywords:
ai
Abstract:
Recent studies suggest large language models (LLMs) can generate human-like responses, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. However, LLMs differ fundamentally from humans, relying on probabilistic patterns, absent the embodied experiences or survival objectives that shape human cognition. We assess the reasoning depth of LLMs using the 11-20 money request game. Nearly all advanced approaches fail to replicate human behavior distributions across many models. The causes of failure are diverse and unpredictable, relating to input language, roles, safeguarding, and more. These results warrant caution in using LLMs as surrogates or for simulating human behavior in research.
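Note: the 11-20 money request game cited in the abstract is the level-k reasoning task of Arad and Rubinstein (2012). As a rough illustration of the incentive structure being tested, the sketch below encodes the standard payoff rule of that game; the bonus of 20 and the 11-20 request range follow the canonical version, and the paper may use a variant, so this is illustrative rather than a description of the authors' exact implementation.

```python
def payoff(own_request: int, other_request: int, bonus: int = 20) -> int:
    """Payoff in the standard 11-20 money request game:
    a player earns the amount requested, plus a bonus if the request
    undercuts the opponent's request by exactly 1."""
    assert 11 <= own_request <= 20 and 11 <= other_request <= 20
    return own_request + (bonus if own_request == other_request - 1 else 0)

# Requesting 19 against an opponent who asks for 20 pays 39,
# while requesting the maximum 20 pays only 20, so the distribution
# of requests reveals how many steps of reasoning players perform.
print(payoff(19, 20))  # 39
print(payoff(20, 19))  # 20
```

Comparing the distribution of requests produced by an LLM against the empirical human distribution on this game is the kind of surrogate test the abstract reports most models failing.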