LEARNING WHILE EXPERIMENTING
Document type:
Article
Authors:
Damiano, Ettore; Li, Hao; Suen, Wing
Affiliations:
University of Toronto; University of British Columbia; University of Hong Kong
Journal:
ECONOMIC JOURNAL
ISSN/ISBN:
0013-0133
DOI:
10.1093/ej/uez043
Publication date:
2020
Pages:
65-92
Keywords:
development competition
Dynamic allocation
MODEL
Abstract:
An agent performing risky experimentation can benefit from suspending it to learn directly about the state. 'Positive' information acquisition seeks news that would confirm the state that favours experimentation. It is used as a last-ditch effort when the agent is pessimistic about the risky arm before abandoning it. 'Negative' information acquisition seeks news that would demonstrate that experimentation is futile. It is used as an insurance strategy to avoid wasteful experimentation when the agent is still optimistic. A higher reward from risky experimentation expands the region of beliefs in which the agent optimally chooses information acquisition rather than experimentation.