Three Years, Two Papers, One Course Off: Optimal Nonmonetary Reward Policies
Type:
Article
Authors:
Gupta, Shivam; Chen, Wei; Dawande, Milind; Janakiraman, Ganesh
Affiliations:
University of Nebraska System; University of Nebraska Lincoln; University of Kansas; University of Texas System; University of Texas Dallas
Journal:
MANAGEMENT SCIENCE
ISSN/ISBN:
0025-1909
DOI:
10.1287/mnsc.2022.4482
Publication date:
2023
Pages:
2852-2869
Keywords:
nonmonetary rewards
periodic review policies
principal-agent setting
Abstract:
We consider a principal who periodically offers a fixed and costly nonmonetary reward to agents to incentivize them to invest effort over the long run. An agent's output, as a function of his effort, is a priori uncertain and is worth a fixed per-unit value to the principal. The principal's goal is to design an attractive reward policy that specifies how the rewards are to be given to an agent over time based on that agent's past performance. This problem, which we denote by P, is motivated by practical examples from both academia (e.g., a reduced teaching load) and industry (e.g., Supplier of the Year awards). The following limited-term (LT) reward policy structure has been quite popular in practice. The principal evaluates each agent periodically; if an agent's performance over a certain (limited) number of periods in the immediate past exceeds a predefined threshold, then the principal rewards him for a certain (limited) number of periods in the immediate future. When agents' outputs are deterministic in their efforts, we show that there always exists an optimal policy that is an LT policy, and we obtain such a policy. When agents' outputs are stochastic, we show that the class of LT policies may not contain any optimal policy of problem P but is guaranteed to contain policies that are arbitrarily near optimal. Given any epsilon > 0, we show how to obtain an LT policy whose performance is within epsilon of that of an optimal policy. This guarantee depends crucially on the use of sufficiently long histories of the agents' outputs. We also analyze LT policies with short histories and derive structural insights on the role played by (i) the length of the available history and (ii) the variability in the random variable governing an agent's output. We show that the average performance of these policies is within 5% of the optimum, justifying their popularity in practice.
We then introduce and analyze the class of score-based reward policies; we show that this class is guaranteed to contain an optimal policy, and we obtain such a policy. Finally, we analyze a generalization in which the principal has a limited number of rewards in any given period and show that the class of score-based policies, with modifications to accommodate the limited availability of the rewards, continues to contain an optimal solution for the principal.
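
The limited-term (LT) structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model or algorithm: the function name `lt_reward_schedule` and its parameters are hypothetical, "performance" is assumed to be the sum of outputs over the look-back window, evaluation is assumed to happen at the start of every period, and the agent's effort choice is not modeled at all.

```python
from typing import List


def lt_reward_schedule(outputs: List[float], history_len: int,
                       threshold: float, reward_len: int) -> List[bool]:
    """Return a list whose entry t says whether period t is rewarded.

    Assumptions (not taken from the paper): performance is the sum of
    outputs over the previous `history_len` periods, and the principal
    evaluates the agent at the start of every period.
    """
    rewarded = [False] * len(outputs)
    # Evaluation needs a full look-back window, so start at history_len.
    for t in range(history_len, len(outputs)):
        recent_performance = sum(outputs[t - history_len:t])
        if recent_performance > threshold:
            # Reward the agent for the next `reward_len` periods,
            # truncated at the end of the horizon.
            for s in range(t, min(t + reward_len, len(outputs))):
                rewarded[s] = True
    return rewarded


# Hypothetical reading of the title: look back three years, require
# more than one paper (i.e., at least two), grant one rewarded period.
papers_per_year = [1, 0, 1, 0, 0, 2]
schedule = lt_reward_schedule(papers_per_year, history_len=3,
                              threshold=1, reward_len=1)
print(schedule)
```

In this toy run only the fourth period is rewarded, because years one to three are the only three-year window with more than one paper that is followed by a period inside the horizon.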