Sequential Interdiction with Incomplete Information and Learning

成果类型:
Article
署名作者:
Borrero, Juan S.; Prokopyev, Oleg A.; Saure, Denis
署名单位:
Oklahoma State University System; Oklahoma State University - Stillwater; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; Universidad de Chile
刊物名称:
OPERATIONS RESEARCH
ISSN/ISSBN:
0030-364X
DOI:
10.1287/opre.2018.1773
发表日期:
2019
页码:
72-89
关键词:
bilevel algorithm
摘要:
We present a framework for a class of sequential decision-making problems in the context of general interdiction problems, in which a leader and a follower repeatedly interact. At each period, the leader allocates resources to disrupt the performance of the follower (e.g., as in defender-attacker or network interdiction problems), who, in turn, minimizes some cost function over a set of activities that depends on the leader's decision. Although the follower has complete knowledge of the follower's problem, the leader has only partial information and needs to learn about the cost parameters, available resources, and the follower's activities from the feedback generated by the follower's actions. We measure policy performance in terms of its time-stability, defined as the number of periods it takes for the leader to match the actions of an oracle with complete information. In particular, we propose a class of greedy and robust policies and show that these policies are weakly optimal, eventually match the oracle's actions, and provide a real-time certificate of optimality. We also study a lower bound on any policy performance based on the notion of a semioracle. Our numerical experiments demonstrate that the proposed policies consistently outperform a reasonable benchmark and perform fairly close to the semioracle.