-
Authors: Bhandari, Jalaj; Russo, Daniel
Affiliations: Columbia University; Columbia University
Abstract: Policy gradient methods apply to complex, poorly understood control problems by performing stochastic gradient descent over a parameterized class of policies. Unfortunately, even for simple control problems solvable by standard dynamic programming techniques, policy gradient algorithms face nonconvex optimization problems and are widely understood to converge only to a stationary point. This work identifies structural properties, shared by several classic control problems, that ensure the pol...
-
Authors: Shi, Yunting; Liu, Nan; Wan, Guohua
Affiliations: Shanghai Jiao Tong University; Boston College
Abstract: The current emergency response guidelines suggest giving priority of treatment to those victims whose initial health conditions are more critical. Although this makes intuitive sense, it does not consider potential deterioration of less critical victims. Deterioration may lead to longer treatment time and irrecoverable health damage, but could be avoided if these victims were to receive care in time. Informed by a unique timestamps data set of surgeries carried out in a field hospital set up i...
-
Authors: Carlsson, John Gunnar; Liu, Sheng; Salari, Nooshin; Yu, Han
Affiliations: University of Southern California; University of Toronto; University of Alberta; McMaster University
Abstract: On-time last-mile delivery is expanding rapidly as people expect faster delivery of goods ranging from groceries to medicines. Managing on-time delivery systems is challenging because of the underlying uncertainties and the combinatorial nature of the routing decision. In practice, the efficiency of such systems also hinges on the driver's familiarity with the local neighborhood. This paper studies the optimal region partitioning policy to minimize the expected delivery time of customer orders in a ...
-
Authors: Poursoltani, Mehran; Delage, Erick; Georghiou, Angelos
Affiliations: Universite de Montreal; HEC Montreal; Universite de Montreal; HEC Montreal; University of Cyprus
Abstract: Within the context of optimization under uncertainty, a well-known alternative to minimizing expected value or the worst-case scenario consists in minimizing regret. In a multistage stochastic programming setting with a discrete probability distribution, we explore the idea of risk-averse regret minimization, where the benchmark policy can only benefit from foreseeing Δ steps into the future. The Δ-regret model naturally interpolates between the popular ex ante and ex post reg...
-
Authors: Du, Lilun; Li, Qing; Yu, Peiwen
Affiliations: City University of Hong Kong; Hong Kong University of Science & Technology; Chongqing University
Abstract: We model a multiphase and high-volume recruitment process as a large-scale dynamic program. The success of the process is measured by a reward, which is the total assessment score of accepted candidates minus the penalty cost of the number of accepted candidates in the end deviating from a preset hiring target. For a recruiter, two questions are important: How many offers should be made in each phase? And how does the number of phases affect the reward? We consider an upper bound, which is obt...
-
Authors: Acemoglu, Daron; Makhdoumi, Ali; Malekian, Azarakhsh; Ozdaglar, Asuman
Affiliations: Massachusetts Institute of Technology (MIT); Duke University; University of Toronto; Massachusetts Institute of Technology (MIT)
Abstract: We study the effects of testing policy on voluntary social distancing and the spread of an infection. Agents decide their social activity level, which determines a social network over which the virus spreads. Testing enables the isolation of infected individuals, slowing down the infection. However, greater testing also reduces voluntary social distancing or increases social activity, exacerbating the spread of the virus. We show that the effect of testing on infections is nonmonotone. This no...
-
Authors: Daryalal, Maryam; Bodur, Merve; Luedtke, James R.
Affiliations: Universite de Montreal; HEC Montreal; University of Toronto; University of Wisconsin System; University of Wisconsin Madison
Abstract: Multistage stochastic programs can be approximated by restricting policies to follow decision rules. Directly applying this idea to problems with integer decisions is difficult because of the need for decision rules that lead to integral decisions. In this work, we introduce Lagrangian dual decision rules (LDDRs) for multistage stochastic mixed-integer programming (MSMIP), which overcome this difficulty by applying decision rules in a Lagrangian dual of the MSMIP. We propose two new bounding t...
-
Authors: Scroccaro, Pedro Zattoni; Atasoy, Bilge; Esfahani, Peyman Mohajerin
Affiliations: Delft University of Technology; Delft University of Technology
-
Authors: Gao, Xuefeng; Huang, Junfei; Zhang, Jiheng
Affiliations: Chinese University of Hong Kong; Chinese University of Hong Kong; Hong Kong University of Science & Technology
Abstract: Motivated by the recent popularity of omnichannel service systems, we analyze the joint admission and scheduling control of a queueing system with two classes of customers: online and walk-in. Unlike walk-in customers, online customers are given a target time for pick up upon placing an order. Thus, in addition to minimizing the waiting costs of walk-in customers and the rejection cost of both classes, we need to minimize the earliness and tardiness costs of online customers. Such a distinctiv...
-
Authors: Kennedy, Adrian P.; Sethi, Suresh P.; Siu, Chi Chung; Yam, Sheung Chi Phillip
Affiliations: Chinese University of Hong Kong; University of Texas System; University of Texas Dallas; Hang Seng University of Hong Kong
Abstract: We propose a flexible yet tractable dynamic advertising model called the generalized Sethi model to capture different market penetration rates across various media and markets via advertising. Specifically, the generalized Sethi model employs a Cobb-Douglas production function of advertising expenditure and the untapped market share with constant returns to scale. It encompasses some standard dynamic advertising models as particular cases. Moreover, the model's flexibility does not compromise ...