-
Authors: Li, Xiaocheng; Zhong, Huaiyang; Brandeau, Margaret L.
Affiliations: Stanford University
Abstract: The goal of a traditional Markov decision process (MDP) is to maximize expected cumulative reward over a defined horizon (possibly infinite). In many applications, however, a decision maker may be interested in optimizing a specific quantile of the cumulative reward instead of its expectation. In this paper, we consider the problem of optimizing the quantiles of the cumulative rewards of an MDP, which we refer to as a quantile Markov decision process (QMDP). We provide analytical results chara...
-
Authors: Dey, Santanu S.; Mazumder, Rahul; Wang, Guanyi
Affiliations: University System of Georgia; Georgia Institute of Technology; Massachusetts Institute of Technology (MIT)
Abstract: Principal component analysis (PCA) is one of the most widely used dimensionality reduction tools in scientific data analysis. The PCA direction, given by the leading eigenvector of a covariance matrix, is a linear combination of all features with nonzero loadings; this impedes interpretability. Sparse principal component analysis (SPCA) is a framework that enhances interpretability by incorporating an additional sparsity requirement in the feature weights (factor loadings) while finding a dire...
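The contrast between the dense PCA direction and a sparse one can be illustrated with a minimal sketch. It is not the method of the paper above: it uses exhaustive search over supports, which is only feasible for tiny problems, and relies on the fact that maximizing x^T A x over unit vectors with at most k nonzeros equals the largest leading eigenvalue among all k-by-k principal submatrices of A. The function names, the synthetic data, and the choice k = 2 are illustrative assumptions.

```python
import itertools
import numpy as np

def leading_eigvec(A):
    """Return the largest eigenvalue of symmetric A and its unit eigenvector."""
    w, V = np.linalg.eigh(A)  # eigenvalues in ascending order
    return w[-1], V[:, -1]

def sparse_pca_bruteforce(A, k):
    """Exhaustive k-sparse PCA: try every support of size k (tiny d only)."""
    d = A.shape[0]
    best_val, best_x = -np.inf, None
    for support in itertools.combinations(range(d), k):
        idx = np.array(support)
        # Leading eigenpair of the principal submatrix restricted to the support
        val, vec = leading_eigvec(A[np.ix_(idx, idx)])
        if val > best_val:
            x = np.zeros(d)
            x[idx] = vec  # embed the small eigenvector into d dimensions
            best_val, best_x = val, x
    return best_val, best_x

# Hypothetical data: 200 samples, 6 features (illustration only)
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
A = np.cov(X, rowvar=False)

pca_val, pca_vec = leading_eigvec(A)                 # typically dense loadings
spca_val, spca_vec = sparse_pca_bruteforce(A, k=2)   # at most 2 nonzero loadings
```

The sparse direction explains no more variance than the PCA direction, but it involves only k features, which is the interpretability trade-off the abstract describes.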
-
Authors: Birge, John R.; Feng, Yifan; Keskin, N. Bora; Schultz, Adam
Affiliations: University of Chicago; National University of Singapore; Duke University; Uber Technologies, Inc.
Abstract: We study the profit-maximization problem of a market maker in a spread betting market. In this market, the market maker quotes cutoff lines for the outcome of a certain future event as prices, and bettors bet on whether the event outcome exceeds the cutoff lines. Anonymous bettors with heterogeneous strategic behavior and information levels participate in the market. The market maker has limited information on the event outcome distribution, aiming to extract information from the market (i.e.,...
-
Authors: Wang, Zhengli; Zenios, Stefanos
Affiliations: University of Hong Kong; Stanford University
Abstract: We model the creation of a new venture with a novel drift-variance diffusion control framework in which the state of the venture is captured by a diffusion process. The entrepreneur creating the venture chooses costly controls, which determine both the drift and the variance of the process. When the process reaches an upper boundary, the venture succeeds and the entrepreneur receives a reward. When the process reaches a lower boundary, the venture fails. The entrepreneur can choose between two...
-
Authors: Perez-Salazar, Sebastian; Menache, Ishai; Singh, Mohit; Toriello, Alejandro
Affiliations: University System of Georgia; Georgia Institute of Technology; Microsoft
Abstract: Cloud computing has motivated renewed interest in resource allocation problems with new consumption models. A common goal is to share a resource, such as CPU or I/O bandwidth, among distinct users with different demand patterns as well as different quality of service requirements. To ensure these service requirements, cloud offerings often come with a service level agreement (SLA) between the provider and the users. An SLA specifies the amount of a resource a user is entitled to utilize. In man...
-
Authors: Gao, Xuefeng; Gurbuzbalaban, Mert; Zhu, Lingjiong
Affiliations: Chinese University of Hong Kong; Rutgers University System; Rutgers University New Brunswick; State University System of Florida; Florida State University
Abstract: Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is a variant of stochastic gradients with momentum where a controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates toward a global minimum. Many works report its empirical success in practice for solving stochastic nonconvex optimization problems; in particular, it has been observed to outperform overdamped Langevin Monte Carlo-based methods, such as stochastic gradient Langevin dynamics (SGLD)...
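The idea of momentum plus scaled Gaussian noise can be sketched as follows. This is one common Euler-type discretization of underdamped Langevin dynamics, not necessarily the exact scheme analyzed in the paper above. The step size `eta`, friction `gamma`, inverse temperature `beta`, and the toy quadratic objective are all illustrative assumptions.

```python
import numpy as np

def sghmc(grad, x0, eta=0.01, gamma=1.0, beta=1e4, n_steps=2000, seed=0):
    """Momentum iteration with injected Gaussian noise (hypothetical parameters).

    grad(x, rng) should return a stochastic estimate of the gradient at x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)  # momentum (velocity) variable
    noise_scale = np.sqrt(2.0 * gamma * eta / beta)  # scaling of injected noise
    for _ in range(n_steps):
        g = grad(x, rng)
        x_next = x + eta * v  # position moves along the current velocity
        # Velocity: friction, gradient force, and properly scaled Gaussian noise
        v = v - eta * (gamma * v + g) + noise_scale * rng.standard_normal(x.shape)
        x = x_next
    return x

def noisy_quadratic_grad(x, rng):
    """Gradient of 0.5 * ||x||^2 plus small zero-mean noise (toy objective)."""
    return x + 0.01 * rng.standard_normal(x.shape)

x_final = sghmc(noisy_quadratic_grad, x0=[2.0, -1.5])
```

With a large `beta` the injected noise is small and the iterates settle near the minimizer at the origin; a smaller `beta` adds more exploration, which is what helps on nonconvex objectives.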
-
Authors: Allon, Gad; Drakopoulos, Kimon; Manshadi, Vahideh
Affiliations: University of Pennsylvania; University of Southern California; Yale University
Abstract: In this paper, we study a model of information consumption in which consumers sequentially interact with a platform that offers a menu of signals (posts) about an underlying state of the world (fact). At each time, incapable of consuming all posts, consumers screen the posts and only select (and consume) one from the offered menu. We show that, in the presence of uncertainty about the accuracy of these posts and as the number of posts increases, adverse effects, such as slow learning and polar...
-
Authors: Beyhaghi, Hedyeh; Golrezaei, Negin; Leme, Renato Paes; Pai, Martin; Sivan, Balasubramanian
Affiliations: Toyota Technological Institute - Chicago; Massachusetts Institute of Technology (MIT); Alphabet Inc.; Google Incorporated
Abstract: We study revenue maximization through sequential posted-price (SPP) mechanisms in single-dimensional settings with n buyers and independent but not necessarily identical value distributions. We construct the SPP mechanisms by considering the best of two simple pricing rules: one that imitates the revenue optimal mechanism, namely, the Myersonian mechanism, via the taxation principle and the other that posts a uniform price. Our pricing rules are rather generalizable and yield the first improve...
-
Authors: Stolyar, Alexander L.; Wang, Qiong
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: We study the classical single-item inventory system in which unsatisfied demands are backlogged. Replenishment lead times are random, independent, and identically distributed, causing orders to cross in time. We develop a new inventory policy to exploit implications of lead time randomness and order crossover, and evaluate its performance by asymptotic analysis and simulations. Our policy does not follow the basic principle of the constant base stock (CBS) policy, or more generally, (s, S) and (R, q) p...
-
Authors: Balseiro, Santiago R.; Kim, Anthony; Russo, Daniel
Affiliations: Columbia University; Amazon.com
Abstract: We consider a principal who repeatedly interacts with a strategic agent holding private information. In each round, the agent observes an idiosyncratic shock drawn independently and identically from a distribution known to the agent but not to the principal. The utilities of the principal and the agent are determined by the values of the shock and outcomes that are chosen by the principal based on reports made by the agent. When the principal commits to a dynamic mechanism, the agent best-resp...