-
Authors: Vera, Alberto; Banerjee, Siddhartha; Gurvich, Itai
Affiliations: Cornell University; Northwestern University
Abstract: We develop a framework for designing simple and efficient policies for a family of online allocation and pricing problems that includes online packing, budget-constrained probing, dynamic pricing, and online contextual bandits with knapsacks. In each case, we evaluate the performance of our policies in terms of their regret (i.e., additive gap) relative to an offline controller that is endowed with more information than the online controller. Our framework is based on Bellman inequalities, whi...
-
Authors: Alptekinoglu, Aydin; Semple, John H.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Southern Methodist University
Abstract: We investigate analytical and empirical properties of the Heteroscedastic Exponomial Choice (HEC) model to lay the groundwork for its use in theoretical and empirical studies that build demand models on a discrete choice foundation. The HEC model generalizes the Exponomial Choice (EC) model by including choice-specific variances for the random components of utility (the error terms). We show that the HEC model inherits some of the properties found in the EC model: closed-form choice probabilit...
-
Authors: Balseiro, Santiago; Kim, Anthony; Mahdian, Mohammad; Mirrokni, Vahab
Affiliations: Columbia University; Amazon.com; Alphabet Inc.; Google Incorporated
Abstract: In online advertising, advertisers purchase ad placements by participating in a long sequence of repeated auctions. One of the most important features that advertising platforms often provide and advertisers often use is budget management, which allows advertisers to control their cumulative expenditures. Advertisers typically declare the maximum daily amount they are willing to pay, and the platform adjusts allocations and payments to guarantee that cumulative expenditures do not exceed budge...
-
Authors: Chan, Timothy C. Y.; Fernandes, Craig; Puterman, Martin L.
Affiliations: University of Toronto; University of British Columbia
Abstract: To develop a novel approach for performance assessment, this paper considers the problem of computing value functions in professional American football. We provide a theoretical justification for using a dynamic programming approach to estimating value functions in sports by formulating the problem as a Markov chain for two asymmetric teams. We show that the Bellman equation has a unique solution equal to the bias of the underlying infinite horizon Markov reward process. This result provides, ...
-
Authors: Tian, Feng; Sun, Peng; Duenyas, Izak
Affiliations: University of Michigan System; University of Michigan; Duke University
Abstract: A principal hires an agent to repair a machine when it is down and maintain it when it is up and earns a revenue flow when the machine is up. Both the up- and downtimes follow exponential distributions. If the agent exerts effort, the downtime is shortened, and uptime is prolonged. Effort, however, is costly to the agent and unobservable to the principal. We study optimal dynamic contracts that always induce the agent to exert effort while maximizing the principal's profits. We formulate the c...
-
Authors: Bhandari, Jalaj; Russo, Daniel; Singal, Raghav
Affiliations: Columbia University
Abstract: Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement learning, its theoretical analysis has proved challenging and few guarantees on its statistical efficiency are available. In this work, we provide a simple and explicit finite time analysis of temporal difference learning with linear function approximation. Excep...
-
Authors: Chen, Ningyuan; Gallego, Guillermo
Affiliations: University of Toronto; Hong Kong University of Science & Technology
Abstract: Personalized pricing analytics is becoming an essential tool in retailing. Upon observing the personalized information of each arriving customer, the firm needs to set a price accordingly based on the covariates, such as income, education background, and past purchasing history, to extract more revenue. For new entrants of the business, the lack of historical data may severely limit the power and profitability of personalized pricing. We propose a nonparametric pricing policy to simultaneously...
-
Authors: Blanchet, Jose; Kang, Yang
Affiliations: Stanford University; Columbia University
Abstract: We present a novel inference approach that we call sample out-of-sample inference. The approach can be used widely, ranging from semisupervised learning to stress testing, and it is fundamental in the application of data-driven distributionally robust optimization. Our method enables measuring the impact of plausible out-of-sample scenarios in a given performance measure of interest, such as a financial loss. The methodology is inspired by empirical likelihood (EL), but we optimize the empiric...
-
Authors: Hwang, Dawsen; Jaillet, Patrick; Manshadi, Vahideh
Affiliations: Alphabet Inc.; Google Incorporated; Massachusetts Institute of Technology (MIT); Yale University
Abstract: For online resource allocation problems, we propose a new demand arrival model where the sequence of arrivals contains both an adversarial component and a stochastic one. Our model requires no demand forecasting; however, because of the presence of the stochastic component, we can partially predict future demand as the sequence of arrivals unfolds. Under the proposed model, we study the problem of the online allocation of a single resource to two types of customers and design online algorithms...
-
Authors: Zhu, Han; Chen, Youhua (Frank); Hu, Ming; Yang, Yi
Affiliations: Dongbei University of Finance & Economics; City University of Hong Kong; University of Toronto; Zhejiang University
Abstract: We study a continuous-review, two-echelon inventory system with one central warehouse, multiple local facilities, and each facility facing random demand. Local facilities replenish their stock from the central warehouse (or distribution center), which in turn places orders at an outside supplier with ample supply. Inventory replenishment at each location incurs a fixed-plus-variable cost for each shipment. The optimal policy remains unknown, and even if it exists, such a policy must be extreme...