Online Learning with Sample Selection Bias
Publication type:
Article
Authors:
Singhvi, Divya; Singhvi, Somya
Affiliations:
New York University; University of Southern California
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2023.0223
Publication date:
2025
Keywords:
bandits
donations
personalization
sample-selection bias
crowd-funding platforms
operations for social good
Abstract:
We consider the problem of personalized recommendations on online platforms, where user preferences are unknown, and users interact with the platform through a series of sequential decisions (such as clicking to watch on video platforms or clicking to donate on donation platforms). The platform aims to maximize the final outcome (e.g., viewing duration on video platforms or donations on donation platforms). However, the platform only observes the final outcome for users who complete the first stage (clicking on the recommendation). The final outcome for users who do not complete the first stage (not clicking on the recommendation) remains unobserved (also referred to as funneling). This censoring of outcomes creates a selection bias issue, as the observed outcomes at different stages are often correlated. We demonstrate that failing to account for this selection bias results in biased estimates and suboptimal recommendations. In fact, personalized learning algorithms that perform well in standard settings incur linear regret in this setting. Therefore, we propose the sample selection bandit (SSB) algorithm, which combines Heckman's two-step estimator with the optimism under uncertainty principle to address the sample selection bias issue. We show that the SSB algorithm achieves a rate-optimal regret rate (up to logarithmic terms) of O(√T). Furthermore, we conduct extensive numerical experiments on both synthetic data and real donation data collected from GoFundMe (a crowdfunding platform), demonstrating significant improvements over benchmark state-of-the-art learning algorithms in this setting.
Source URL:
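The abstract states that the SSB algorithm combines Heckman's two-step estimator with the optimism-under-uncertainty principle. As background, below is a minimal Python sketch of the Heckman correction in isolation, not the SSB algorithm itself; the synthetic data-generating process, all variable names, and the use of statsmodels/scipy are illustrative assumptions rather than the paper's model or code.

```python
# Minimal sketch of Heckman's two-step correction on synthetic data.
# Assumptions (not from the paper): a probit click (selection) stage and a
# linear outcome stage with correlated errors, which induces selection bias.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
n, d = 5000, 2

X = rng.normal(size=(n, d))  # covariates (e.g., user/recommendation features)
# Correlated stage-1/stage-2 errors are the source of the selection bias.
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
gamma = np.array([1.0, -0.5])   # selection-equation coefficients (assumed)
beta = np.array([0.8, 1.2])     # outcome-equation coefficients (assumed)
click = X @ gamma + errors[:, 0] > 0   # stage 1: click decision
y = X @ beta + errors[:, 1]            # stage 2: outcome, observed only if click

# Step 1: probit model for the click (selection) equation, fit on ALL users.
Xc = sm.add_constant(X)
probit = sm.Probit(click.astype(float), Xc).fit(disp=0)
xb = Xc @ probit.params
imr = norm.pdf(xb) / norm.cdf(xb)  # inverse Mills ratio

# Step 2: OLS for the outcome on the SELECTED (clicked) sample only,
# adding the inverse Mills ratio as a regressor to absorb the bias term.
X2 = np.column_stack([Xc[click], imr[click]])
corrected = sm.OLS(y[click], X2).fit()

# Naive OLS on clicked users that ignores selection, for comparison.
naive = sm.OLS(y[click], Xc[click]).fit()

print("corrected:", corrected.params[1:1 + d])  # close to beta
print("naive:    ", naive.params[1:1 + d])      # biased by funneling
```

On this synthetic data the corrected coefficients recover beta closely, while the naive fit on clicked users alone is biased, mirroring the funneling issue the abstract describes; SSB additionally wraps such an estimator in an optimistic (bandit) exploration scheme, which this sketch does not attempt.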