Langevin Dynamics Based Algorithm e-THεO POULA for Stochastic Optimization Problems with Discontinuous Stochastic Gradient
Publication type:
Article
Authors:
Lim, Dong-Young; Neufeld, Ariel; Sabanis, Sotirios; Zhang, Ying
Affiliations:
Ulsan National Institute of Science & Technology (UNIST); Nanyang Technological University; University of Edinburgh; Alan Turing Institute; National Technical University of Athens; Hong Kong University of Science & Technology (Guangzhou)
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.2022.0307
Publication date:
2025
Keywords:
dependent data streams
strong-convergence
Abstract:
We introduce a new Langevin dynamics based algorithm, called the extended tamed hybrid ε-order polygonal unadjusted Langevin algorithm (e-THεO POULA), to solve optimization problems with discontinuous stochastic gradients, which naturally appear in real-world applications such as quantile estimation, vector quantization, conditional value at risk (CVaR) minimization, and regularized optimization problems involving rectified linear unit (ReLU) neural networks. We demonstrate both theoretically and numerically the applicability of the e-THεO POULA algorithm. More precisely, under the conditions that the stochastic gradient is locally Lipschitz in average and satisfies a certain convexity at infinity condition, we establish nonasymptotic error bounds for e-THεO POULA in Wasserstein distances and provide a nonasymptotic estimate for the expected excess risk, which can be controlled to be arbitrarily small. Three key applications in finance and insurance are provided, namely, multiperiod portfolio optimization, transfer learning in multiperiod portfolio optimization, and insurance claim prediction, which involve neural networks with (Leaky)ReLU activation functions. Numerical experiments conducted using real-world data sets illustrate the superior empirical performance of e-THεO POULA compared with SGLD (stochastic gradient Langevin dynamics), TUSLA (tamed unadjusted stochastic Langevin algorithm), adaptive moment estimation, and Adaptive Moment Estimation with a Strongly Non-Convex Decaying Learning Rate in terms of model accuracy.
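
Illustrative note: to make the idea of a discontinuous stochastic gradient concrete, the following is a minimal sketch of quantile estimation, one of the applications named in the abstract. It uses plain SGLD, one of the baseline methods the abstract lists. It is not e-THεO POULA and includes none of that algorithm's taming or other modifications. The function names, step size, inverse temperature `beta`, and the standard normal data stream are all assumptions chosen for illustration.

```python
import numpy as np


def quantile_stochastic_gradient(theta, x, q):
    """Almost-everywhere derivative in theta of the pinball loss
    q * max(x - theta, 0) + (1 - q) * max(theta - x, 0).
    It jumps at theta == x, so it is discontinuous in theta."""
    return (1.0 if x < theta else 0.0) - q


def sgld_quantile(samples, q, step=0.01, beta=1e4, theta0=0.0, seed=0):
    """Plain SGLD iteration (illustrative baseline, not e-THεO POULA):
    theta <- theta - step * H(theta, x) + sqrt(2 * step / beta) * xi,
    with xi drawn from a standard normal distribution."""
    rng = np.random.default_rng(seed)
    noise_scale = np.sqrt(2.0 * step / beta)
    theta = float(theta0)
    path = np.empty(len(samples))
    for n, x in enumerate(samples):
        grad = quantile_stochastic_gradient(theta, x, q)
        theta = theta - step * grad + noise_scale * rng.standard_normal()
        path[n] = theta
    return path


if __name__ == "__main__":
    # Hypothetical data stream: i.i.d. standard normal observations.
    data = np.random.default_rng(123).standard_normal(50_000)
    path = sgld_quantile(data, q=0.9)
    # Average the second half of the path to smooth out the noise.
    print("estimated 0.9-quantile:", path[len(path) // 2:].mean())
```

The iterates fluctuate around the target quantile rather than converging to a point, which is why the usage example averages the tail of the path. For a standard normal stream the 0.9-quantile is approximately 1.2816, so the printed estimate can be compared against that value.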