-
Authors: Tan, Kean Ming; Sun, Qiang; Witten, Daniela
Affiliations: University of Michigan System; University of Michigan; University of Toronto; University of Washington; University of Washington Seattle
Abstract: We propose a sparse reduced rank Huber regression for analyzing large and complex high-dimensional data with heavy-tailed random noise. The proposed method is based on a convex relaxation of a rank- and sparsity-constrained nonconvex optimization problem, which is then solved using a block coordinate descent and an alternating direction method of multipliers algorithm. We establish nonasymptotic estimation error bounds under both Frobenius and nuclear norms in the high-dimensional setting. This...
-
Authors: Guo, Xu; Ren, Haojie; Zou, Changliang; Li, Runze
Affiliations: Beijing Normal University; Shanghai Jiao Tong University; Nankai University; Nankai University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: The hard thresholding rule is commonly adopted in feature screening procedures to screen out unimportant predictors for ultrahigh-dimensional data. However, different thresholds are required to adapt to different contexts of screening problems, and an appropriate thresholding magnitude usually varies with the model and error distribution. With an ad-hoc choice, it is unclear whether all of the important predictors are selected or not, and it is very likely that the procedures would include many uni...
-
Authors: Wang, Jiangzhou; Zhang, Jingfei; Liu, Binghui; Zhu, Ji; Guo, Jianhua
Affiliations: Northeast Normal University - China; Northeast Normal University - China; Southern University of Science & Technology; University of Miami; University of Michigan System; University of Michigan
Abstract: The stochastic block model is one of the most studied network models for community detection, and fitting its likelihood function on large-scale networks is known to be challenging. One prominent work that overcomes this computational challenge is the fast pseudo-likelihood approach proposed by Amini et al. for fitting stochastic block models to large sparse networks. However, this approach does not have a convergence guarantee, and may not be well suited for small- and medium-scale networks. In ...
-
Authors: Park, Chan; Kang, Hyunseung
Affiliations: University of Wisconsin System; University of Wisconsin Madison
Abstract: Cluster randomized trials (CRTs) are a popular design to study the effect of interventions in infectious disease settings. However, standard analysis of CRTs primarily relies on strong parametric methods, usually mixed-effect models to account for the clustering structure, and focuses on the overall intent-to-treat (ITT) effect to evaluate effectiveness. The article presents two assumption-lean methods to analyze two types of effects in CRTs, ITT effects and network effects among well-known co...
-
Authors: Crucinio, Francesca R.; Doucet, Arnaud; Johansen, Adam M.
Affiliations: University of Warwick; University of Oxford; Alan Turing Institute
Abstract: Fredholm integral equations of the first kind are the prototypical example of ill-posed linear inverse problems. They model, among other things, reconstruction of distorted noisy observations and indirect density estimation and also appear in instrumental variable regression. However, their numerical solution remains a challenging problem. Many techniques currently available require a preliminary discretization of the domain of the solution and make strong assumptions about its regularity. For...
-
Authors: Liu, Yi; Rockova, Veronika
Affiliations: University of Chicago; University of Chicago
Abstract: Thompson sampling is a heuristic algorithm for the multi-armed bandit problem which has a long tradition in machine learning. The algorithm has a Bayesian spirit in the sense that it selects arms based on posterior samples of reward probabilities of each arm. By forging a connection between combinatorial binary bandits and spike-and-slab variable selection, we propose a stochastic optimization approach to subset selection called Thompson variable selection (TVS). TVS is a framework for interpr...
-
Authors: Paparoditis, Efstathios; Shang, Han Lin
Affiliations: University of Cyprus; Macquarie University
-
Authors: Ni, Yang
Affiliations: Texas A&M University System; Texas A&M University College Station
-
Authors: Miao, Wang; Hu, Wenjie; Ogburn, Elizabeth L.; Zhou, Xiao-Hua
Affiliations: Peking University; Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health; Peking University; Peking University
Abstract: Identification of treatment effects in the presence of unmeasured confounding is a persistent problem in the social, biological, and medical sciences. The problem of unmeasured confounding in settings with multiple treatments is most common in statistical genetics and bioinformatics settings, where researchers have developed many successful statistical strategies without engaging deeply with the causal aspects of the problem. Recently there have been a number of attempts to bridge the gap betw...
-
Authors: Deshpande, Yash; Javanmard, Adel; Mehrabi, Mohammad
Affiliations: Massachusetts Institute of Technology (MIT); University of Southern California
Abstract: Adaptive collection of data is commonplace in applications throughout science and engineering. From the point of view of statistical inference, however, adaptive data collection induces memory and correlation in the samples, and poses a significant challenge. We consider high-dimensional linear regression, where the samples are collected adaptively, and the sample size n can be smaller than p, the number of covariates. In this setting, there are two distinct sources of bias: the first due to...