-
Authors: Zhong, Qixian; Mueller, Jonas; Wang, Jane-Ling
Affiliations: Xiamen University; Amazon.com; University of California System; University of California Davis
Abstract: While deep learning approaches to survival data have demonstrated empirical success in applications, most of these methods are difficult to interpret and mathematical understanding of them is lacking. This paper studies the partially linear Cox model, where the nonlinear component of the model is implemented using a deep neural network. The proposed approach is flexible and able to circumvent the curse of dimensionality, yet it facilitates interpretability of the effects of treatment covariate...
-
Authors: Chen, Pinhan; Gao, Chao; Zhang, Anderson Y.
Affiliations: University of Chicago; University of Pennsylvania
Abstract: Given partially observed pairwise comparison data generated by the Bradley-Terry-Luce (BTL) model, we study the problem of top-k ranking, that is, optimally identifying the set of top-k players. We derive the minimax rate with respect to a normalized Hamming loss. This provides the first result in the literature that characterizes the partial recovery error in terms of the proportion of mistakes for top-k ranking. We also derive the optimal signal-to-noise ratio condition for the exact recover...
-
Authors: Donoho, David L.; Kipnis, Alon
Affiliations: Stanford University; Reichman University
Abstract: We adapt Higher Criticism (HC) to the comparison of two frequency tables which may, or may not, exhibit moderate differences between the tables in some unknown, relatively small subset out of a large number of categories. Our analysis of the power of the proposed HC test quantifies the rarity and size of assumed differences and applies moderate deviations analysis to determine the asymptotic powerfulness/powerlessness of our proposed HC procedure. Our analysis considers the null hypothesis of no...
-
Authors: Rodriguez-Casal, Alberto; Saavedra-Nieves, Paula
Affiliations: Universidade de Santiago de Compostela; Universidade de Santiago de Compostela
Abstract: Given a random sample of points from some unknown density, we propose a method for estimating density level sets, for a positive threshold t, under the r-convexity assumption. This shape condition generalizes the convexity property and allows one to consider level sets with more than one connected component. The main problem in practice is that r is an unknown geometric characteristic of the set related to its curvature, which may depend on t. A stochastic algorithm is proposed for selecting its v...
-
Authors: Guo, Zijian; Cevid, Domagoj; Buhlmann, Peter
Affiliations: Rutgers University System; Rutgers University New Brunswick; Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected by hidden confounding, and propose the doubly debiased lasso estimator for individual components of the regression coefficient vector. Our advocated method simultaneously corrects both the bias due to estimation of high-dimensional parameters as well as the bi...
-
Authors: Cheysson, Felix; Lang, Gabriel
Affiliations: INRAE; Universite Paris Saclay; AgroParisTech
Abstract: This paper presents a parametric estimation method for ill-observed linear stationary Hawkes processes. When the exact locations of points are not observed, but only counts over time intervals of fixed size, methods based on the likelihood are not feasible. We show that spectral estimation based on Whittle's method is well suited to this case and provides consistent and asymptotically normal estimators, provided a mild moment condition holds on the reproduction function. Simulated data sets and a case-s...
-
Authors: Zhang, Yunyi; Politis, Dimitris N.
Affiliations: University of California System; University of California San Diego; University of California System; University of California San Diego
Abstract: The success of the Lasso in the era of high-dimensional data can be attributed to its conducting an implicit model selection, that is, zeroing out regression coefficients that are not significant. By contrast, classical ridge regression cannot reveal a potential sparsity of parameters, and may also introduce a large bias under the high-dimensional setting. Nevertheless, recent work on the Lasso involves debiasing and thresholding, the latter in order to further enhance the model selection. As ...
-
Authors: Huang, Hsueh-Han; Chan, Ngai Hang; Chen, Kun; Ing, Ching-Kang
Affiliations: National Tsing Hua University; Chinese University of Hong Kong; Southwestern University of Finance & Economics - China
Abstract: Estimating the orders of the autoregressive fractionally integrated moving average (ARFIMA) model has been a long-standing problem in time series analysis. This paper tackles this challenge by establishing the consistency of the Bayesian information criterion (BIC) for ARFIMA models with independent errors. Since the memory parameter of the model can be any real number, this consistency result is valid for short memory, long memory and nonstationary time series. This paper further extends the ...
-
Authors: Ekvall, Karl Oskar; Bottai, Matteo
Affiliations: Karolinska Institutet
Abstract: We propose confidence regions with asymptotically correct uniform coverage probability of parameters whose Fisher information matrix can be singular at important points of the parameter set. Our work is motivated by the need for reliable inference on scale parameters close or equal to zero in mixed models, which is obtained as a special case. The confidence regions are constructed by inverting a continuous extension of the score test statistic standardized by expected information, which we sho...
-
Authors: Schramm, Tselil; Wein, Alexander S.
Affiliations: Stanford University; New York University
Abstract: One fundamental goal of high-dimensional statistics is to detect or recover planted structure (such as a low-rank matrix) hidden in noisy data. A growing body of work studies low-degree polynomials as a restricted model of computation for such problems: it has been demonstrated in various settings that low-degree polynomials of the data can match the statistical performance of the best known polynomial-time algorithms. Prior work has studied the power of low-degree polynomials for the task of ...