-
Authors: Hassani, Hamed; Javanmard, Adel
Affiliations: University of Pennsylvania; University of Southern California
Abstract: Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparametrized models have recently been extensively studied, and the virtues of overparametrization have been established from both the statistical perspective, via the double-descent phenomenon, and the computational perspective via the structural properties of the optimization landscape. Despite this success, it is also well known that ...
-
Authors: Cattaneo, Matias D.; Chandak, Rajita; Klusowski, Jason M.
Affiliations: Princeton University
Abstract: We develop a theoretical framework for the analysis of oblique decision trees, where the splits at each decision node occur at linear combinations of the covariates (as opposed to conventional tree constructions that force axis-aligned splits involving only a single covariate). While this methodology has garnered significant attention from the computer science and optimization communities since the mid-80s, the advantages they offer over their axis-aligned counterparts remain only empirically ju...
-
Authors: Dubey, Paromita; Chen, Yaqing; Muller, Hans-Georg
Affiliations: University of Southern California; Rutgers University System; Rutgers University New Brunswick; University of California System; University of California Davis
Abstract: This article provides an overview of the statistical modeling of complex data as increasingly encountered in modern data analysis. It is argued that such data can often be described as elements of a metric space that satisfies certain structural conditions and features a probability measure. We refer to the random elements of such spaces as random objects and to the emerging field that deals with their statistical analysis as metric statistics. Metric statistics provides methodology, theory an...
-
Authors: Kennedy, Edward H.; Balakrishnan, Sivaraman; Robins, James M.; Wasserman, Larry
Affiliations: Carnegie Mellon University
Abstract: Estimation of heterogeneous causal effects, that is, how effects of policies and treatments vary across subjects, is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining ...
-
Authors: Yan, Yuling; Chen, Yuxin; Fan, Jianqing
Affiliations: Massachusetts Institute of Technology (MIT); University of Pennsylvania; Princeton University
Abstract: This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly underexplored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimension, the challenge is further compounded by the prevalent presence of missing data and heteroskedastic noise. We propose a novel approach to performing valid inference on the principal subspace, on the basis of an estimator ca...
-
Authors: Cai, T. Tony; Kim, Dongwoo; Pu, Hongming
Affiliations: University of Pennsylvania
Abstract: This paper studies transfer learning for estimating the mean of random functions based on discretely sampled data, where in addition to observations from the target distribution, auxiliary samples from similar but distinct source distributions are available. The paper considers both common and independent designs and establishes the minimax rates of convergence for both designs. The results reveal an interesting phase transition phenomenon under the two designs and demonstrate the benefits of ...
-
Authors: Bastian, Patrick; Dette, Holger; Heiny, Johannes
Affiliations: Ruhr University Bochum; Stockholm University
Abstract: This paper takes a different look at the problem of testing the mutual independence of the components of a high-dimensional vector. Instead of testing if all pairwise associations (e.g., all pairwise Kendall's tau) between the components vanish, we are interested in the (null) hypothesis that all pairwise associations do not exceed a certain threshold in absolute value. The consideration of these hypotheses is motivated by the observation that in the high-dimensional regime, it is rare, and pe...
-
Authors: Laha, Nilanjana; Sonabend-W, Aaron; Mukherjee, Rajarshi; Cai, Tianxi
Affiliations: Texas A&M University System; Texas A&M University College Station; Harvard University
Abstract: Large health care data repositories such as electronic health records (EHR) open new opportunities to derive individualized treatment strategies for complicated diseases such as sepsis. In this paper, we consider the problem of estimating sequential treatment rules tailored to a patient's individual characteristics, often referred to as dynamic treatment regimes (DTRs). Our main objective is to find the optimal DTR that maximizes a discontinuous value function through direct maximization of Fi...
-
Authors: Nickl, Richard
Affiliations: University of Cambridge
Abstract: Let $(X_t)$ be a reflected diffusion process in a bounded convex domain in $\mathbb{R}^d$, solving the stochastic differential equation $dX_t = \nabla f(X_t)\,dt + \sqrt{2 f(X_t)}\,dW_t$, $t \ge 0$, with $W_t$ a $d$-dimensional Brownian motion. The data $X_0, X_D, \ldots, X_{ND}$ consist of discrete measurements and the time interval $D$ between consecutive observations is fixed so that one cannot 'zoom' into the observed path of the process. The goal is to infer the diffusivity $f$ and the associated transition operator $P_{t,f}$. We pr...
-
Authors: Bradic, Jelena; Ji, Weijie; Zhang, Yuqian
Affiliations: University of California System; University of California San Diego; Shanghai University of Finance & Economics; Renmin University of China
Abstract: Estimating dynamic treatment effects is a crucial endeavor in causal inference, particularly when confronted with high-dimensional confounders. Doubly robust (DR) approaches have emerged as promising tools for estimating treatment effects due to their flexibility. However, we showcase that the traditional DR approaches that only focus on the DR representation of the expected outcomes may fall short of delivering optimal results. In this paper, we propose a novel DR representation for intermedi...