-
Authors: Weitkamp, Christoph Alexander; Proksch, Katharina; Tameling, Carla; Munk, Axel
Affiliations: University of Göttingen; University of Twente; Max Planck Society
Abstract: In this article, we aim to provide a statistical theory for object matching based on a lower bound of the Gromov-Wasserstein distance related to the distribution of (pairwise) distances of the considered objects. To this end, we model general objects as metric measure spaces. Based on this, we propose a simple and efficiently computable asymptotic statistical test for pose invariant object discrimination. This is based on a (beta-trimmed) empirical version of the aforementioned lower bound. W...
-
Authors: Vansteelandt, Stijn; Dukes, Oliver; Van Lancker, Kelly; Martinussen, Torben
Affiliations: Ghent University; University of London; London School of Hygiene & Tropical Medicine; University of Copenhagen
Abstract: Inference for the conditional association between an exposure and a time-to-event endpoint, given covariates, is routinely based on partial likelihood estimators for hazard ratios indexing Cox proportional hazards models. This approach is flexible and makes testing straightforward, but is nonetheless not entirely satisfactory. First, there is no good understanding of what it infers when the model is misspecified. Second, it is common to employ variable selection procedures when deciding which ...
-
Authors: Stoepker, Ivo V.; Castro, Rui M.; Arias-Castro, Ery; van den Heuvel, Edwin
Affiliations: Eindhoven University of Technology; University of California System; University of California San Diego; University of California System; University of California San Diego
Abstract: Anomaly detection when observing a large number of data streams is essential in a variety of applications, ranging from epidemiological studies to monitoring of complex systems. High-dimensional scenarios are usually tackled with scan statistics and related methods, requiring stringent modeling assumptions for proper calibration. In this work we take a nonparametric stance, and propose a permutation-based variant of the higher criticism statistic not requiring knowledge of the null distributio...
-
Authors: Zeng, Jing; Mai, Qing; Zhang, Xin
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; State University System of Florida; Florida State University
Abstract: Sufficient dimension reduction (SDR) methods target finding lower-dimensional representations of a multivariate predictor to preserve all the information about the conditional distribution of the response given the predictor. The reduction is commonly achieved by projecting the predictor onto a low-dimensional subspace. The smallest such subspace is known as the Central Subspace (CS) and is the key parameter of interest for most SDR methods. In this article, we propose a unified and flexible f...
-
Authors: Yin, Mingzhang; Shi, Claudia; Wang, Yixin; Blei, David M.
Affiliations: State University System of Florida; University of Florida; Columbia University; University of Michigan System; University of Michigan; Columbia University
Abstract: Estimating an individual treatment effect (ITE) is essential to personalized decision making. However, existing methods for estimating the ITE often rely on unconfoundedness, an assumption that is fundamentally untestable with observed data. To assess the robustness of individual-level causal conclusions with unconfoundedness, this paper proposes a method for sensitivity analysis of the ITE, a way to estimate a range of the ITE under unobserved confounding. The method we develop quantifies unme...
-
Authors: Shi, Chengchun; Luo, Shikai; Le, Yuan; Zhu, Hongtu; Song, Rui
Affiliations: University of London; London School of Economics & Political Science; Shanghai University of Finance & Economics; University of North Carolina; University of North Carolina Chapel Hill; North Carolina State University
Abstract: We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications. Most existing policy optimization algorithms in the computer science literature are developed in online settings where data are easy to collect or simulate. Their generalizations to mobile health applications with a pre-collected offline dataset remain unknown. The aim of this paper is to develop a novel advantage learning framework in order to ef...
-
Authors: Ye, Shengbin; Senftle, Thomas P.; Li, Meng
Affiliations: Rice University; Rice University
Abstract: In the emerging field of materials informatics, a fundamental task is to identify physicochemically meaningful descriptors, or materials genes, which are engineered from primary features and a set of elementary algebraic operators through compositions. Standard practice directly analyzes the high-dimensional candidate predictor space in a linear model; statistical analyses are then substantially hampered by the daunting challenge posed by the astronomically large number of correlated predictor...
-
Authors: Yu, Xiufan; Li, Danning; Xue, Lingzhou
Affiliations: University of Notre Dame; Northeast Normal University - China; Northeast Normal University - China; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Testing large covariance matrices is of fundamental importance in statistical analysis with high-dimensional data. In the past decade, three types of test statistics have been studied in the literature: quadratic form statistics, maximum form statistics, and their weighted combination. It is known that quadratic form statistics would suffer from low power against sparse alternatives and maximum form statistics would suffer from low power against dense alternatives. The weighted combination met...
-
Authors: Hu, Xiaoyu; Yao, Fang
Affiliations: National University of Singapore; Peking University
Abstract: Principal component analysis is a versatile tool to reduce dimensionality which has wide applications in statistics and machine learning. It is particularly useful for modeling data in high-dimensional scenarios where the number of variables p is comparable to, or much larger than, the sample size n. Despite an extensive literature on this topic, researchers have focused on modeling static principal eigenvectors, which are not suitable for stochastic processes that are dynamic in nature. To cha...
-
Authors: Billio, Monica; Casarin, Roberto; Iacopini, Matteo
Affiliations: Università Ca' Foscari Venezia; Vrije Universiteit Amsterdam
Abstract: Modeling time series of multilayer network data is challenging due to the peculiar characteristics of real-world networks, such as sparsity and abrupt structural changes. Moreover, the impact of external factors on the network edges is highly heterogeneous due to edge- and time-specific effects. Capturing all these features results in a very high-dimensional inference problem. A novel tensor-on-tensor regression model is proposed, which integrates zero-inflated logistic regression to deal with...