-
Authors: Da Silva, D. N.; Skinner, C. J.
Affiliations: University of London; London School of Economics and Political Science
Abstract: Paradata refers to survey variables which are not of direct interest themselves, but are related to the quality of data on survey variables which are of interest. We focus on a categorical paradata variable, which reflects the presence of measurement error in a variable of interest. We propose a quasi-score test of the hypothesis of no measurement error bias in the estimation of regression coefficients under models for paradata. We also propose a regression-based test, analogous to a simple te...
-
Authors: Fang, Junhan; Yi, Grace Y.
Affiliations: University of Waterloo; Western University (University of Western Ontario)
Abstract: Measurement error in covariates has been extensively studied in many conventional regression settings where covariate information is typically expressed in a vector form. However, there has been little work on error-prone matrix-variate data, which commonly arise from studies with imaging, spatial-temporal structures, etc. We consider analysis of error-contaminated matrix-variate data. We particularly focus on matrix-variate logistic measurement error models. We examine the biases induced from...
-
Authors: Rotnitzky, A.; Smucler, E.; Robins, J. M.
Affiliations: Universidad Torcuato Di Tella; Universidad Torcuato Di Tella; Harvard University; Harvard T.H. Chan School of Public Health
Abstract: We study a class of parameters with the so-called mixed bias property. For parameters with this property, the bias of the semiparametric efficient one-step estimator is equal to the mean of the product of the estimation errors of two nuisance functions. In nonparametric models, parameters with the mixed bias property admit so-called rate doubly robust estimators, i.e., estimators that are consistent and asymptotically normal when one succeeds in estimating both nuisance functions at sufficient...
-
Authors: Wang, Shulei; Cai, T. Tony; Li, Hongzhe
Affiliations: University of Pennsylvania; University of Pennsylvania
Abstract: Quantitative comparison of microbial composition from different populations is a fundamental task in various microbiome studies. We consider two-sample testing for microbial compositional data by leveraging phylogenetic information. Motivated by existing phylogenetic distances, we take a minimum-cost flow perspective to study such testing problems. We first show that multivariate analysis of variance with permutation using phylogenetic distances, one of the most commonly used methods in practi...
-
Authors: Zhang, Ting
Affiliations: Boston University
Abstract: Quantile regression is a popular and powerful method for studying the effect of regressors on quantiles of a response distribution. However, existing results on quantile regression were mainly developed for cases in which the quantile level is fixed, and the data are often assumed to be independent. Motivated by recent applications, we consider the situation where (i) the quantile level is not fixed and can grow with the sample size to capture the tail phenomena, and (ii) the data are no longe...
-
Authors: Lin, Zhenhua; Yao, Fang
Affiliations: National University of Singapore; Peking University
Abstract: We propose a new method for functional nonparametric regression with a predictor that resides on a finite-dimensional manifold, but is observable only in an infinite-dimensional space. Contamination of the predictor due to discrete or noisy measurements is also accounted for. By using functional local linear manifold smoothing, the proposed estimator enjoys a polynomial rate of convergence that adapts to the intrinsic manifold dimension and the contamination level. This is in contrast to the l...
-
Authors: Sun, Xiaoxiao; Zhong, Wenxuan; Ma, Ping
Affiliations: University of Arizona; University System of Georgia; University of Georgia
Abstract: Large samples are generated routinely from various sources. Classic statistical models, such as smoothing spline ANOVA models, are not well equipped to analyse such large samples because of high computational costs. In particular, the daunting computational cost of selecting smoothing parameters renders smoothing spline ANOVA models impractical. In this article, we develop an asympirical, i.e., asymptotic and empirical, smoothing parameters selection method for smoothing spline ANOVA models in...
-
Authors: Ma, Huijuan; Peng, Limin; Huang, Chiung-Yu; Fu, Haoda
Affiliations: East China Normal University; Emory University; University of California System; University of California San Francisco; Eli Lilly; Lilly Research Laboratories
Abstract: Progression of chronic disease is often manifested by repeated occurrences of disease-related events over time. Delineating the heterogeneity in the risk of such recurrent events can provide valuable scientific insight for guiding customized disease management. We propose a new sensible measure of individual risk of recurrent events and present a dynamic modelling framework thereof, which accounts for both observed covariates and unobservable frailty. The proposed modelling requires no distrib...
-
Authors: Ma, Rong; Barnett, Ian
Affiliations: University of Pennsylvania
Abstract: Modularity is a popular metric for quantifying the degree of community structure within a network. The distribution of the largest eigenvalue of a network's edge weight or adjacency matrix is well studied and is frequently used as a substitute for modularity when performing statistical inference. However, we show that the largest eigenvalue and modularity are asymptotically uncorrelated, which suggests the need for inference directly on modularity itself when the network is large. To this end,...
-
Authors: Chang, Jinyuan; Chen, Song Xi; Tang, Cheng Yong; Wu, Tong Tong
Affiliations: Southwestern University of Finance & Economics - China; Peking University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University; University of Rochester
Abstract: High-dimensional statistical inference with general estimating equations is challenging and remains little explored. We study two problems in the area: confidence set estimation for multiple components of the model parameters, and model specifications tests. First, we propose to construct a new set of estimating equations such that the impact from estimating the high-dimensional nuisance parameters becomes asymptotically negligible. The new construction enables us to estimate a valid confidenc...