-
Authors: Wang, Lu; Rotnitzky, Andrea; Lin, Xihong
Affiliations: University of Michigan System; University of Michigan; Harvard University; Harvard T.H. Chan School of Public Health; Universidad Torcuato Di Tella
Abstract: We consider nonparametric regression of a scalar outcome on a covariate when the outcome is missing at random (MAR) given the covariate and other observed auxiliary variables. We propose a class of augmented inverse probability weighted (AIPW) kernel estimating equations for nonparametric regression under MAR. We show that AIPW kernel estimators are consistent when the probability that the outcome is observed, that is, the selection probability, is either known by design or estimated under a c...
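A minimal sketch of the idea behind an AIPW kernel estimator may help fix notation. It is not the authors' implementation: it assumes a local-constant (Nadaraya-Watson) fit with identity link and a Gaussian kernel, and it takes the selection probabilities `pi` and the outcome-regression predictions `delta` as already supplied by the user. The function and argument names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def aipw_kernel_estimate(z0, z, y, r, pi, delta, h):
    """Local-constant AIPW kernel estimate of E[Y | Z = z0].

    z     : covariate values
    y     : outcomes (entries with r == 0 are ignored and may be None)
    r     : 1 if the outcome is observed, 0 if it is missing
    pi    : selection probabilities P(R = 1 | covariate, auxiliaries)
    delta : predictions of E[Y | covariate, auxiliaries] from a working model
    h     : kernel bandwidth
    """
    num = 0.0
    den = 0.0
    for zi, yi, ri, pii, di in zip(z, y, r, pi, delta):
        w = gaussian_kernel((zi - z0) / h)
        y_obs = yi if ri == 1 else 0.0
        # Inverse-probability-weighted outcome plus the augmentation term.
        pseudo = ri * y_obs / pii - (ri - pii) / pii * di
        num += w * pseudo
        den += w
    return num / den
```

With complete data (`r` all 1 and `pi` all 1.0) the pseudo-outcome equals the observed outcome, so the function reduces to the ordinary Nadaraya-Watson estimator.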
-
Authors: Holan, Scott H.; Toth, Daniell; Ferreira, Marco A. R.; Karr, Alan F.
Affiliations: University of Missouri System; University of Missouri Columbia; United States Department of Labor
Abstract: Many scientific, sociological, and economic applications present data that are collected on multiple scales of resolution. One particular form of multiscale data arises when data are aggregated across different scales both longitudinally and by economic sector. Such datasets frequently contain missing observations that can be accurately imputed, while respecting the constraints imposed by the multiscale nature of the data, using the method we propose, known as Bayesian mult...
-
Author: Cerioli, Andrea
Affiliation: University of Parma
Abstract: In this paper we develop multivariate outlier tests based on the high-breakdown Minimum Covariance Determinant estimator. The rules that we propose have good performance under the null hypothesis of no outliers in the data and also appreciable power properties for the purpose of individual outlier detection. This achievement is made possible by two orders of improvement over the currently available methodology. First, we suggest an approximation to the exact distribution of robust distances from ...
-
Authors: Chatterjee, Nilanjan; Li, Yan
Affiliations: National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI); NIH National Cancer Institute- Division of Cancer Epidemiology & Genetics; University of Texas System; University of Texas Arlington
Abstract: In epidemiologic studies, partial questionnaire design (PQD) can reduce cost, time, and other practical burdens associated with lengthy questionnaires by assigning different subsets of the questionnaire to different, but overlapping, subsets of the study participants. In this article, we describe methods for semiparametric inference for regression models under PQD and other study settings that can generate nonmonotone missing data in covariates. In particular, motivated by methods for multiph...
-
Authors: Shen, Xiaotong; Huang, Hsin-Cheng
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Academia Sinica - Taiwan
Abstract: Extracting grouping structure or identifying homogeneous subgroups of predictors in regression is crucial for high-dimensional data analysis. A low-dimensional structure, in particular grouping, when captured in a regression model, enhances predictive performance and facilitates a model's interpretability. Grouping pursuit extracts homogeneous subgroups of predictors most responsible for outcomes of a response. This is the case in gene network analysis, where grouping reveals gene func...
-
Authors: Crainiceanu, Ciprian M.; Staicu, Ana-Maria; Di, Chong-Zhi
Affiliations: Johns Hopkins University; North Carolina State University; Fred Hutchinson Cancer Center
Abstract: We introduce Generalized Multilevel Functional Linear Models (GMFLMs), a novel statistical framework for regression models where exposure has a multilevel functional structure. We show that GMFLMs are, in fact, generalized multilevel mixed models. Thus, GMFLMs can be analyzed using the mixed effects inferential machinery and can be generalized within a well-researched statistical framework. We propose and compare two methods for inference: (1) a two-stage frequentist approach; and (2) a joint ...
-
Authors: Xu, Qiang; Paik, Myunghee Cho; Luo, Xiaodong; Tsai, Wei-Yann
Affiliations: US Food & Drug Administration (FDA); Columbia University; Icahn School of Medicine at Mount Sinai
Abstract: Missingness in covariates is a common problem in survival data. In this article we propose a reweighting method for estimating the regression parameters in the Cox model with missing covariates. We also consider the augmented reweighting method by subtracting the projection term onto the nuisance tangent space. The proposed method provides consistent and asymptotically normally distributed estimators when the missing-data mechanism depends on the outcome variables, as well as on the observed ...
-
Authors: Gill, Jeff; Casella, George
Affiliations: Washington University (WUSTL); State University System of Florida; University of Florida
Abstract: A generalized linear mixed model, ordered probit, is used to estimate levels of stress in presidential political appointees as a means of understanding their surprisingly short tenures. A Bayesian approach is developed, where the random effects are modeled with a Dirichlet process mixture prior, allowing for useful incorporation of prior information, but retaining some vagueness in the form of the prior. Applications of Bayesian models in the social sciences are typically done with uninformati...
-
Authors: Rios Insua, David; Rios, Jesus; Banks, David
Affiliations: Universidad Rey Juan Carlos; Duke University
Abstract: Applications in counterterrorism and corporate competition have led to the development of new methods for the analysis of decision making when there are intelligent opponents and uncertain outcomes. This field represents a combination of statistical risk analysis and game theory, and is sometimes called adversarial risk analysis. In this article, we describe several formulations of adversarial risk problems, and provide a framework that extends traditional risk analysis tools, such as influen...
-
Authors: Fryzlewicz, Piotr; Ombao, Hernando
Affiliations: University of Bristol; Brown University
Abstract: Consider the situation when we have training data containing many time series having known group membership and testing data with unknown group membership. The goals are to find timescale features (using training data) that can best separate the groups, and to use these highly discriminant features to classify test data. We propose a method for classification using a bias-corrected nondecimated wavelet transform. Wavelets are ideal for identifying highly discriminant local time and scale featu...
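To illustrate what a nondecimated wavelet transform produces, the sketch below computes Haar detail coefficients at every time point and several scales, together with their squares (a raw wavelet periodogram). It is only an illustration under stated assumptions: the Haar wavelet and periodic boundary handling are choices made here, the bias-correction step mentioned in the abstract is not implemented, and the function names are hypothetical.

```python
def haar_nondecimated(x, num_scales):
    """Nondecimated Haar detail coefficients with periodic boundaries.

    Returns a list with one row per scale j = 1..num_scales; each row has
    len(x) coefficients, one per time point (no downsampling).
    """
    n = len(x)
    coeffs = []
    for j in range(1, num_scales + 1):
        half = 2 ** (j - 1)          # half-width of the Haar wavelet at scale j
        norm = 2.0 ** (-j / 2.0)     # normalisation giving the wavelet unit norm
        row = []
        for t in range(n):
            first = sum(x[(t - k) % n] for k in range(half))
            second = sum(x[(t - k) % n] for k in range(half, 2 * half))
            row.append(norm * (first - second))
        coeffs.append(row)
    return coeffs


def wavelet_periodogram(x, num_scales):
    """Squared nondecimated coefficients (raw, not bias-corrected)."""
    return [[d * d for d in row] for row in haar_nondecimated(x, num_scales)]
```

Because the transform keeps a coefficient at every time point for every scale, the resulting scale-by-time array can serve as a pool of candidate local features for a classifier.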