-
Authors: Du, Jin-Hong; Zeng, Zhenghao; Kennedy, Edward H.; Wasserman, Larry; Roeder, Kathryn
Affiliations: Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University
Abstract: With the evolution of single-cell RNA sequencing techniques into a standard approach in genomics, it has become possible to conduct cohort-level causal inferences based on single-cell-level measurements. However, the individual gene expression levels of interest are not directly observable; instead, only repeated proxy measurements from each individual's cells are available, providing a derived outcome to estimate the underlying outcome for each of many genes. In this article, we propose a gen...
-
Authors: Xie, Yiling; Huo, Xiaoming
Affiliations: University System of Georgia; Georgia Institute of Technology
Abstract: Adversarial training has been proposed to protect machine learning models against adversarial attacks. This article focuses on adversarial training under ℓ∞-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the asymptotic distribution of the adversarial training estimator under ℓ∞-perturbation could put a positive probability ...
-
Authors: Dewaskar, Miheer; Tosh, Christopher; Knoblauch, Jeremias; Dunson, David B.
Affiliations: University of New Mexico; Memorial Sloan Kettering Cancer Center; University of London; University College London; Duke University
Abstract: Likelihood-based inferences have been remarkably successful in wide-spanning application areas. However, even after due diligence in selecting a good model for the data at hand, there is inevitably some amount of model misspecification: outliers, data contamination or inappropriate parametric assumptions such as Gaussianity mean that most models are at best rough approximations of reality. A significant practical concern is that for certain inferences, even small amounts of model misspecificat...
-
Authors: Gomez, Jose A. Sanchez; Mo, Weibin; Zhao, Junlong; Liu, Yufeng
Affiliations: University of California System; University of California Riverside; Purdue University System; Purdue University; Beijing Normal University; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Graphical models are popular tools for exploring relationships among a set of variables. The Gaussian graphical model (GGM) is an important class of graphical models, where the conditional dependence among variables is represented by nodes and edges in a graph. In many real applications, we are interested in detecting hubs in graphical models, which refer to nodes with a significantly higher degree of connectivity compared to non-hub nodes. A typical strategy for hub detection consists of estima...
-
Authors: Graziani, Carlo
Affiliations: United States Department of Energy (DOE); Argonne National Laboratory
-
Authors: Chandra, Noirrit Kiran; Dunson, David B.; Xu, Jason
Affiliations: University of Texas System; University of Texas Dallas; Duke University
Abstract: Factor analysis provides a canonical framework for imposing lower-dimensional structure such as sparse covariance in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, for instance in reproducing studies across research groups. In such cases, it is natural to seek to learn the shared versus condition-specific structure. Existing hierarchical extensions of factor analysis have been proposed, but face practical issues includi...
-
Authors: Qiu, Jiaxin; Li, Zeng; Yao, Jianfeng
Affiliations: University of Hong Kong; Southern University of Science & Technology; The Chinese University of Hong Kong, Shenzhen
Abstract: Determining the number of factors in high-dimensional factor modeling is essential but challenging, especially when the data are heavy-tailed. In this article, we introduce a new estimator based on the spectral properties of Spearman sample correlation matrix under the high-dimensional setting, where both dimension and sample size tend to infinity proportionally. Our estimator is robust against heavy tails in either the common factors or idiosyncratic errors. The consistency of our estimator i...
-
Authors: Frazier, David T.; Nott, David J.; Drovandi, Christopher
Affiliations: Monash University; National University of Singapore; National University of Singapore; Queensland University of Technology (QUT)
Abstract: Bayesian synthetic likelihood is a widely used approach for conducting Bayesian analysis in complex models where evaluation of the likelihood is infeasible but simulation from the assumed model is tractable. We analyze the behavior of the Bayesian synthetic likelihood posterior when the assumed model differs from the actual data generating process. We demonstrate that the Bayesian synthetic likelihood posterior can display a wide range of nonstandard behaviors depending on the level of model m...
-
Authors: Tang, Weijing; Zhu, Ji
Affiliations: Carnegie Mellon University; University of Michigan System; University of Michigan
Abstract: Statistical network models are useful for understanding the underlying formation mechanism and characteristics of complex networks. However, statistical models for signed networks have been largely unexplored. In signed networks, there exist both positive (e.g., like, trust) and negative (e.g., dislike, distrust) edges, which are commonly seen in real-world scenarios. The positive and negative edges in signed networks lead to unique structural patterns, which pose challenges for statistical mo...
-
Authors: Yu, Shan; Wang, Guannan; Wang, Li
Affiliations: University of Virginia; William & Mary; George Mason University
Abstract: Spatial heterogeneity is of great importance in social, economic, and environmental science studies. The spatially varying coefficient model is a popular and effective spatial regression technique to address spatial heterogeneity. However, accounting for heterogeneity comes at the cost of reducing model parsimony. To balance flexibility and parsimony, this article develops a class of generalized partially linear spatially varying coefficient models which allow the inclusion of both constant an...