-
Authors: Zhu, Hongtu; Zhang, Heping; Ibrahim, Joseph G.; Peterson, Bradley S.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Yale University; University of North Carolina; University of North Carolina Chapel Hill; New York State Psychiatry Institute; Columbia University
-
Authors: Gelman, Andrew; Fagan, Jeffrey; Kiss, Alex
Affiliations: Columbia University; Columbia University; Columbia University; Columbia University; University of Toronto; Sunnybrook Health Science Center; Sunnybrook Research Institute
Abstract: Recent studies by police departments and researchers confirm that police stop persons of racial and ethnic minority groups more often than whites relative to their proportions in the population. However, it has been argued that stop rates more accurately reflect rates of crimes committed by each ethnic group, or that stop rates reflect elevated rates in specific social areas, such as neighborhoods or precincts. Most of the research on stop rates and police-citizen interactions has focused on t...
-
Authors: Wu, Yujun; Boos, Dennis D.; Stefanski, Leonard A.
Affiliations: Rutgers University System; Rutgers University New Brunswick; Rutgers University Biomedical & Health Sciences; North Carolina State University
Abstract: We propose a new approach to variable selection designed to control the false selection rate (FSR), defined as the proportion of uninformative variables included in selected models. The method works by adding a known number of pseudovariables to the real dataset, running a variable selection procedure, and monitoring the proportion of pseudovariables falsely selected. Information obtained from bootstrap-like replications of this process is used to estimate the proportion of falsely selected re...
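Illustrative note: the pseudovariable-monitoring step described in this abstract can be sketched roughly as follows. This is not the authors' estimator, since the abstract is truncated before it is stated. The sketch assumes pseudovariables are made by permuting the rows of randomly chosen real columns, and it swaps in a simple marginal-correlation threshold as a stand-in selection procedure. All function names and parameters (`select_by_correlation`, `pseudovariable_selection_rate`, `threshold`, `n_pseudo`, `n_reps`) are hypothetical.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return abs(sxy) / sqrt(sxx * syy)


def select_by_correlation(columns, y, threshold):
    """Stand-in selection procedure (an assumption, not from the abstract):
    keep the indices of columns whose |correlation| with y exceeds threshold."""
    return [j for j, col in enumerate(columns) if abs_corr(col, y) > threshold]


def pseudovariable_selection_rate(columns, y, threshold, n_pseudo, n_reps, seed=0):
    """Average, over replications, of the fraction of pseudovariables selected."""
    rng = random.Random(seed)
    n_real = len(columns)
    rates = []
    for _ in range(n_reps):
        # Assumed construction: a pseudovariable is a row-permuted copy of a
        # randomly chosen real column, so it is uninformative about y.
        pseudo = []
        for _ in range(n_pseudo):
            col = list(rng.choice(columns))
            rng.shuffle(col)
            pseudo.append(col)
        selected = select_by_correlation(columns + pseudo, y, threshold)
        n_false = sum(1 for j in selected if j >= n_real)
        rates.append(n_false / n_pseudo)
    return sum(rates) / len(rates)
```

A stricter `threshold` should drive the monitored rate toward zero, which is the kind of tuning signal the abstract describes; how that rate is converted into an FSR estimate for the real variables is not recoverable from the truncated text.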
-
Authors: Wang, Huixia; He, Xuming
Affiliations: North Carolina State University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: In this article we consider testing for differentially expressed genes in GeneChip studies by modeling and analyzing the quantiles of gene expression through probe level measurements. By developing a robust rank score test for linear quantile models with a random effect, we propose a reliable test for detecting differences in certain quantiles of the intensity distributions. By using a genomewide adjustment to the test statistic to account for within-array correlation, we demonstrate that the ...
-
Authors: Chan, Hock Peng; Zhang, Nancy Ruonan
Affiliations: National University of Singapore; Stanford University
Abstract: We examine scan statistics for one-dimensional marked Poisson processes. Such statistics tabulate the maximum weighted count of event occurrences within a window of predetermined width over all windows within an observed interval. We derive analytical formulas and also give an importance sampling method for approximating the tail probabilities of scan statistics. Because high-throughput genomic sequencing has led to the availability of massive amounts of biomolecular sequence data, it is often...
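Illustrative note: the scan statistic defined in this abstract (the maximum weighted count of events in any window of fixed width) can be computed along the following lines. This is a minimal sketch, not the authors' code. It assumes non-negative marks, closed windows `[t, t + width]`, and an observed interval at least as long as the window; under those assumptions it is enough to check windows that start at an event. The function name `scan_statistic` is hypothetical, and the tail-probability approximations mentioned in the abstract are not covered.

```python
from bisect import bisect_right
from itertools import accumulate


def scan_statistic(positions, marks, width):
    """Maximum total mark over all closed windows [t, t + width].

    Assumes marks are non-negative, so the maximum is attained by a
    window whose left edge sits on an event position.
    """
    if len(positions) != len(marks):
        raise ValueError("positions and marks must have the same length")
    if not positions:
        return 0.0
    # Sort events by position and build prefix sums of their marks.
    order = sorted(range(len(positions)), key=positions.__getitem__)
    xs = [positions[i] for i in order]
    prefix = [0.0] + list(accumulate(marks[i] for i in order))
    best = 0.0
    for i, x in enumerate(xs):
        # Index just past the last event lying in [x, x + width].
        j = bisect_right(xs, x + width)
        best = max(best, prefix[j] - prefix[i])
    return best
```

For example, `scan_statistic([0.0, 1.0, 1.5, 4.0], [1, 2, 2, 1], 1.0)` picks the window starting at 1.0, which covers the events at 1.0 and 1.5.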
-
Authors: Bortot, P.; Coles, S. G.; Sisson, S. A.
Affiliations: University of Bologna; University of Padua; University of New South Wales Sydney
Abstract: In the production of clean steels, the occurrence of imperfections (so-called inclusions) is unavoidable. The strength of a clean steel block is largely dependent on the size of the largest imperfection that it contains, so inference on extreme inclusion size forms an important part of quality control. Sampling is generally done by measuring imperfections on planar slices, leading to an extreme value version of a standard stereological problem: how to make inference on large inclusions using onl...
-
Authors: Naik, Prasad A.; Shi, Peide; Tsai, Chih-Ling
Affiliations: University of California System; University of California Davis; Peking University
Abstract: We examine the problem of jointly selecting the number of components and variables in finite mixture regression models. We find that the Akaike information criterion is unsatisfactory for this purpose because it overestimates the number of components, which in turn results in incorrect variables being retained in the model. Therefore, we derive a new information criterion, the mixture regression criterion (MRC), that yields marked improvement in model selection due to what we call the clusteri...
-
Authors: Sahu, Sujit K.; Gelfand, Alan E.; Holland, David M.
Affiliations: University of Southampton; Duke University; United States Environmental Protection Agency
Abstract: This article proposes a space-time model for daily 8-hour maximum ozone levels to provide input for regulatory activities: detection, evaluation, and analysis of spatial patterns and temporal trend in ozone summaries. The model is applied to the analysis of data from the state of Ohio that contains a mix of urban, suburban, and rural ozone monitoring sites. The proposed space-time model is autoregressive and incorporates the most important meteorological variables observed at a collection of o...
-
Authors: Gupta, Mayetri; Ibrahim, Joseph G.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill
Abstract: The profusion of genomic data through genome sequencing and gene expression microarray technology has facilitated statistical research in determining gene interactions regulating a biological process. Current methods generally consist of a two-stage procedure: clustering gene expression measurements and searching for regulatory switches, typically short, conserved sequence patterns (motifs) in the DNA sequence adjacent to the genes. This process often leads to misleading conclusions as incorre...
-
Authors: Fan, Jianqing; Huang, Tao; Li, Runze
Affiliations: Princeton University; University of Virginia; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Improving efficiency for regression coefficients and predicting trajectories of individuals are two important aspects in the analysis of longitudinal data. Both involve estimation of the covariance function. Yet challenges arise in estimating the covariance function of longitudinal data collected at irregular time points. A class of semiparametric models for the covariance function that imposes a parametric correlation structure while allowing a nonparametric variance function is proposed. ...