-
Authors: Kim, Seyoung; Xing, Eric P.
Affiliation: Carnegie Mellon University
Abstract: We consider the problem of estimating a sparse multi-response regression function, with an application to expression quantitative trait locus (eQTL) mapping, where the goal is to discover genetic variations that influence gene-expression levels. In particular, we investigate a shrinkage technique capable of capturing a given hierarchical structure over the responses, such as a hierarchical clustering tree with leaf nodes for responses and internal nodes for clusters of related responses at mul...
-
Authors: Li, Shaoyu; Cui, Yuehua
Affiliation: Michigan State University; St Jude Children's Research Hospital
Abstract: Much of the natural variation for a complex trait can be explained by variation in DNA sequence levels. As part of sequence variation, gene-gene interaction has been ubiquitously observed in nature, where its role in shaping the development of an organism has been broadly recognized. The identification of interactions between genetic factors has been progressively pursued via statistical or machine learning approaches. A large body of currently adopted methods, either parametrically or nonpara...
-
Authors: Wang, Yong; Ziedins, Ilze; Holmes, Mark; Challands, Neil
Affiliation: University of Auckland
Abstract: A new family of tree models is proposed, which we call differential trees. A differential tree model is constructed from multiple data sets and aims to detect distributional differences between them. The new methodology differs from the existing difference and change detection techniques in its nonparametric nature, model construction from multiple data sets, and applicability to high-dimensional data. Through a detailed study of an arson case in New Zealand, where an individual is known to ha...
-
Authors: Yan, Donghui; Wang, Pei; Linden, Michael; Knudsen, Beatrice; Randolph, Timothy
Affiliation: Fred Hutchinson Cancer Center; University of Minnesota System; University of Minnesota Twin Cities; Cedars Sinai Medical Center
Abstract: Recent advances in tissue microarray technology have allowed immunohistochemistry to become a powerful medium-to-high throughput analysis tool, particularly for the validation of diagnostic and prognostic biomarkers. However, as study size grows, the manual evaluation of these assays becomes a prohibitive limitation; it vastly reduces throughput and greatly increases variability and expense. We propose an algorithm, Tissue Array Co-Occurrence Matrix Analysis (TACOMA), for quantifying cellular ph...
-
Authors: Voulgaraki, Anastasia; Kedem, Benjamin; Graubard, Barry I.
Affiliation: University System of Maryland; University of Maryland College Park; National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI)
Abstract: It is possible to approach regression analysis with random covariates from a semiparametric perspective where information is combined from multiple multivariate sources. The approach assumes a semiparametric density ratio model where multivariate distributions are regressed on a reference distribution. A kernel density estimator can be constructed from many data sources in conjunction with the semiparametric model. The estimator is shown to be more efficient than the traditional single-sample ...