-
Authors: Zhou, Tianjian; Sengupta, Subhajit; Muller, Peter; Ji, Yuan
Affiliations: University of Chicago; NorthShore University Health System; University of Texas System; University of Texas Austin
Abstract: A tumor cell population consists of genetically heterogeneous subpopulations, known as subclones. Bulk sequencing data from high-throughput sequencing technology provide total and variant DNA and RNA read counts for many nucleotide loci as a mixture of signals from different subclones. We present RNDClone as a tool to deconvolute the mixture and reconstruct the subclones with distinct DNA genotypes and RNA expression profiles. In particular, we infer the number and population frequencies of sub...
-
Authors: Bhat, K. Sham; Myers, Kary; Lawrence, Earl; Colgan, James; Judge, Elizabeth
Affiliations: United States Department of Energy (DOE); Los Alamos National Laboratory; United States Department of Energy (DOE); Los Alamos National Laboratory; United States Department of Energy (DOE); Los Alamos National Laboratory
Abstract: The Mars rover Curiosity carries an instrument called ChemCam to determine the composition of the soil and rocks via laser-induced breakdown spectroscopy (LIBS). Los Alamos National Laboratory has developed a simulation capability that can predict spectra from ChemCam, but there are major scale differences between the prediction and observation. This presents a challenge when using Bayesian model calibration to determine the unknown physical parameters that describe the LIBS observations. We pr...
-
Authors: Gao, Yuanjun; Goetz, Jack; Connelly, Matthew; Mazumder, Rahul
Affiliations: Columbia University; University of Michigan System; University of Michigan; Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Columbia University
Abstract: Since 1973, the U.S. State Department has been using electronic record systems to preserve classified communications. Recently, approximately 1.9 million of these records from 1973-77 have been made available by the U.S. National Archives. While some of these communication streams have periods witnessing an acceleration in the rate of transmission, others do not show any notable patterns in communication intensity. Given the sheer volume of these communications, far greater than what had been ...
-
Authors: Li, Qian; Shamshoian, John; Senturk, Damla; Sugar, Catherine; Jeste, Shafali; DiStefano, Charlotte; Telesca, Donatello
Affiliations: University of California System; University of California Los Angeles; University of California System; University of California Los Angeles; University of California Los Angeles Medical Center; David Geffen School of Medicine at UCLA; University of California System; University of California Los Angeles; University of California Los Angeles Medical Center; David Geffen School of Medicine at UCLA
Abstract: Functional brain imaging through electroencephalography (EEG) relies upon the analysis and interpretation of high-dimensional, spatially organized time series. We propose to represent time-localized frequency domain characterizations of EEG data as region-referenced functional data. This representation is coupled with a hierarchical regression modeling approach to multivariate functional observations. Within this familiar setting we discuss how several prior models relate to structural assumpt...
-
Authors: Ferrari, Federico; Dunson, David B.
Affiliations: Duke University
Abstract: This article is motivated by the problem of studying the joint effect of different chemical exposures on human health outcomes. This is essentially a nonparametric regression problem, with interest being focused not on a black box for prediction but instead on selection of main effects and interactions. For interpretability we decompose the expected health outcome into a linear main effect, pairwise interactions and a nonlinear deviation. Our interest is in model selection for these different ...
-
Authors: Hannaford, Naomi E.; Heaps, Sarah E.; Nye, Tom M. W.; Williams, Tom A.; Embley, T. Martin
Affiliations: Newcastle University - UK; University of Bristol; Newcastle University - UK
Abstract: Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees. Substitutions in sequences are modelled through a continuous-time Markov process, characterised by an instantaneous rate matrix, which standard models assume is time-reversible and stationary. These assumptions are biologically questionable and induce a likelihood function which is invariant to a tree's root position. This hampers inference because a tree's biological interpretation depends critically o...
-
Authors: Terada, Yoshikazu; Ogasawara, Issei; Nakata, Ken
Affiliations: University of Osaka; RIKEN; University of Osaka
Abstract: In various fields, data recorded continuously during a time interval and curve data, such as spectral data, have become common. These kinds of data can be interpreted as functional data. In this paper we study binary classification from only positive and unlabeled functional data (PU classification for functional data). Our first contribution is to present a simple classification algorithm for this problem. The key feature of the algorithm is that it does not require an estimation of the unkn...
-
Authors: Fisher, Jared D.; Puelz, David W.; Carvalho, Carlos M.
Affiliations: University of California System; University of California Berkeley; University of Chicago; University of Texas System; University of Texas Austin
Abstract: This paper considers the problem of modeling a firm's expected return as a nonlinear function of its observable characteristics. We investigate whether theoretically-motivated monotonicity constraints on characteristics and non-stationarity of the conditional expectation function provide statistical and economic benefit. We present an interpretable model that has similar out-of-sample performance to black-box machine learning methods. With this model, the data provide support for monotonicity ...
-
Authors: Mohler, George; McGrath, Erin; Buntain, Cody; LaFree, Gary
Affiliations: Purdue University System; Purdue University; Purdue University in Indianapolis; University System of Maryland; University of Maryland College Park; New Jersey Institute of Technology; University System of Maryland; University of Maryland College Park
Abstract: We consider the problem of modeling and clustering heterogeneous event data arising from coupled conflict event and social media data sets. In this setting conflict events trigger responses on social media, and, at the same time, signals of grievance detected in social media may serve as leading indicators for subsequent conflict events. For this purpose we introduce the Hawkes Binomial Topic Model (HBTM) where marks, Tweets and conflict event descriptions are represented as bags of words foll...
-
Authors: Cheng, David; Ayyagari, Rajeev; Signorovitch, James
Affiliations: Harvard University; Harvard University Medical Affiliates; Massachusetts General Hospital; Analysis Group Inc.
Abstract: Indirect comparisons of treatment-specific outcomes across separate studies often inform decision making in the absence of head-to-head randomized comparisons. Differences in baseline characteristics between study populations may introduce confounding bias in such comparisons. Matching-adjusted indirect comparison (MAIC) (Pharmacoeconomics 28 (2010) 935-945) has been used to adjust for differences in observed baseline covariates when the individual patient-level data (IPD) are available for on...