-
Authors: Ferrari, Federico; Dunson, David B.
Affiliations: Duke University
Abstract: This article is motivated by the problem of studying the joint effect of different chemical exposures on human health outcomes. This is essentially a nonparametric regression problem, with interest being focused not on a black box for prediction but instead on selection of main effects and interactions. For interpretability we decompose the expected health outcome into a linear main effect, pairwise interactions and a nonlinear deviation. Our interest is in model selection for these different ...
-
Authors: Hannaford, Naomi E.; Heaps, Sarah E.; Nye, Tom M. W.; Williams, Tom A.; Embley, T. Martin
Affiliations: Newcastle University - UK; University of Bristol; Newcastle University - UK
Abstract: Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees. Substitutions in sequences are modelled through a continuous-time Markov process, characterised by an instantaneous rate matrix, which standard models assume is time-reversible and stationary. These assumptions are biologically questionable and induce a likelihood function which is invariant to a tree's root position. This hampers inference because a tree's biological interpretation depends critically o...
-
Authors: Trang Quynh Nguyen; Stuart, Elizabeth A.
Affiliations: Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health; Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health
-
Authors: Godichon-Baggioni, Antoine; Maugis-Rabusseau, Cathy; Rau, Andrea
Affiliations: Universite Paris Cite; Sorbonne Universite; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI); Universite Federale Toulouse Midi-Pyrenees (ComUE); Universite de Toulouse; Institut National des Sciences Appliquees de Toulouse; Universite Toulouse III - Paul Sabatier; Universite Paris Saclay; INRAE; AgroParisTech
Abstract: Multiview data, which represent distinct but related groupings of variables, can be useful for identifying relevant and robust clustering structures among observations. A large number of multiview classification algorithms have been proposed in the fields of computer science and genomics; here, we instead focus on the task of merging or splitting an existing hard or soft cluster partition based on multiview data. This article is specifically motivated by an application involving multiomic brea...
-
Authors: Chakraborty, Arnab; Lahiri, Soumendra Nath; Wilson, Alyson
Affiliations: North Carolina State University
Abstract: Spatial predictions of weather elements like temperature, precipitation, and barometric pressure are generally based on satellite imagery or data collected at ground stations. None of these data provide information at a more granular or hyperlocal resolution. On the other hand, crowdsourced weather data, which are captured by sensors installed on mobile devices and gathered by weather-related mobile apps like Weather Signal and AccuWeather, can serve as potential data sources for analyzing envi...
-
Authors: Elmasri, Mohamad; Farrell, Maxwell J.; Davies, T. Jonathan; Stephens, David A.
Affiliations: McGill University; McGill University; University of British Columbia; University of British Columbia
Abstract: Identifying undocumented or potential future interactions among species is a challenge facing modern ecologists. Recent link prediction methods rely on trait data; however, large species interaction databases are typically sparse and covariates are limited to only a fraction of species. On the other hand, evolutionary relationships, encoded as phylogenetic trees, can act as proxies for underlying traits and historical patterns of parasite sharing among hosts. We show that, using a network-base...
-
Authors: Terada, Yoshikazu; Ogasawara, Issei; Nakata, Ken
Affiliations: University of Osaka; RIKEN; University of Osaka
Abstract: In various fields, data recorded continuously during a time interval and curve data, such as spectral data, have become common. These kinds of data can be interpreted as functional data. In this paper we study binary classification from only positive and unlabeled functional data (PU classification for functional data). Our first contribution is to present a simple classification algorithm for this problem. The key feature of the algorithm is that it does not require an estimation of the unkn...
-
Authors: Antonelli, Joseph; Mazumdar, Maitreyi; Bellinger, David; Christiani, David; Wright, Robert; Coull, Brent
Affiliations: State University System of Florida; University of Florida; Harvard University; Harvard Medical School; Harvard University; Harvard T.H. Chan School of Public Health; Icahn School of Medicine at Mount Sinai
Abstract: Humans are routinely exposed to mixtures of chemical and other environmental factors, making the quantification of health effects associated with environmental mixtures a critical goal for establishing environmental policy sufficiently protective of human health. The quantification of the effects of exposure to an environmental mixture poses several statistical challenges. It is often the case that exposures to multiple pollutants interact with each other to affect an outcome. Further, the expo...
-
Authors: Diana, Alex; Matechou, Eleni; Griffin, Jim; Johnston, Alison
Affiliations: University of Kent; University of London; University College London; Cornell University
Abstract: Environmental changes in recent years have been linked to phenological shifts which in turn are linked to the survival of species. The work in this paper is motivated by capture-recapture data on blackcaps collected by the British Trust for Ornithology as part of the Constant Effort Sites monitoring scheme. Blackcaps overwinter abroad and migrate to the UK annually for breeding purposes. We propose a novel Bayesian nonparametric approach for expressing the bivariate density of individual arriv...
-
Authors: Fisher, Jared D.; Puelz, David W.; Carvalho, Carlos M.
Affiliations: University of California System; University of California Berkeley; University of Chicago; University of Texas System; University of Texas Austin
Abstract: This paper considers the problem of modeling a firm's expected return as a nonlinear function of its observable characteristics. We investigate whether theoretically-motivated monotonicity constraints on characteristics and non-stationarity of the conditional expectation function provide statistical and economic benefit. We present an interpretable model that has similar out-of-sample performance to black-box machine learning methods. With this model, the data provide support for monotonicity ...