-
Author: MacDonald, Iain L.
Affiliation: University of Cape Town
Abstract: The problem of fitting a folded normal distribution by maximum likelihood has been described as 'not straightforward', and alternatives such as EM proposed. We suggest here that it is in fact straightforward to fit such a distribution by direct numerical maximization of the likelihood. We demonstrate this in an example. The relevant R code is included.
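Illustration (not the paper's R code, which is not reproduced in this listing): a minimal Python sketch of fitting a folded normal by direct numerical maximization of the likelihood. The function names `folded_normal_nll` and `fit_folded_normal`, the log-parametrization of sigma, the Nelder-Mead optimizer, the starting values, and the simulated data are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def folded_normal_nll(params, x):
    """Negative log-likelihood of a folded normal sample x >= 0."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # optimizing log(sigma) keeps sigma positive
    # Density is N(x; mu, sigma) + N(x; -mu, sigma); sum the two terms on the log scale.
    log_dens = np.logaddexp(
        norm.logpdf(x, loc=mu, scale=sigma),
        norm.logpdf(x, loc=-mu, scale=sigma),
    )
    return -np.sum(log_dens)


def fit_folded_normal(x):
    """Return (mu_hat, sigma_hat) by minimizing the negative log-likelihood."""
    x = np.asarray(x, dtype=float)
    # Assumed starting values: sample mean and log of sample standard deviation.
    start = np.array([np.mean(x), np.log(np.std(x))])
    result = minimize(folded_normal_nll, start, args=(x,), method="Nelder-Mead")
    mu_hat, log_sigma_hat = result.x
    # The sign of mu is not identifiable, so report its absolute value.
    return abs(mu_hat), np.exp(log_sigma_hat)


# Simulated example: |N(2, 1)| draws, so the estimates should be near mu=2, sigma=1.
rng = np.random.default_rng(0)
sample = np.abs(rng.normal(loc=2.0, scale=1.0, size=5000))
mu_hat, sigma_hat = fit_folded_normal(sample)
print(mu_hat, sigma_hat)
```

A general-purpose optimizer is applied directly to the log-likelihood here, with no EM step; a gradient-based method could be substituted for Nelder-Mead.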
-
Authors: Schliep, Erin M.; Collins, Sarah M.; Rojas-Salazar, Shirley; Lottig, Noah R.; Stanley, Emily H.
Affiliations: University of Missouri System; University of Missouri Columbia; University of Wyoming; University of Wisconsin System; University of Wisconsin Madison
Abstract: Concentrations of nitrogen provide a critical metric for understanding ecosystem function and water quality in lakes. However, varying approaches for quantifying nitrogen concentrations may bias the comparison of water quality across lakes and regions. Different measurements of total nitrogen exist based on its composition (e.g., organic versus inorganic, dissolved versus particulate), which we refer to as nitrogen species. Fortunately, measurements of multiple nitrogen species are often colle...
-
Authors: Fuglstad, Geir-Arne; Castruccio, Stefano
Affiliations: Norwegian University of Science & Technology (NTNU); University of Notre Dame
Abstract: Modern climate models pose an ever-increasing storage burden to computational facilities, and the upcoming generation of global simulations from the next Intergovernmental Panel on Climate Change will require a substantial share of the budget of research centers worldwide to be allocated just for this task. A statistical model can be used as a means to mitigate the storage burden by providing a stochastic approximation of the climate simulations. Indeed, if a suitably validated statistical mod...
-
Authors: Park, Seyoung; Zhao, Hongyu
Affiliations: Sungkyunkwan University (SKKU); Yale University
Abstract: Principal component analysis (PCA) is a commonly used statistical method in a wide range of applications. However, it does not work well when the number of features is larger than the sample size. We consider the estimation of the sparse principal subspace in the high dimensional setting with missing data motivated by the analysis of single-cell RNA sequence data. We propose a two step estimation procedure, and establish the rates of convergence for estimating the principal subspace. Simulated...
-
Authors: D'Angelo, Silvia; Murphy, Thomas Brendan; Alfo, Marco
Affiliations: Sapienza University Rome; University College Dublin
Abstract: The Eurovision Song Contest is a popular TV singing competition held annually among country members of the European Broadcasting Union. In this competition, each member can be both contestant and jury, as it can participate with a song and/or vote for other countries' tunes. During the years, the voting system has repeatedly been accused of being biased by tactical voting; votes would represent strategic interests rather than actual musical preferences of the voting countries. In this work, we...
-
Authors: Relion, Jesus D. Arroyo; Kessler, Daniel; Levina, Elizaveta; Taylor, Stephan F.
Affiliations: Johns Hopkins University; University of Michigan System; University of Michigan; University of Michigan System; University of Michigan
Abstract: While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has sh...
-
Authors: Bertolacci, Michael; Cripps, Edward; Rosen, Ori; Lau, John W.; Cripps, Sally
Affiliations: University of Western Australia; University of Sydney; University of Texas System; University of Texas El Paso
Abstract: Daily precipitation has an enormous impact on human activity, and the study of how it varies over time and space, and what global indicators influence it, is of paramount importance to Australian agriculture. We analyze over 294 million daily rainfall measurements since 1876, spanning 17,606 sites across continental Australia. The data are not only large but also complex, and the topic would benefit from a common and publicly available statistical framework. We propose a Bayesian hierarchical ...
-
Authors: Marino, Maria Francesca; Ranalli, Maria Giovanna; Salvati, Nicola; Alfo, Marco
Affiliations: University of Florence; University of Perugia; University of Pisa
Abstract: The Italian National Institute for Statistics regularly provides estimates of unemployment indicators using data from the labor force survey. However, direct estimates of unemployment incidence cannot be released for local labor market areas. These are unplanned domains defined as clusters of municipalities; many are out-of-sample areas, and the majority is characterized by a small sample size which renders direct estimates inadequate. The empirical best predictor represents an appropriate, mo...
-
Authors: Dinsdale, Daniel; Salibian-Barrera, Matias
Affiliation: University of British Columbia
Abstract: In the last 25 years there has been an important increase in the amount of data collected from animal-mounted sensors (bio-probes) which are often used to study the animals' behaviour or environment. We focus here on an example of the latter, where the interest is in sea surface temperature (SST), and measurements are taken from sensors mounted on elephant seals in the southern Indian Ocean. We show that standard geostatistical models may not be reliable for this type of data, due to the possi...
-
Author: Fukuyama, Julia
Affiliations: Indiana University System; Indiana University Bloomington
Abstract: Exploratory analysis is an important first step for discovering latent structure and generating hypotheses in large biological data sets. However, when the number of variables is large compared to the number of samples, standard methods such as principal components analysis give results that are unstable and difficult to interpret. Here, we present adaptive generalized principal components analysis (adaptive gPCA), a new method that solves these problems by incorporating information about the ...