-
Authors: Loh, Wei-Yin
Affiliations: University of Wisconsin System; University of Wisconsin Madison
Abstract: Besides serving as prediction models, classification trees are useful for finding important predictor variables and identifying interesting subgroups in the data. These functions can be compromised by weak split selection algorithms that have variable selection biases or that fail to search beyond local main effects at each node of the tree. The resulting models may include many irrelevant variables or select too few of the important ones. Either eventuality can lead to erroneous conclusions. ...
-
Authors: Scott, James G.
Affiliations: University of Texas System; University of Texas Austin
Abstract: This paper describes a framework for flexible multiple hypothesis testing of autoregressive time series. The modeling approach is Bayesian, though a blend of frequentist and Bayesian reasoning is used to evaluate procedures. Nonparametric characterizations of both the null and alternative hypotheses will be shown to be the key robustification step necessary to ensure reasonable Type-I error performance. The methodology is applied to part of a large database containing up to 50 years of corpora...
-
Authors: Qiu, Peihua; Yang, Rong; Potegal, Michael
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Bristol-Myers Squibb; University of Minnesota System; University of Minnesota Twin Cities
Abstract: Although anger is an important emotion that underlies much overt aggression at great social cost, little is known about how to quantify anger or to specify the relationship between anger and the overt behaviors that express it. This paper proposes a novel statistical model which provides both a metric for the intensity of anger and an approach to determining the quantitative relationship between anger intensity and the specific behaviors that it controls. From observed angry behaviors, we reco...
-
Authors: Szekely, Gabor J.; Rizzo, Maria L.
Affiliations: University System of Ohio; Bowling Green State University; Hungarian Academy of Sciences; HUN-REN; HUN-REN Alfred Renyi Institute of Mathematics
-
Authors: Rossell, David
Affiliations: Barcelona Institute of Science & Technology; Institute for Research in Biomedicine - IRB Barcelona
Abstract: Hierarchical models are a powerful tool for high-throughput data with a small to moderate number of replicates, as they allow sharing information across units of information, for example, genes. We propose two such models and show their increased sensitivity in microarray differential expression applications. We build on the gamma-gamma hierarchical model introduced by Kendziorski et al. [Statist. Med. 22 (2003) 3899-3914] and Newton et al. [Biostatistics 5 (2004) 155-176], by addressing importa...
-
Authors: Culp, Mark; Michailidis, George; Johnson, Kjell
Affiliations: West Virginia University; University of Michigan System; University of Michigan; Pfizer; Pfizer USA
Abstract: In many scientific settings data can be naturally partitioned into variable groupings called views. Common examples include environmental (1st view) and genetic (2nd view) information in ecological applications, and chemical (1st view) and biological (2nd view) data in drug discovery. Multi-view data also occur in text analysis and proteomics applications where one view consists of a graph with observations as the vertices and a weighted measure of pairwise similarity between observations as the e...
-
Authors: Yuan, Ming
Affiliations: University System of Georgia; Georgia Institute of Technology
Abstract: We consider nonparametric estimation of the state price density encapsulated in option prices. Unlike usual density estimation problems, we only observe option prices and their corresponding strike prices rather than samples from the state price density. We propose to model the state price density directly with a nonparametric mixture and estimate it using least squares. We show that although the minimization is taken over an infinite-dimensional function space, the minimizer always admits a...
-
Authors: Mandal, Abhyuday; Ranjan, Pritam; Wu, C. F. Jeff
Affiliations: University System of Georgia; University of Georgia; Acadia University; University System of Georgia; Georgia Institute of Technology
Abstract: Identifying promising compounds from a vast collection of feasible compounds is an important and yet challenging problem in the pharmaceutical industry. An efficient solution to this problem will help reduce the expenditure at the early stages of drug discovery. In an attempt to solve this problem, Mandal, Wu and Johnson [Technometrics 48 (2006) 273-283] proposed the SELC algorithm. Although powerful, it fails to extract substantial information from the data to guide the search efficiently, as...
-
Authors: Kim, Sungduk; Xi, Yingmei; Chen, Ming-Hui
Affiliations: National Institutes of Health (NIH) - USA; NIH Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD); Biogen; University of Connecticut
Abstract: To address an important risk classification issue that arises in clinical practice, we propose a new mixture model via latent cure rate markers for survival data with a cure fraction. In the proposed model, the latent cure rate markers are modeled via a multinomial logistic regression and patients who share the same cure rate are classified into the same risk group. Compared to available cure rate models, the proposed model fits better to data from a prostate cancer clinical trial. In addition...
-
Authors: Hong, Yili; Meeker, William Q.; McCalley, James D.
Affiliations: Iowa State University; Iowa State University
Abstract: Prediction of the remaining life of high-voltage power transformers is an important issue for energy companies because of the need for planning maintenance and capital expenditures. Lifetime data for such transformers are complicated because transformer lifetimes can extend over many decades and transformer designs and manufacturing practices have evolved. We were asked to develop statistically-based predictions for the lifetimes of an energy company's fleet of high-voltage transmission and di...