-
Authors: Moon, Haeun; Du, Jin-Hong; Lei, Jing; Roeder, Kathryn
Affiliations: Seoul National University (SNU); Carnegie Mellon University
Abstract: Quantitative measurements produced by mass spectrometry proteomics experiments offer a direct way to explore the role of proteins in molecular mechanisms. However, analysis of such data is challenging due to the large proportion of missing values. A common strategy to address this issue is to utilize an imputed dataset, which often introduces systematic bias into downstream analyses if the imputation errors are ignored. In this paper we propose a statistical framework, inspired by doubly robus...
-
Authors: Sakitis, Chase J.; Rowe, Daniel B.
Affiliations: Marquette University
Abstract: In fMRI, capturing brain activation during a task is dependent on how quickly k-space arrays are obtained. Acquiring the full k-space arrays that make up volume images, which are reconstructed into images using the inverse Fourier transform (IFT), can take a considerable amount of scan time. Undersampling k-space reduces the acquisition time but results in aliased, or folded, images. Generalized autocalibrating partially parallel acquisitions (GRAPPA) is a parallel imaging technique that yields full images from subsampled arrays of k-space. GRAPPA uses localized interpola...
-
Authors: Su, Siqiang; Li, Zhenghao; Feng, Long; Li, Ting
Affiliations: University of Hong Kong; Hong Kong Polytechnic University
Abstract: Imaging genetics is a growing field that employs structural or functional neuroimaging techniques to study individuals with genetic risk variants potentially linked to specific illnesses. This area presents considerable challenges to statisticians due to the heterogeneous information and different data forms it involves. In addition, both imaging and genetic data are typically high-dimensional, creating a "big data squared" problem. Moreover, brain imaging data contains extensive spatial informa...
-
Authors: Bargagli-Stoffi, Falco J.; Tortu, Costanza; Forastiere, Laura
Affiliations: University of California System; University of California Los Angeles; Scuola Superiore Sant'Anna; Yale University
Abstract: The bulk of causal inference studies rule out the presence of interference between units. However, in many real-world scenarios, units are interconnected by social, physical, or virtual ties, and the effect of the treatment can spill from one unit to other connected individuals in the network. In this paper, we develop a machine learning method that uses tree-based algorithms and a Horvitz-Thompson estimator to assess the heterogeneity of treatment and spillover effects with respect to individ...
-
Authors: Huang, Melody
Affiliations: Yale University
Abstract: Estimating externally valid causal effects is a foundational problem in the social and biomedical sciences. Generalizing or transporting causal estimates from an experimental sample to a target population of interest relies on an overlap (or positivity) assumption between the experimental sample and the target population. In practice, having full overlap between an experimental sample and a target population can be implausible. In the following paper, we introduce a framework for considering e...
-
Authors: Haidar-Wehbe, Sami; Emerson, Samuel R.; Aslett, Louis J. M.; Liley, James
Affiliations: Durham University
Abstract: Predictive risk scores for adverse outcomes are increasingly crucial in guiding health interventions. Such scores may need to be periodically updated due to changes in the distributions they model. However, directly updating risk scores used to guide intervention can lead to biased risk estimates. To address this, we propose updating using a holdout set, a subset of the population that does not receive interventions guided by the risk score. Balancing the holdout set size is essential to ensure...
-
Authors: Holthuijzen, Maike F.; Gramacy, Robert B.; Carey, Cayelan C.; Higdon, David M.; Thomas, R. Quinn
Affiliations: Virginia Polytechnic Institute & State University
Abstract: We present a novel forecasting framework for lake water temperature, which is crucial for managing lake ecosystems and drinking water resources. The General Lake Model (GLM) has been previously used for this purpose, but, similar to many process-based simulation models, it requires a large number of inputs (many of which are stochastic), presents challenges for uncertainty quantification (UQ), and can exhibit model bias. To address these issues, we propose a Gaussian process (GP) surrogate-bas...
-
Authors: Pramanik, Sandipan; Zeger, Scott; Blau, Dianna; Datta, Abhirup
Affiliations: Johns Hopkins University; Centers for Disease Control & Prevention - USA
Abstract: Verbal autopsy (VA) algorithms are routinely used to determine individual-level causes of death (COD) in many low- and middle-income countries. The individual CODs are then aggregated to derive population-level cause-specific mortality fractions (CSMF), which are essential to informing public health policies. However, VA algorithms frequently misclassify COD and introduce bias in CSMF estimates. A recent method, VA-calibration, can correct for this bias using a VA misclassification rate matrix ...
-
Authors: Zhan, Zishu; Liu, Zhishuai; Lin, Cunjie; Yi, Danhui; Liu, Jian; Yang, Yufei
Affiliations: Southern Medical University - China; Duke University; Renmin University of China; Beijing University of Chinese Medicine
Abstract: Dynamic treatment regimes (DTRs) represent sequential decision rules for multiple intervention stages. Each rule maps patients' covariates to optional treatments. The optimal dynamic treatment regime is the one that maximizes the mean outcome of interest if followed by the overall population. Motivated by a clinical study on the treatment of advanced colorectal cancer with traditional Chinese medicine, we propose a censored C-learning (CC-learning) method to estimate the DTR with multiple trea...
-
Authors: Ibrahim, Shibal; Radchenko, Peter; Ben-David, Emanuel; Mazumder, Rahul
Affiliations: Massachusetts Institute of Technology (MIT); University of Sydney
Abstract: In this paper we consider the problem of predicting survey response rates using a family of flexible and interpretable nonparametric models. The study is motivated by the U.S. Census Bureau's well-known ROAM application, which uses a linear regression model trained on the U.S. Census Planning Database data to identify hard-to-survey areas. A crowdsourcing competition (Public Opin. Q. 81 (2016) 144-156) organized more than 10 years ago revealed that machine learning methods, based on ensembles ...