-
Authors: Rios, Nicholas; Xue, Lingzhou; Zhan, Xiang
Affiliations: George Mason University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Peking University
Abstract: It is quite common to encounter compositional data in a regression framework in data analysis. When both responses and predictors are compositional, most existing models rely on a family of log-ratio based transformations to move the analysis from the simplex to the reals. This often makes the interpretation of the model more complex. A transformation-free regression model was recently developed, but it only allows for a single compositional predictor. However, many datasets include multiple c...
-
Authors: Zhang, Siliang; Kuha, Jouni; Steele, Fiona
Affiliations: East China Normal University; University of London; London School of Economics & Political Science
Abstract: We define a model for the joint distribution of multiple continuous latent variables, which includes a model for how their correlations depend on explanatory variables. This is motivated by and applied to social scientific research questions in the analysis of intergenerational help and support within families, where the correlations describe reciprocity of help between generations and complementarity of different kinds of help. We propose an MCMC procedure for estimating the model which maint...
-
Authors: Goncalves, Jussiane Nader; Barreto-Souza, Wagner; Ombao, Hernando
Affiliations: Universidade Federal de Minas Gerais; University College Dublin; King Abdullah University of Science & Technology
Abstract: In this paper we study the number of inpatient admissions by individuals to hospital emergency rooms reported by the 2003 Medical Expenditure Panel Survey (MEPS), which the United States Agency for Health Research and Quality conducts. Explanatory variables such as health status, access, use, and costs of health services in the U.S.A. are considered. Our main goal is to properly model the number of inpatient admissions, according to the geographical U.S. regions, as a tool for measuring the vo...
-
Authors: Hasler, Jill; Ma, Yanyuan; Wei, Yizheng; Parikh, Ravi; Chen, Jinbo
Affiliations: Fox Chase Cancer Center; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of South Carolina System; University of South Carolina Columbia; University of Pennsylvania
Abstract: When using electronic health records (EHRs) for clinical and translational research, additional data is often available from external sources to enrich the information extracted from EHRs. For example, academic biobanks have more granular data available, and patient reported data is often collected through small-scale surveys. It is common that the external data is available only for a small subset of patients who have EHR information. We propose efficient and robust methods for building and e...
-
Authors: Zhou, Qinyi; Zuo, Chandler; Zhang, Yuannyu; Chen, Min; Xu, Jian; Shin, Sunyoung
Affiliations: University of Texas System; University of Texas Dallas; St Jude Children's Research Hospital; Pohang University of Science & Technology (POSTECH)
Abstract: Mutations in the noncoding DNA, which represents approximately 99% of the human genome, have been crucial to understanding disease mechanisms through dysregulation of disease-associated genes. One key element in gene regulation that noncoding mutations mediate is the binding of proteins to DNA sequences. Insertion and deletion of bases (InDels) are the second most common type of mutations, following single nucleotide polymorphisms, that may impact protein-DNA binding. However, no existing meth...
-
Authors: Sandholtz, Nathan; Wu, Lucas; Puterman, Martin; Chan, Timothy C. Y.
Affiliations: Brigham Young University; University of British Columbia; University of Toronto
Abstract: For decades National Football League (NFL) coaches' observed fourth down decisions have been largely inconsistent with prescriptions based on statistical models. In this paper we develop a framework to explain this discrepancy using an inverse optimization approach. We model the fourth down decision and the subsequent sequence of plays in a game as a Markov decision process (MDP), the dynamics of which we estimate from NFL play-by-play data from the 2014 through 2022 seasons. We assume that coa...
-
Authors: Xu, Shuntuo; Yu, Zhou; Ming, Jingsi
Affiliations: East China Normal University
Abstract: Effective integration of single-cell data can facilitate the discovery of cell-type specific gene expression patterns and cellular interactions, ultimately leading to a better understanding of various biological processes and diseases. However, datasets from different platforms, species, and modalities exhibit various levels of heterogeneities, posing significant challenges in data alignment using a unified approach. Here we propose DeepMap, a flexible and efficient method for single-cell data...
-
Authors: Yu, Miaomiao; Jiang, Zhongfeng; Li, Jiaxuan; Zhou, Yong
Affiliations: East China Normal University; Chinese Academy of Sciences; Academy of Mathematics & System Sciences, CAS
Abstract: Personal credits have always been a hot topic in the society. Among all of them, the evaluation of default risk is particularly concerned since robust estimation, based on personal information, can both help needy individuals to get loans and financial institutions to avoid losses. So far, there have been no good solutions due to limited data, especially default information. With the advent of the era of big data, it is possible to improve the effectiveness of estimates by using auxiliary info...
-
Authors: Ge, Lin; Zhang, Yuzi; Waller, Lance; Lyles, Robert
Affiliations: Emory University
Abstract: Monitoring key elements of disease dynamics (e.g., prevalence, case counts) is of great importance in infectious disease prevention and control, as emphasized during the COVID-19 pandemic. To facilitate this effort, we propose a new capture-recapture (CRC) analysis strategy that adjusts for misclassification stemming from the use of easily administered but imperfect diagnostic test kits, such as rapid antigen test-kits or saliva tests. Our method is based on a recently proposed anchor stream d...
-
Authors: Tuzhilina, Elena; Hastie, Trevor; Segal, Mark
Affiliations: University of Toronto; Stanford University; University of California System; University of California Irvine
Abstract: Reconstructing three-dimensional (3D) chromatin structure from conformation capture assays (such as Hi-C) is a critical task in computational biology, since chromatin spatial architecture plays a vital role in numerous cellular processes and direct imaging is challenging. Most existing algorithms that operate on Hi-C contact matrices produce reconstructed 3D configurations in the form of a polygonal chain. However, none of the methods exploit the fact that the target solution is a (smooth) cur...