-
Authors: McCartan, Cory; Imai, Kosuke
Affiliations: Harvard University
Abstract: Random sampling of graph partitions under constraints has become a popular tool for evaluating legislative redistricting plans. Analysts detect partisan gerrymandering by comparing a proposed redistricting plan with an ensemble of sampled alternative plans. For successful application, sampling methods must scale to maps with a moderate or large number of districts, incorporate realistic legal constraints, and accurately and efficiently sample from a selected target distribution. Unfortunately, ...
-
Authors: Dai, Ben; Shen, Xiaotong; Chen, Lin Yee; Li, Chunlin; Pan, Wei
Affiliations: Chinese University of Hong Kong; University of Minnesota System; University of Minnesota Twin Cities
Abstract: In explainable artificial intelligence, discriminative feature localization is critical to reveal a black-box model's decision-making process from raw data to prediction. In this article, we use two real datasets, the MNIST handwritten digits and MIT-BIH electrocardiogram (ECG) signals, to motivate key characteristics of discriminative features, namely, adaptiveness, predictive importance, and effectiveness. Then we develop a localization framework, based on adversarial attacks, to effectively l...
-
Authors: De Iorio, Maria; Favaro, Stefano; Guglielmi, Alessandra; Ye, Lifeng
Affiliations: National University of Singapore; University of London; University College London
Abstract: The study of temporal dynamics of gender and ethnic stereotypes is an important topic in many disciplines at the intersection between statistics and social sciences. In this paper, we make use of word embeddings, a common tool in natural language processing, and of Bayesian nonparametric mixture modeling for the analysis of temporal dynamics of gender stereotypes in adjectives and occupation over the 20th and 21st centuries in the United States. Our Bayesian nonparametric approach relies on a no...
-
Authors: Huang, Theodore; Ploenzke, Matthew; Braun, Danielle
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health
Abstract: Pedigree data contain family history information that is used to analyze hereditary diseases. These clinical data sets may contain duplicate records due to the same family visiting a clinic multiple times or a clinician entering multiple versions of the family for testing purposes. Drawing inferences from the data, or using them for training or validation, without removing the duplicates could lead to invalid conclusions, and hence identifying the duplicates is essential. Since family structures c...
-
Authors: Zhang, Hong; Liu, Ming; Jin, Jiashun; Wu, Zheyang
Affiliations: Pfizer; Pfizer USA; Worcester Polytechnic Institute; Carnegie Mellon University
Abstract: The SNP-set analysis is a powerful tool for dissecting the genetics of complex human diseases. There are three fundamental genetic association approaches to SNP-set analysis: the marginal model fitting approach, the joint model fitting approach, and the decorrelation approach. A problem of primary interest is how these approaches compare with each other. To address this problem, we develop a theoretical platform to compare the signal-to-noise ratio (SNR) of these approaches under the generalize...
-
Authors: Yang, By yang; Deng, K. E.
Affiliations: Nankai University; Tsinghua University
Abstract: Discovering association patterns of items from a collection of baskets composed of different items is an important problem in various fields. Assuming that each basket is composed of themes of items randomly sampled from a theme dictionary, the theme dictionary model provides a general framework to achieve efficient association pattern discovery with statistical inference. This paper extends the original theme dictionary model by allowing more than one category of items in a basket and only p...
-
Authors: Xia, Lu; Nan, Bin; Li, Yi
Affiliations: University of Michigan System; University of Michigan; University of California System; University of California Irvine
Abstract: The Scientific Registry of Transplant Recipients (SRTR) system has become a rich resource for understanding the complex mechanisms of graft failure after kidney transplant, a crucial step for allocating organs effectively and implementing appropriate care. As transplant centers that treated patients might strongly confound graft failures, Cox models stratified by centers can eliminate their confounding effects. Also, since recipient age is a proven non-modifiable risk factor, a common practi...
-
Authors: Crawford, Amy M.; Ommen, Danica M.; Carriquiry, Alicia L.
Affiliations: Berry Consultants, LLC; Iowa State University
Abstract: Forensic handwriting examiners are often tasked with identifying the writer of a particular document. Examples of handwriting evidence include ransom notes, forged documents and signatures, and threatening letters. At present, examiners rely on visual inspection of similarities and differences between the questioned document and reference writing samples. Here, we propose a principled modeling approach to compute the posterior predictive probability of writership when the author of the questio...
-
Authors: Williams, Jonathan P.; Ommen, Danica M.; Hannig, Jan
Affiliations: North Carolina State University; Iowa State University; University of North Carolina; University of North Carolina Chapel Hill; National Institute of Standards & Technology (NIST) - USA
Abstract: One formulation of forensic identification-of-source problems is to determine the source of trace evidence, for instance, glass fragments found on a suspect for a crime. The current state of the science is to compute a Bayes factor comparing the marginal distribution of measurements of trace evidence under two competing propositions for whether or not the unknown source evidence originated from a specific source. The obvious problem with such an approach is the ability to tailor the prior dist...
-
Authors: Shi, Chengchun; Wan, Runzhe; Song, Ge; Luo, Shikai; Zhu, Hongtu; Song, Rui
Affiliations: University of London; London School Economics & Political Science; North Carolina State University; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Two-sided markets, such as ride-sharing companies, often involve a group of subjects who are making sequential decisions across time and/or location. With the rapid development of smartphones and the Internet of Things, they have substantially transformed the transportation landscape of human beings. In this paper, we consider large-scale fleet management in ride-sharing companies that involve multiple units in different areas receiving sequences of products (or treatments) over time. Major te...