-
Authors: Wang, Feifei; Xu, Shaodong; Qin, Yichen; Shen, Ye; Li, Yang
Affiliations: Renmin University of China; Renmin University of China; University System of Ohio; University of Cincinnati; University System of Georgia; University of Georgia
Abstract: Customer segmentation has wide applications in business activities, such as personalized marketing and targeted product development. To realize customer segmentation, clustering methods are commonly used. However, modern customer segmentation encounters challenges characterized by high dimensionality and mixed-type variables (i.e., the mixture of continuous variables and categorical variables). It brings great challenges to customer segmentation, because most existing clustering methods are onl...
-
Authors: Mulder, Joris; Hoff, Peter D.
Affiliations: Tilburg University; Duke University
Abstract: Directional relational event data, such as email data, often contain unicast messages (i.e., messages of one sender toward one receiver) and multicast messages (i.e., messages of one sender toward multiple receivers). The Enron email data that is the focus in this paper consists of 31% multicast messages. Multicast messages contain important information about the roles of actors in the network, which is needed for better understanding social interaction dynamics. In this paper a multiplicative...
-
Authors: Wang, Xin; Zhang, Jing
Affiliations: California State University System; San Diego State University; University System of Ohio; Miami University
Abstract: Motivated by the need to assess consistency in the outcomes of aquatic toxicity tests conducted by different labs at different time points, we propose a clustering of variance method in linear mixed models. The proposed method, referred to as CVM, is able to identify the cluster structure of the variances and estimate model parameters simultaneously. The method uses a penalized approach based on pairwise penalties to identify the cluster structure. We construct an optimization...
-
Authors: Wang, Yanzhao; Liu, Haitao; Zou, Jian; Ravishanker, Nalini
Affiliations: Worcester Polytechnic Institute; Worcester Polytechnic Institute; University of Connecticut
Abstract: In high-frequency financial data, dynamic patterns of transaction counts in regular time intervals provide crucial insights into market microstructure, such as short-term trading activities and intermittent intensities of price oscillation. In this paper we propose a Bayesian hierarchical framework that incorporates correlated latent level and temporal effects to model multivariate count data during intraday transaction intervals. Built on the INLA method for implementation, our framework prov...
-
Authors: Consagra, William; Cole, Martin; Qiu, Xing; Zhang, Zhengwu
Affiliations: Harvard University; Harvard Medical School; University of Rochester; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Brain structural networks are often represented as discrete adjacency matrices with elements summarizing the connectivity between pairs of regions of interest (ROIs). These ROIs are typically determined a priori using a brain atlas. The choice of atlas is often arbitrary and can lead to a loss of important connectivity information at the sub-ROI level. This work introduces an atlas-free framework that overcomes these issues by modeling brain connectivity using smooth random functions. In partic...
-
Authors: Huynh, Huu-dinh; Schofield, Matthew; Hwang, Wen-han
Affiliations: National Chung Hsing University; National Tsing Hua University
Abstract: We propose an enhanced site occupancy model for analyzing ecological detection/nondetection data obtained from multiple visits. The model distinguishes between abundance, occupancy, and detection probabilities. We allow for transient individuals through a community parameter, c, that characterizes the proportion of individuals fixed across visits. This parameter seamlessly transitions from the standard occupancy model (c = 0) to the N-mixture model (c = 1), enabling a more accurate analysis of ...
-
Authors: Williams, Jonathan P.; Hermansen, Gudmund H.; Strand, Havard; Clayton, Govinda; Nygard, Havard Mokleiv
Affiliations: North Carolina State University; University of Oslo; Swiss Federal Institutes of Technology Domain; ETH Zurich; Peace Research Institute Oslo (PRIO)
Abstract: A crucial challenge for solving problems in conflict research is in leveraging the semisupervised nature of the data that arise. Observed response data, such as counts of battle deaths over time, indicate latent processes of interest, such as intensity and duration of conflicts, but defining and labeling instances of these unobserved processes requires nuance and imprecision. The availability of such labels, however, would make it possible to study the effect of intervention-related predictors...
-
Authors: Li, Shaobo; Fan, Zhaohu; Liu, Ivy; Morrison, Philip S.; Liu, Dungang
Affiliations: University of Kansas; University System of Georgia; Georgia Institute of Technology; Victoria University Wellington; Victoria University Wellington; University System of Ohio; University of Cincinnati
Abstract: This paper is motivated by the analysis of a survey study focusing on college student well-being before and after the COVID-19 pandemic outbreak. A statistical challenge in well-being studies lies in the multidimensionality of outcome variables, recorded in various scales such as continuous, binary, or ordinal. The presence of mixed data complicates the examination of their relationships when adjusting for important covariates. To address this challenge, we propose a unifying framework for stu...
-
Authors: Li, Yujia; Liu, Peng; Wang, Wenjia; Ong, Wei; Fang, Yusi; Ren, Zhao; Tang, Lu; Celedon, Juan C.; Oesterreich, Steffi; Tseng, George C.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: With advances in high-throughput technology, molecular disease subtyping by high-dimensional omics data has been recognized as an effective approach for identifying subtypes of complex diseases with distinct disease mechanisms and prognoses. Conventional cluster analysis takes omics data as input and generates patient clusters with similar gene expression patterns. The omics data, however, usually contain multifaceted cluster structures that can be defined by different sets of genes. If the gen...
-
Authors: Hadj-Amar, Beniamino; Jewson, Jack; Vannucci, Marina
Affiliations: Rice University; Pompeu Fabra University
Abstract: We propose a sparse vector autoregressive (VAR) hidden semi-Markov model (HSMM) for modeling temporal and contemporaneous (e.g., spatial) dependencies in multivariate nonstationary time series. The HSMM's generic state distribution is embedded in a special transition matrix structure, facilitating efficient likelihood evaluations and arbitrary approximation accuracy. To promote sparsity of the VAR coefficients, we deploy an l1-ball projection prior, which combines differentiability wi...