-
Author: Maugis, P. A.
Affiliations: University of London; University College London
Abstract: Subgraph counts, in particular the number of occurrences of small shapes such as triangles, characterize properties of random networks. As a result, they have seen wide use as network summary statistics. Subgraphs are typically counted globally, making existing approaches unable to describe vertex-specific characteristics. In contrast, rooted subgraphs focus on vertex neighbourhoods, and are fundamental descriptors of local network properties. We derive the asymptotic joint distribution of roo...
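To make the contrast between global and rooted counts concrete, here is a minimal sketch that counts, for each vertex, the triangles containing it. The function name `rooted_triangle_counts` and the toy edge list are illustrative assumptions, not taken from the paper, and the sketch covers triangles only, not general rooted subgraphs.

```python
from itertools import combinations


def rooted_triangle_counts(edges):
    """Return {vertex: number of triangles that contain that vertex}.

    Hypothetical helper for illustration; assumes an undirected simple graph
    given as an iterable of (u, v) pairs.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    counts = {}
    for root, neighbours in adjacency.items():
        # A triangle rooted at `root` is a pair of its neighbours that are
        # themselves adjacent.
        counts[root] = sum(
            1 for a, b in combinations(neighbours, 2) if b in adjacency[a]
        )
    return counts


# Toy graph: one triangle {0, 1, 2} plus a pendant edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
counts = rooted_triangle_counts(edges)

# Each triangle is counted once at each of its three vertices, so the
# global count is the sum of the rooted counts divided by three.
global_triangles = sum(counts.values()) // 3
```

The global count collapses the per-vertex information into a single number, which is why it cannot distinguish vertex 3 (in no triangle) from the others.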
-
Authors: Aue, Alexander; Burman, Prabir
Affiliations: University of California System; University of California Davis
Abstract: The accurate estimation of prediction errors in time series is an important problem, which has immediate implications for the accuracy of prediction intervals as well as the quality of a number of widely used time series model selection criteria such as the Akaike information criterion. Except for simple cases, however, it is difficult or even impossible to obtain exact analytical expressions for one-step and multi-step predictions. This may be one of the reasons that, unlike in the independen...
-
Authors: Cronie, Ottmar; Moradi, Mehdi; Biscio, Christophe A. N.
Affiliations: Chalmers University of Technology; Umea University; Aalborg University
Abstract: Motivated by the general ability of cross-validation to reduce overfitting and mean square error, we develop a cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure discrepancy between two point process...
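The thinning-based split described above can be sketched as follows. This is only an illustration under stated assumptions: it uses independent thinning with a fixed retention probability `p`, and the names `thinning_split` and `thinning_folds` are hypothetical; the paper's actual construction may differ.

```python
import random


def thinning_split(points, p, rng):
    """Split a point pattern by independent thinning (assumed scheme).

    Each point is sent to the validation set with probability p and to the
    training set otherwise, independently of all other points.
    """
    training, validation = [], []
    for point in points:
        if rng.random() < p:
            validation.append(point)
        else:
            training.append(point)
    return training, validation


def thinning_folds(points, p, k, seed=0):
    """Generate k (training, validation) pairs from the same pattern."""
    rng = random.Random(seed)
    return [thinning_split(points, p, rng) for _ in range(k)]


# Toy planar point pattern (made-up coordinates).
pattern = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.5), (0.7, 0.3), (0.9, 0.8)]
folds = thinning_folds(pattern, p=0.3, k=4)
```

Every pair partitions the original pattern, so a model fitted to a training set can be scored against the points held out in the matching validation set.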
-
Authors: Hong, Shaoxin; Jiang, Jiancheng; Jiang, Xuejun; Wang, Haofeng
Affiliations: Shandong University; University of North Carolina; University of North Carolina Charlotte; Southern University of Science & Technology
Abstract: It is routine practice in statistical modelling to first select variables and then make inference for the selected model as in stepwise regression. Such inference is made upon the assumption that the selected model is true. However, without this assumption, one would not know the validity of the inference. Similar problems also exist in high-dimensional regression with regularization. To address these problems, we propose a dimension-reduced generalized likelihood ratio test for generalized li...
-
Authors: Abadir, Karim M.; Lubrano, Michel
Affiliations: Imperial College London; Aix-Marseille Universite
Abstract: We show that least-squares cross-validation methods share a common structure that has an explicit asymptotic solution, when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student-t(nu) kernel, the cross-validation criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple and noniterative, thus leading to very fast computations; their integrated squared-error dominates tradi...
-
Authors: Goeman, Jelle J.; Solari, Aldo
Affiliations: Leiden University - Excl LUMC; Leiden University; Leiden University Medical Center (LUMC); University of Milano-Bicocca
Abstract: We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include basic data splitting as well as modern data-carving methods and post-selection inference methods ...
-
Authors: Lewis, R. M.; Battey, H. S.
Affiliation: Imperial College London
Abstract: Direct use of the likelihood function typically produces severely biased estimates when the dimension of the parameter vector is large relative to the effective sample size. With linearly separable data generated from a logistic regression model, the loglikelihood function asymptotes and the maximum likelihood estimator does not exist. We show that an exact analysis for each regression coefficient produces half-infinite confidence sets for some parameters when the data are separable. Such conc...
-
Authors: Maity, Subha; Dutta, Diptavo; Terhorst, Jonathan; Sun, Yuekai; Banerjee, Moulinath
Affiliations: University of Michigan System; University of Michigan; National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI); NIH National Cancer Institute - Division of Cancer Epidemiology & Genetics
Abstract: We present new models and methods for the posterior drift problem where the regression function in the target domain is modelled as a linear adjustment, on an appropriate scale, of that in the source domain, and study the theoretical properties of our proposed estimators in the binary classification problem. The core idea of our model inherits the simplicity and the usefulness of generalized linear models and accelerated failure time models from the classical statistics literature. Our approac...
-
Authors: Su, Yongchang; Li, Xinran
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Evaluating the treatment effect has become an important topic for many applications. However, most existing literature focuses mainly on average treatment effects. When the individual effects are heavy tailed or have outlier values, not only may the average effect not be appropriate for summarizing treatment effects, but also the conventional inference for it can be sensitive and possibly invalid due to poor large-sample approximations. In this paper we focus on quantiles of individual treatme...
-
Authors: Yu, X.; Zhu, J.
Affiliations: University of Michigan System; University of Michigan
Abstract: In many real-world networks, it is often observed that subgraphs or higher-order structures of certain configurations, e.g., triangles and bi-fans, are overly abundant compared to standard randomly generated networks. However, statistical models accounting for this phenomenon are limited, especially when community structure is of interest. This limitation is coupled with a lack of community detection methods that leverage subgraphs or higher-order structures. In this paper, we propose a new...