-
Authors: Li, Bing; Artemiou, Andreas; Li, Lexin
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Michigan Technological University; North Carolina State University
Abstract: We introduce a principal support vector machine (PSVM) approach that can be used for both linear and nonlinear sufficient dimension reduction. The basic idea is to divide the response variables into slices and use a modified form of support vector machine to find the optimal hyperplanes that separate them. These optimal hyperplanes are then aligned by the principal components of their normal vectors. It is proved that the aligned normal vectors provide an unbiased, root-n-consistent, and asymp...
-
Authors: Cai, T. Tony; Yuan, Ming
Affiliations: University of Pennsylvania; University System of Georgia; Georgia Institute of Technology
Abstract: The problem of estimating the mean of random functions based on discretely sampled data arises naturally in functional data analysis. In this paper, we study optimal estimation of the mean function under both common and independent designs. Minimax rates of convergence are established and easily implementable rate-optimal estimators are introduced. The analysis reveals interesting and different phase transition phenomena in the two cases. Under the common design, the sampling frequency solely ...
-
Authors: Huckemann, Stephan F.
Affiliations: University of Gottingen
Abstract: For planar landmark-based shapes, taking into account the non-Euclidean geometry of the shape space, a statistical test for a common mean first geodesic principal component (GPC) is devised which rests on one of two asymptotic scenarios. For both scenarios, strong consistency and central limit theorems are established, along with an algorithm for the computation of a Ziezold mean geodesic. In application, this allows one to verify the geodesic hypothesis for leaf growth of Canadian black poplars a...
-
Authors: Aswani, Anil; Bickel, Peter; Tomlin, Claire
Affiliations: University of California System; University of California Berkeley
Abstract: Collinearity and near-collinearity of predictors cause difficulties when doing regression. In these cases, variable selection becomes untenable because of mathematical issues concerning the existence and numerical stability of the regression coefficients, and interpretation of the coefficients is ambiguous because gradients are not defined. Using a differential geometric interpretation, in which the regression coefficients are interpreted as estimates of the exterior derivative of a function, ...
-
Authors: Koopmeiners, Joseph S.; Feng, Ziding
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Fred Hutchinson Cancer Center
Abstract: The receiver operating characteristic (ROC) curve, the positive predictive value (PPV) curve and the negative predictive value (NPV) curve are three measures of performance for a continuous diagnostic biomarker. The ROC, PPV and NPV curves are often estimated empirically to avoid assumptions about the distributional form of the biomarkers. Recently, there has been a push to incorporate group sequential methods into the design of diagnostic biomarker studies. A thorough understanding of the asy...
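To make the three empirical measures concrete, the sketch below computes one point on each curve for a single cutoff. It is an illustration only, not the estimators or the group sequential theory studied in the paper: the function name `empirical_measures`, the rule "test positive when the biomarker exceeds the cutoff", and the use of the sample prevalence for PPV and NPV are all assumptions.

```python
def empirical_measures(values, diseased, threshold):
    """Empirical TPR, FPR, PPV and NPV when 'value > threshold' means test-positive.

    values    -- biomarker measurements (hypothetical sample)
    diseased  -- matching 0/1 disease indicators
    threshold -- cutoff applied to the biomarker
    """
    tp = fp = tn = fn = 0
    for x, d in zip(values, diseased):
        positive = x > threshold
        if positive and d:
            tp += 1
        elif positive and not d:
            fp += 1
        elif not positive and d:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "tpr": ratio(tp, tp + fn),  # sensitivity: y-coordinate of the ROC curve
        "fpr": ratio(fp, fp + tn),  # 1 - specificity: x-coordinate of the ROC curve
        "ppv": ratio(tp, tp + fp),  # share of test-positives who are diseased
        "npv": ratio(tn, tn + fn),  # share of test-negatives who are not diseased
    }
```

Sweeping `threshold` over the observed biomarker values traces out the empirical curves one point at a time.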
-
Authors: Lai, Tze Leung; Gross, Shulamith T.; Shen, David Bo
Affiliations: Stanford University; City University of New York (CUNY) System; Baruch College (CUNY)
Abstract: Probability forecasts of events are routinely used in climate predictions, in forecasting default probabilities on bank loans or in estimating the probability of a patient's positive response to treatment. Scoring rules have long been used to assess the efficacy of the forecast probabilities after observing the occurrence, or nonoccurrence, of the predicted events. We develop herein a statistical theory for scoring rules and propose an alternative approach to the evaluation of probability fore...
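As a small illustration of what a scoring rule computes, the sketch below evaluates the Brier (quadratic) score, one widely used scoring rule. The abstract does not say which rules the paper treats, so the choice of rule, the function name `brier_score`, and the sample numbers are assumptions made for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 means every forecast was certain and correct.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and of equal length")
    total = sum((p - y) ** 2 for p, y in zip(forecasts, outcomes))
    return total / len(forecasts)


# Hypothetical example: three forecasts scored against the observed events.
score = brier_score([0.9, 0.2, 0.7], [1, 0, 0])
```

A forecaster who always says 0.5 scores 0.25 under this rule regardless of the outcomes, which gives a simple reference point when reading a score.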
-
Authors: Buecher, Axel; Dette, Holger; Volgushev, Stanislav
Affiliations: Ruhr University Bochum
Abstract: We propose a new class of estimators for Pickands dependence function which is based on the concept of minimum distance estimation. An explicit integral representation of the function A*(t), which minimizes a weighted L^2-distance between the logarithm of the copula C(y^(1-t), y^t) and functions of the form A(t) log(y), is derived. If the unknown copula is an extreme-value copula, the function A*(t) coincides with Pickands dependence function. Moreover, even if this is not the case, the fun...
-
Authors: Rohe, Karl; Chatterjee, Sourav; Yu, Bin
Affiliations: University of California System; University of California Berkeley
Abstract: Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method to discover these communities. The stochastic blockmodel [Social Networks 5 (1983) 109-137] ...
-
Authors: Pena, Edsel A.; Habiger, Joshua D.; Wu, Wensong
Affiliations: University of South Carolina System; University of South Carolina Columbia; Oklahoma State University System; Oklahoma State University - Stillwater
Abstract: Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied. The improvement over existing procedures such as the Sidak procedure for FWER control and the Benjamini-Hochberg (BH) procedure for FDR control is achieved by exploiting possible differences in the powers of the individual tests. Results signal the need to ...
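For orientation, the two existing baseline procedures named in the abstract can be sketched as follows. This shows the standard Sidak and Benjamini-Hochberg rules only, not the improved power-exploiting procedures the paper develops; the function names and the sample p-values are hypothetical.

```python
def sidak_reject(pvalues, alpha=0.05):
    """Sidak rule: reject H_i when p_i <= 1 - (1 - alpha)^(1/m)."""
    m = len(pvalues)
    if m == 0:
        return []
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    return [p <= threshold for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """BH step-up rule: find the largest rank k with p_(k) <= k * alpha / m
    and reject the k hypotheses with the smallest p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Hypothetical p-values from ten tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
sidak = sidak_reject(pvals)
bh = benjamini_hochberg_reject(pvals)
```

Both rules apply the same cutoff schedule to every test, regardless of how powerful each individual test is.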
-
Authors: Chu, Tingjin; Zhu, Jun; Wang, Haonan
Affiliations: Colorado State University System; Colorado State University Fort Collins; University of Wisconsin System; University of Wisconsin Madison
Abstract: We consider the problem of selecting covariates in spatial linear models with Gaussian process errors. Penalized maximum likelihood estimation (PMLE) that enables simultaneous variable selection and parameter estimation is developed and, for ease of computation, PMLE is approximated by one-step sparse estimation (OSE). To further improve computational efficiency, particularly with large sample sizes, we propose penalized maximum covariance-tapered likelihood estimation (PMLET) and its one-ste...