Response best-subset selector for multivariate regression with high-dimensional response variables

成果类型:
Article
署名作者:
Hu, Jianhua; Huang, Jian; Liu, Xiaoqian; Liu, Xu
署名单位:
Shanghai University of Finance & Economics; Hong Kong Polytechnic University; York University - Canada
刊物名称:
BIOMETRIKA
ISSN/ISSBN:
0006-3444
DOI:
10.1093/biomet/asac037
发表日期:
2023
页码:
205223
关键词:
copy-number variation False Discovery Rate breast-cancer efficient estimation IMPACT Lasso
摘要:
This article investigates the statistical problem of response-variable selection with high-dimensional response variables and a diverging number of predictor variables with respect to the sample size in the framework of multivariate linear regression. A response best-subset selection model is proposed by introducing a 0-1 selection indicator for each response variable, and then a response best-subset selector is developed by introducing a separation parameter and a novel penalized least-squares function. The proposed procedure can perform response-variable selection and regression-coefficient estimation simultaneously, and the response best-subset selector has the property of model consistency under mild conditions for both fixed and diverging numbers of predictor variables. Also, consistency and asymptotic normality of regression-coefficient estimators are established for cases with a fixed dimension, and it is found that the Bonferroni test is a special response best-subset selector. Finite-sample simulations show that the response best-subset selector has strong advantages over existing competitors in terms of the Matthews correlation coefficient, a criterion that aims to balance accuracies for both true and false response variables. An analysis of real data demonstrates the effectiveness of the response best-subset selector in an application involving the identification of dosage-sensitive genes.