Multicollinearity: How common factors cause Type 1 errors in multivariate regression

Document type:
Article
Author:
Kalnins, Arturs
Affiliation:
University of Iowa
Journal:
STRATEGIC MANAGEMENT JOURNAL
ISSN/ISBN:
0143-2095
DOI:
10.1002/smj.2783
Publication date:
2018
Pages:
2362-2385
Keywords:
analytic model; econometrics; multicollinearity; multivariate regression; research methods
Abstract:
Research Summary: In multivariate regression analyses of correlated variables, we sometimes observe pairs of estimated beta coefficients that are large in absolute magnitude and opposite in sign. T-statistics are also large, suggesting meaningful findings. I found 64 recently published Strategic Management Journal articles with results exhibiting these characteristics. In this article, I demonstrate that such results may be Type 1 errors (false positives): If regressors are correlated via an unobservable common factor, estimated beta coefficients will misleadingly tend toward infinite magnitudes in opposite directions, even if the variables' real effects are small and of the same sign. Diagnostics such as Variance Inflation Factors (VIF) will misleadingly validate Type 1 errors as legitimate results. After establishing general results via mathematical analysis and simulation, I provide guidelines for detection and mitigation.
Managerial Summary: This article demonstrates mathematically how regression analyses with correlated independent variables may generate beta coefficients of opposite sign to the variables' true effects. To assess the likelihood of this possibility, I propose that if (a) the absolute correlation of two independent variables is about 0.3 or more (smaller correlations may be problematic for large data sets), (b) the two variables have beta coefficients of opposite sign if correlated positively, or of the same sign if correlated negatively, and (c) the bivariate correlation of one independent variable with the dependent variable is of the opposite sign from its beta coefficient, then the beta might be a false positive. To facilitate such analysis, authors should provide complete correlation tables, including dependent variables, interaction terms, and quadratic terms.
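The three-part screen (a) to (c) described in the abstract can be sketched in Python with NumPy. This is only an illustrative reading of the abstract, not the article's own code: the function name `screen_beta_pair`, the returned dictionary keys, and the use of a plain two-regressor OLS fit with an intercept are all assumptions made for this example. The 0.3 default is the approximate threshold the abstract mentions and is left as a parameter, since the abstract notes that smaller correlations may matter in large data sets.

```python
import numpy as np


def screen_beta_pair(x1, x2, y, corr_threshold=0.3):
    """Hypothetical helper: apply conditions (a)-(c) to one pair of regressors.

    Returns a dict with the three condition flags and an overall 'flag'
    meaning "this beta pair might contain a false positive".
    """
    x1, x2, y = (np.asarray(v, dtype=float) for v in (x1, x2, y))

    # Bivariate (Pearson) correlations among the regressors and the outcome.
    r12 = np.corrcoef(x1, x2)[0, 1]
    r1y = np.corrcoef(x1, y)[0, 1]
    r2y = np.corrcoef(x2, y)[0, 1]

    # OLS of y on an intercept, x1 and x2 (assumed model for this sketch).
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1, b2 = coef[1], coef[2]

    # (a) the two regressors are at least moderately correlated.
    cond_a = abs(r12) >= corr_threshold
    # (b) betas of opposite sign if r12 > 0, of the same sign if r12 < 0;
    #     both cases are equivalent to b1 * b2 * r12 < 0.
    cond_b = b1 * b2 * r12 < 0
    # (c) for at least one regressor, its correlation with y has the
    #     opposite sign from its beta.
    cond_c = (b1 * r1y < 0) or (b2 * r2y < 0)

    return {
        "r12": float(r12),
        "betas": (float(b1), float(b2)),
        "a": bool(cond_a),
        "b": bool(cond_b),
        "c": bool(cond_c),
        "flag": bool(cond_a and cond_b and cond_c),
    }
```

A raised flag says only that the pattern matches the abstract's warning signs. As the abstract puts it, the beta "might be" a false positive, so the result is a prompt for closer inspection rather than a verdict.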