When does multicollinearity bias coefficients and cause type 1 errors? A reconciliation of Lindner, Puck, and Verbeke (2020) with Kalnins (2018)
Publication type:
Article
Author(s):
Kalnins, Arturs
Affiliation(s):
University of Iowa
Journal:
JOURNAL OF INTERNATIONAL BUSINESS STUDIES
ISSN/ISBN:
0047-2506
DOI:
10.1057/s41267-022-00531-9
Publication date:
2022
Pages:
1536-1548
Keywords:
regression analysis
multicollinearity
biased coefficients
type 1 error
data generating process
Abstract:
Lindner et al. (J Int Bus Stud 51:283-298, 2020; hereafter LPV) and Kalnins (Strateg Manag J 39(8):2362-2385, 2018) have published recent original analyses on multicollinearity, but their conclusions appear contradictory. LPV argue that multicollinearity does not affect the validity of regression coefficients, but only their reliability. In other words, multicollinearity does not bias coefficients, but only inflates standard errors. In Kalnins (2018), I conclude that multicollinearity may bias coefficients and cause type 1 errors (false positives). My goal here is to reconcile these two perspectives. I consider two data generating processes (DGPs) that create dependent variables (DVs) and apply them to specifications simulated by LPV. If the DV is generated by the Canonical DGP, that is, one which fully satisfies the Gauss-Markov assumptions, I show that previously derived econometric results generalize the conclusions of LPV. But if there are deviations from these assumptions, as in the case of a Common Factor DGP, multicollinearity acts as an amplifier of bias. I extend Kalnins' (2018) conclusions by analyzing LPV's specifications within the common factor context: in this case, incorporating all seemingly relevant, observable variables into a regression does not yield unbiased estimates. Coefficient estimates for variables of theoretical interest may be more accurate when correlated variables are omitted. While researchers may prefer estimates with or without a correlated variable included in a regression, both specifications should always be presented.
Source URL:
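
The contrast between the two DGPs described in the abstract can be illustrated with a small Monte Carlo sketch. This is not LPV's or Kalnins' actual simulation design: the factor loadings, sample size, and the assumption that two regressors x1 and x2 load equally on a single latent factor F are illustrative choices. Under the Canonical DGP (F influences the DV only through x1), the coefficient on the truly irrelevant x2 stays unbiased and is rejected at roughly the nominal 5% rate at any collinearity level; under the Common Factor DGP (F also enters the DV's error term), the x2 coefficient is biased away from zero, and the bias grows with corr(x1, x2).

import numpy as np

rng = np.random.default_rng(0)

def simulate(noise_sd, factor_in_dv, n=500, reps=2000, beta1=1.0):
    """Regress y on [1, x1, x2], where x2's true coefficient is zero.

    x1 and x2 share a latent common factor F. If factor_in_dv is True,
    F also enters the DV's error term (Common Factor DGP, violating
    Gauss-Markov exogeneity); otherwise the Canonical DGP holds.
    Smaller noise_sd means higher corr(x1, x2), i.e., more collinearity.
    Returns the mean estimate of x2's coefficient and the share of
    replications in which a nominal 5% t-test falsely rejects beta2 = 0.
    """
    b2_hats, false_positives = [], 0
    for _ in range(reps):
        F = rng.standard_normal(n)                   # latent common factor
        x1 = F + noise_sd * rng.standard_normal(n)
        x2 = F + noise_sd * rng.standard_normal(n)   # truly irrelevant regressor
        y = beta1 * x1 + (F if factor_in_dv else 0.0) + rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x1, x2])
        XtX_inv = np.linalg.inv(X.T @ X)
        b = XtX_inv @ X.T @ y                        # OLS estimates
        resid = y - X @ b
        se_b2 = np.sqrt(resid @ resid / (n - 3) * XtX_inv[2, 2])
        b2_hats.append(b[2])
        false_positives += int(abs(b[2] / se_b2) > 1.96)
    return np.mean(b2_hats), false_positives / reps

for label, factor_in_dv in (("Canonical DGP    ", False), ("Common Factor DGP", True)):
    for noise_sd in (2.0, 0.5):                      # corr(x1, x2) = 0.20 vs 0.80
        mean_b2, t1_rate = simulate(noise_sd, factor_in_dv)
        corr = 1.0 / (1.0 + noise_sd**2)             # population corr(x1, x2)
        print(f"{label} corr={corr:.2f}: mean b2_hat={mean_b2:+.3f}, "
              f"type-1 rate={t1_rate:.2f}")

In this parameterization, corr(x1, x2) = 1/(1 + noise_sd^2), so shrinking noise_sd raises multicollinearity without touching the DV equation; the asymptotic bias of the x2 coefficient under the Common Factor DGP works out to 1/(noise_sd^2 + 2). The Canonical rows should therefore print mean estimates near 0.00 with type 1 rates near 0.05 at both collinearity levels, while the Common Factor rows should print biased estimates (roughly +0.17 at corr 0.20 and +0.44 at corr 0.80) with false-positive rates far above 5%, consistent with the abstract's claim that multicollinearity amplifies bias only when the Gauss-Markov assumptions are violated.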