War and Wages: The Strength of Instrumental Variables and Their Sensitivity to Unobserved Biases
Publication type:
Article
Authors:
Small, Dylan S.; Rosenbaum, Paul R.
Affiliation:
University of Pennsylvania
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214507000001247
Publication date:
2008
Pages:
924-933
Keywords:
confidence-intervals
design sensitivity
regression
identification
adjustment
inference
earnings
smoking
sample
Abstract:
An instrument manipulates a treatment that it does not entirely control, but the instrument affects the outcome only indirectly through its manipulation of the treatment. The idealized prototype is the randomized encouragement design, in which subjects are randomly assigned to receive either encouragement to accept the treatment or no such encouragement, but not all subjects comply by doing what they are encouraged to do, and the situation is such that only the treatment itself, not disregarded encouragement alone, can affect the outcome. An instrument is weak if it has only a slight impact on acceptance of the treatment, that is, if most people disregard encouragement to accept the treatment. Typical applications of instrumental variables are not ideal; encouragement is not randomized, although it may be assigned in a far less biased manner than the treatment itself. Using the concept of design sensitivity, we study the sensitivity of instrumental variable analyses to departures from the ideal of random assignment of encouragement, with particular reference to the strength of the instrument. With these issues in mind, we reanalyze a clever study by Angrist and Krueger concerning the effects of military service during World War II on subsequent earnings, in which cohorts of very similar but not identical age were differently encouraged to serve in the war. A striking feature of this example is that those who served earned more, but the effect of service on earnings appears to be negative; that is, the instrumental variables analysis reverses the sign of the naive comparison. For expository purposes, this example has the convenient feature of enabling, by selecting different birth cohorts, the creation of instruments of varied strength, from extremely weak to fairly strong, although separated by the same time interval and thus perhaps similarly biased. No matter how large the sample size becomes, even if the effect under study is quite large, studies with weak instruments are extremely sensitive to tiny biases, whereas studies with stronger instruments can be insensitive to moderate biases.
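Illustrative sketch (not part of the original record): the abstract makes two points that a toy simulation may help make concrete, namely that an instrumental variables estimate can reverse the sign of a naive comparison, and that a weak instrument magnifies small biases. The Python below is a minimal sketch under stated assumptions. It uses the simple Wald ratio estimator rather than the authors' design sensitivity analysis, and every name and parameter value in it (`simulate`, `strength`, `selection`, `bias`, the 0.5 threshold) is hypothetical and chosen only for illustration. No data from the Angrist and Krueger study is used.

```python
import random
from statistics import fmean


def simulate(n, strength, effect=-1.0, selection=2.0, bias=0.0, seed=0):
    """Simulate a toy encouragement design (all values are hypothetical).

    z: encouragement (the instrument), assigned by a fair coin flip
    u: unobserved trait that drives both treatment uptake and the outcome
    d: treatment actually received; more likely when encouraged
    y: outcome; depends on d and u, and directly on z only if bias != 0
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5
        u = rng.gauss(0.0, 1.0)
        d = (u + strength * z) > 0.5
        y = effect * d + selection * u + bias * z + rng.gauss(0.0, 1.0)
        rows.append((z, d, y))
    return rows


def naive_difference(rows):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = fmean(y for _, d, y in rows if d)
    untreated = fmean(y for _, d, y in rows if not d)
    return treated - untreated


def wald_estimate(rows):
    """Wald ratio: effect of z on the outcome divided by effect of z on uptake."""
    y1 = fmean(y for z, _, y in rows if z)
    y0 = fmean(y for z, _, y in rows if not z)
    d1 = fmean(float(d) for z, d, _ in rows if z)
    d0 = fmean(float(d) for z, d, _ in rows if not z)
    return (y1 - y0) / (d1 - d0)


def large_sample_bias(bias, compliance_gap):
    """Large-sample shift in the Wald ratio when z moves the outcome by
    `bias` directly: the shift is the bias divided by the compliance gap."""
    return bias / compliance_gap


if __name__ == "__main__":
    rows = simulate(n=100_000, strength=1.5)
    print("naive difference:", naive_difference(rows))
    print("Wald estimate:   ", wald_estimate(rows))
    # Same small bias, two hypothetical compliance gaps (strong vs. weak).
    print("shift, strong:", large_sample_bias(0.05, 0.5))
    print("shift, weak:  ", large_sample_bias(0.05, 0.02))
```

In this toy setup the true effect is set to -1, yet subjects who take the treatment tend to have higher values of the unobserved trait, so the naive comparison comes out positive while the Wald ratio recovers a negative effect. The helper `large_sample_bias` shows the arithmetic behind weak-instrument fragility in this simplified model: a direct bias of 0.05 shifts the estimate by 0.05 / 0.5 = 0.1 when encouragement changes uptake by 50 percentage points, but by 0.05 / 0.02 = 2.5 when it changes uptake by only 2 points, and a larger sample does not shrink that shift.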