Anes codebook
5/5/2023

In the previous module we introduced bivariate regression for modeling the effect of a single independent variable X on our dependent variable Y. But what do we mean by “the effect”? Most fundamentally, we are interested in causality: what is the effect of changing X on Y? This is both a scientific question (how are they causally related?) and a policy question (how can we change outcome Y by changing policy or action X?). If X doesn’t actually cause Y, and instead both are caused by some third factor Z, then we might find a nice relationship between X and Y, but if we actually intervened and boosted X, Y wouldn’t change.

For instance, imagine we regress income on education level in a large, nationally representative sample, and find a statistically significant positive relationship. We might conclude that education boosts income, and from a policy point of view, you might conclude that if you go back to school and get a couple more years of education, your income will go up. But what if some third factor instead affects both? For instance, it may be that harder-working people both get more education and work their way up in any subsequent jobs to get higher pay. Or perhaps people with wealthy parents can both afford more education and have the social connections to get fancier jobs. In both of these cases, we might see a strong correlation and a significant \(\beta\) relating education to income, but we might find that, had we invested the extra years in education, our income would be no higher than that of our peers with equal work ethic and similarly wealthy parents. That is, in this scenario the education didn’t in fact cause the income boost; it only gave the appearance of doing so, because education was tracking the real causes, hard work and parental income.

This is why multiple regression is so important. With multiple regression, we can control for these other variables to estimate the effect of education on income separately from the effect of the variables we include (for instance, hard work and parental income). From a policy or causal point of view, we want to know what happens to our dependent variable (income) if we change just a single independent variable (education), leaving all the others constant.
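As a rough illustration of the confounding story in this section, the sketch below simulates a hypothetical population in which hard work and parental income drive both education and income, while education itself has no causal effect on income. All coefficients, the sample size, and the helper name `fit_ols` are assumptions made up for the example; they are not estimates from any real survey.

```python
import numpy as np


def fit_ols(predictors, y):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    # Design matrix: a column of ones for the intercept, then one column per predictor.
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


rng = np.random.default_rng(0)
n = 20_000  # hypothetical sample size

# Hypothetical data-generating process (all numbers are illustrative assumptions).
hard_work = rng.normal(size=n)
parental_income = rng.normal(size=n)

# Education is driven by the two confounders plus noise.
education = 0.8 * hard_work + 0.8 * parental_income + rng.normal(size=n)

# Income is driven by the same confounders; education's true effect is zero.
income = 1.0 * hard_work + 1.0 * parental_income + rng.normal(size=n)

# Bivariate regression: income on education only.
beta_bivariate = fit_ols([education], income)[1]

# Multiple regression: income on education, controlling for both confounders.
beta_controlled = fit_ols([education, hard_work, parental_income], income)[1]

print(f"education slope, bivariate:  {beta_bivariate:.3f}")
print(f"education slope, controlled: {beta_controlled:.3f}")
```

In this simulated setting the bivariate slope on education comes out clearly positive, while the slope from the multiple regression is close to zero, which matches how the data were generated. The controls only help because the simulation includes every confounder; with real data, any omitted common cause would still bias the estimate.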