Adjusted R-Squared

A version of R-squared that penalizes models for adding predictors that do not improve fit enough.

Ordinary R-squared never falls when a regressor is added to a least-squares model, even if the new variable contributes almost nothing. Adjusted R-squared corrects for this by penalizing explanatory variables that do not improve fit enough, which makes it a more useful yardstick for comparing specifications of different sizes.


The Formula

For a regression with n observations and k regressors excluding the intercept:

\[ \bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k - 1} \]

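The penalty comes from the degrees-of-freedom term: each added regressor raises k, which shrinks the denominator n - k - 1 and inflates the factor multiplying (1 - R^2). If a new variable raises ordinary R^2 only trivially, adjusted R^2 can fall. For a single added regressor, adjusted R^2 rises only when the absolute value of that regressor's conventional OLS t-statistic exceeds 1.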
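As a minimal sketch of the formula above, the following Python function computes adjusted R-squared from an ordinary R-squared, n, and k. The function name `adjusted_r_squared` and the numbers in the example are hypothetical, chosen only to illustrate a case where a small gain in R-squared does not offset the lost degree of freedom.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k regressors (intercept excluded)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for positive residual degrees of freedom")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical numbers: with n = 50, a fourth regressor lifts R^2 from 0.500 to 0.502.
before = adjusted_r_squared(0.500, n=50, k=3)
after = adjusted_r_squared(0.502, n=50, k=4)

print(round(before, 4), round(after, 4))  # 0.4674 0.4577
```

In this illustration ordinary R-squared rises slightly, yet adjusted R-squared falls from about 0.467 to about 0.458 because the extra regressor uses up a degree of freedom.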

Why Economists Use It

Adjusted R-squared is a quick in-sample diagnostic when comparing models estimated on the same dependent variable and dataset. It helps answer a practical question: did the added complexity buy enough explanatory improvement to justify itself?

This is especially useful in exploratory regression work where there is a temptation to keep adding predictors.
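The comparison described above can be sketched with simulated data. The example below assumes NumPy is available and uses a hypothetical helper, `fit_r2`, that fits OLS with an intercept by least squares. The data-generating process, the sample size, and the five irrelevant regressors are all made up for illustration and are not drawn from any real dataset.

```python
import numpy as np


def fit_r2(X, y):
    """Fit OLS with an intercept; return (R^2, adjusted R^2)."""
    n, k = X.shape  # k counts regressors excluding the intercept
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj


rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=(n, 1))                      # one relevant regressor
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)     # assumed true model
noise = rng.normal(size=(n, 5))                  # regressors unrelated to y

# Same dependent variable, same sample, two specifications.
r2_small, adj_small = fit_r2(x, y)
r2_big, adj_big = fit_r2(np.column_stack([x, noise]), y)

print(f"R^2:          {r2_small:.4f} -> {r2_big:.4f}")
print(f"adjusted R^2: {adj_small:.4f} -> {adj_big:.4f}")
```

Ordinary R-squared cannot fall when the extra columns are added, so the first line always shows a weak increase. Whether adjusted R-squared rises or falls depends on the particular draw, which is the point of using it as a screening statistic.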

What It Does Not Do

Adjusted R-squared is not a full model-selection rule. It does not replace theory, diagnostic checking, or out-of-sample validation. A model with a higher adjusted R-squared may still have unstable coefficients, bad functional form, or poor predictive performance.

So it is best used as one screening tool alongside economic reasoning and other statistics.

Knowledge Check

### Why can adjusted R-squared be more informative than ordinary R-squared?

- [x] Because it penalizes adding regressors that do not improve fit enough
- [ ] Because it is always larger than R-squared
- [ ] Because it measures causal effects directly
- [ ] Because it ignores sample size

> **Explanation:** Ordinary R-squared never decreases when variables are added, but adjusted R-squared can fall if the added variable adds little explanatory power.

### When is adjusted R-squared most appropriate as a comparison tool?

- [ ] When comparing unrelated dependent variables across different datasets
- [x] When comparing models for the same dependent variable on the same sample
- [ ] When replacing all regression diagnostics
- [ ] When testing market efficiency directly

> **Explanation:** The statistic is most meaningful for like-for-like model comparisons.

### Why should adjusted R-squared not be the only model-selection criterion?

- [ ] Because it cannot be written mathematically
- [ ] Because it always chooses the largest model
- [x] Because fit alone does not guarantee good specification, theory, or predictive performance
- [ ] Because it is defined only for time series

> **Explanation:** Econometric model choice also requires theory, residual diagnostics, and often out-of-sample evidence.