Interpolation

Estimating unknown values between observed data points (for example filling missing values within a time series).

Interpolation is the practice of estimating values between observed data points. Economists use it to fill gaps in datasets (for example missing months inside a time series), but it can also introduce bias if the interpolation method imposes patterns that are not truly in the data.

Linear interpolation (common baseline)

Suppose you observe \((x_0,y_0)\) and \((x_1,y_1)\) with \(x_0 < x < x_1\). Linear interpolation estimates \(y(x)\) by drawing a straight line between the two points:

\[ \hat y(x) = y_0 + (y_1 - y_0)\,\frac{x-x_0}{x_1-x_0}. \]

This is simple and transparent, but it assumes the variable changes smoothly at a constant rate between observations.
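
As a minimal sketch of the formula above, the function below computes the straight-line estimate between two observed points. The function name `linear_interpolate` and the index values are illustrative assumptions, not part of any particular library or dataset.

```python
def linear_interpolate(x, x0, y0, x1, y1):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("x0 and x1 must differ")
    # Share of the distance from x0 to x1 that has been covered at x.
    weight = (x - x0) / (x1 - x0)
    return y0 + (y1 - y0) * weight


# Hypothetical data: an index observed at 100.0 in period 0 and 108.0 in period 4.
estimate = linear_interpolate(1, 0, 100.0, 4, 108.0)
print(estimate)  # 102.0
```

The estimate moves one quarter of the way from 100.0 to 108.0 because period 1 is one quarter of the way from period 0 to period 4.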

Interpolation vs extrapolation

  • Interpolation: estimates inside the observed range.
  • Extrapolation: extends beyond the observed range (typically riskier).
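
One way to make this distinction explicit in code is to refuse to return an estimate outside the observed range. The sketch below assumes the x-values are sorted in strictly increasing order; the function name `interpolate_within_range` and the sample data are hypothetical.

```python
def interpolate_within_range(x, xs, ys):
    """Linearly interpolate at x, refusing to extrapolate beyond the observed xs.

    Assumes xs is sorted in strictly increasing order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the observed range (extrapolation)")
    # Find the pair of neighbouring observations that brackets x.
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            return ys[i] + (ys[i + 1] - ys[i]) * (x - xs[i]) / (xs[i + 1] - xs[i])


# Hypothetical observations in 2019, 2021 and 2023.
years = [2019, 2021, 2023]
values = [10.0, 14.0, 15.0]

inside = interpolate_within_range(2020, years, values)  # interpolation
print(inside)  # 12.0

try:
    interpolate_within_range(2025, years, values)  # would be extrapolation
except ValueError as error:
    outside_message = str(error)
    print(outside_message)
```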

Why interpolation can be risky in economics

Interpolation can:

  • smooth away volatility (important for business-cycle analysis),
  • distort dynamics (autocorrelation and persistence),
  • create false precision if treated as real observed data in regressions.

A good practice is to flag interpolated values, test sensitivity to alternative methods, and avoid interpolating variables where the underlying process is known to be jumpy (policy changes, discrete shocks).
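
The flagging and sensitivity checks described above might look like the following sketch. It fills interior gaps (marked `None`) with either linear interpolation or the previous observed value, and returns a parallel list of flags so that filled values can be identified later. The function name `fill_gaps`, the method labels, and the data are illustrative assumptions.

```python
def fill_gaps(series, method="linear"):
    """Fill interior None gaps and return (filled, flags).

    flags[i] is True only where filled[i] is an estimate, not an observation.
    Leading and trailing gaps are left as None (no extrapolation).
    """
    if method not in ("linear", "previous"):
        raise ValueError("method must be 'linear' or 'previous'")
    filled = list(series)
    flags = [False] * len(series)
    observed = [i for i, value in enumerate(series) if value is not None]
    # Walk over each pair of neighbouring observations and fill between them.
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            if method == "linear":
                weight = (i - left) / (right - left)
                filled[i] = series[left] + (series[right] - series[left]) * weight
            else:
                filled[i] = series[left]  # carry the last observation forward
            flags[i] = True
    return filled, flags


# Hypothetical series with one missing period.
raw = [100.0, None, 104.0, 103.0]

linear_filled, linear_flags = fill_gaps(raw, method="linear")
previous_filled, _ = fill_gaps(raw, method="previous")

print(linear_filled)    # [100.0, 102.0, 104.0, 103.0]
print(previous_filled)  # [100.0, 100.0, 104.0, 103.0]
print(linear_flags)     # [False, True, False, False]
```

If a downstream result changes materially between the two filled series, that suggests the result depends on the filling method rather than on the observed data.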

Practical example

If a quarterly variable is converted to a monthly series by interpolation, month-to-month variation may be mechanically imposed rather than measured. This can affect estimated relationships in monthly regressions or forecasting models.
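
To illustrate, the sketch below converts a hypothetical quarterly series to monthly frequency by linear interpolation and then computes month-to-month changes. The quarterly values and the helper name `quarterly_to_monthly` are assumptions made for the example; each quarterly value is treated as observed in a single month, three months apart.

```python
def quarterly_to_monthly(quarterly):
    """Linearly interpolate quarterly observations to monthly frequency.

    Each quarterly value is assumed to be observed in one month, with
    consecutive observations three months apart.
    """
    monthly = []
    for start, end in zip(quarterly, quarterly[1:]):
        for step in range(3):
            monthly.append(start + (end - start) * step / 3)
    monthly.append(quarterly[-1])  # the final observed quarter
    return monthly


# Hypothetical quarterly observations.
quarterly = [100.0, 103.0, 109.0]

monthly = quarterly_to_monthly(quarterly)
changes = [later - earlier for earlier, later in zip(monthly, monthly[1:])]

print(monthly)  # [100.0, 101.0, 102.0, 103.0, 105.0, 107.0, 109.0]
print(changes)  # [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
```

The monthly changes are identical within each quarter-to-quarter interval. That pattern comes from the interpolation method, not from any monthly measurement, so a monthly regression on this series would be fitting variation that was constructed rather than observed.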

Knowledge Check

### What is interpolation?

- [x] Estimating values between observed data points
- [ ] Predicting values beyond the observed range
- [ ] Proving a theorem in game theory
- [ ] Converting nominal values into real values

> **Explanation:** Interpolation fills gaps inside the observed range; extrapolation extends beyond it.

### What assumption does linear interpolation effectively impose between two points?

- [x] The variable changes at a constant rate (a straight line) between the observations
- [ ] The variable follows a random walk
- [ ] The variable is always stationary
- [ ] The variable’s variance is zero

> **Explanation:** Linear interpolation draws a straight line between \((x_0,y_0)\) and \((x_1,y_1)\), implying smooth, constant change in between.

### Why can interpolation be risky in time-series work?

- [x] It can smooth volatility and distort persistence/autocorrelation if treated as real data
- [ ] It always increases sample size without changing anything else
- [ ] It guarantees unbiased causal estimates
- [ ] It removes measurement error by construction

> **Explanation:** Interpolated values are model-based estimates; using them as if they were observed can change dynamics and inference.