Information Criterion

A method for model selection that incorporates the likelihood function and penalizes the complexity of the model. Notable examples are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).

Background

In economics and statistics, model selection is crucial for ensuring that the models accurately reflect the underlying data without overfitting. The Information Criterion (IC) is a method used for this purpose, combining a measure of goodness-of-fit with a penalty for the complexity of the model.

Historical Context

Credit for the development of the popular information criteria is often given to Hirotugu Akaike, who introduced the Akaike Information Criterion (AIC) in 1974. The Bayesian Information Criterion (BIC), also known as the Schwarz criterion, was introduced independently by Gideon Schwarz in 1978, building on prior contributions in Bayesian statistics.

Definitions and Concepts

The Information Criterion balances the trade-off between model complexity and goodness-of-fit:

  • Akaike Information Criterion (AIC) measures the relative quality of statistical models for a given dataset.
  • Bayesian Information Criterion (BIC) is similar to AIC but imposes a more substantial penalty on models with more parameters, promoting simpler models.

The core formulae reveal this balance:

  • \( \text{AIC} = 2k - 2\ln(L) \)
  • \( \text{BIC} = k \ln(n) - 2\ln(L) \)

Where \( k \) is the number of estimated parameters, \( L \) is the maximized value of the likelihood function, and \( n \) is the number of observations.
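Both criteria can be computed directly from a model's maximized log-likelihood. A minimal sketch in Python (the function names and the fitted-model numbers below are illustrative, not taken from any particular library):

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: BIC = k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fitted model: 3 parameters, 100 observations,
# maximized log-likelihood of -50.0. Lower values are better.
print(aic(-50.0, k=3))         # 106.0
print(bic(-50.0, k=3, n=100))  # ≈ 113.82
```

Note that only differences in AIC or BIC between candidate models are meaningful; the absolute values depend on the data and carry no interpretation on their own.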

Major Analytical Frameworks

Classical Economics

In classical economic analysis, model selection usually depends more on theoretical fit than on statistical validation, rendering information criteria less commonly used.

Neoclassical Economics

Neoclassical economics often relies on empirical data. Information criteria aid in selecting models that provide a trade-off between accuracy and simplicity, ensuring robust theoretical constructs.

Keynesian Economics

Macroeconomic models in Keynesian economics may benefit from information criteria to avoid overly complex interpretations, aiding policymakers in decision-making.

Marxian Economics

The focus on critique in Marxian economics diminishes the direct requirement for statistical models, thus reducing the application and necessity for information criteria.

Institutional Economics

Because institutional economics leans toward historical and sociological analysis, it relies less on rigid theoretical models and makes less elaborate use of such criteria, though explanatory models can occasionally employ them.

Behavioral Economics

Behavioral economics often relies on empirical studies, where model selection criteria help capture less predictable human behavior accurately within robust frameworks.

Post-Keynesian Economics

In post-Keynesian analysis, which deals more with structural and long-run economic policies, information criteria help in selecting more accurate models without overfitting to variability.

Austrian Economics

The Austrian School, which emphasizes qualitative methodologies, has seen limited direct application of information criteria, though recent empirical modeling trends are changing this.

Development Economics

Here, model complexity and fit are balanced to avoid systematic bias, making information criteria an important tool in policy development and impact analysis.

Monetarism

Monetarism’s reliance on quantitative models of the money supply and inflation benefits from applying information criteria to prevent overly complicated interpretations.

Comparative Analysis

Comparisons of AIC and BIC typically note BIC's stronger penalty for additional parameters: the BIC penalty \( k \ln(n) \) exceeds the AIC penalty \( 2k \) whenever \( n \geq 8 \), so BIC is more stringent and encourages simpler models where possible.
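This disagreement can be seen numerically. In the sketch below (the candidate models and their log-likelihoods are invented for the example), the extra parameters of the richer model buy a modest likelihood improvement that AIC rewards but BIC does not:

```python
import math

# Two hypothetical candidate models fit to the same n = 200 observations:
# a simple model (2 parameters) and a richer one (5 parameters).
n = 200
models = {
    "simple": {"k": 2, "loglik": -120.0},
    "complex": {"k": 5, "loglik": -116.0},
}

for name, m in models.items():
    aic = 2 * m["k"] - 2 * m["loglik"]
    bic = m["k"] * math.log(n) - 2 * m["loglik"]
    print(f"{name}: AIC = {aic:.2f}, BIC = {bic:.2f}")

# AIC prefers the complex model (242 < 244), while BIC's stronger
# penalty prefers the simple one (about 250.60 < 258.49).
```

With only eight or more observations, the per-parameter BIC penalty already exceeds AIC's, which is why such reversals are common at realistic sample sizes.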

Case Studies

Empirical studies in econometrics often show applications where:

  • AIC and BIC similarly retain the most critical parameters.
  • BIC precludes unnecessary parameters more rigidly, promoting principled economic interpretations.

Suggested Books for Further Studies

  1. “Model Selection and Model Averaging” by Yuhong Yang
  2. “Econometric Theory and Methods” by Russell Davidson and James G. MacKinnon
  3. “Information Criteria and Statistical Modeling” by Sadanori Konishi and Genshiro Kitagawa
Related Terms

  • Likelihood Function: A function that represents the probability of observed data under various parameter assumptions.
  • Goodness of Fit: Measures the degree of fit between model predictions and actual data observations.
  • Overfitting: Creating a model that captures noise rather than the underlying process, often alleviated by using information criteria.

Quiz

### Which of the following criteria is known for applying a stronger penalty for more parameters?

- [ ] Akaike Information Criterion (AIC)
- [x] Bayesian Information Criterion (BIC)
- [ ] R-squared
- [ ] Mean Absolute Error (MAE)

> **Explanation:** The BIC imposes a stronger penalty for the number of parameters compared to the AIC.

### What is a common purpose of both AIC and BIC?

- [x] Model selection
- [ ] Data collection
- [ ] Hypothesis testing
- [ ] Linear regression

> **Explanation:** Both AIC and BIC are used for model selection in statistical modeling.

### True or False: AIC always chooses simpler models than BIC.

- [ ] True
- [x] False

> **Explanation:** BIC typically favors simpler models than AIC, as it imposes a higher penalty for additional parameters.

### What aspect is shared by both AIC and BIC?

- [ ] They do not consider model complexity.
- [x] They include a goodness-of-fit term.
- [ ] They are calculated without using the likelihood function.
- [ ] They exclude the parameter count.

> **Explanation:** Both criteria include a goodness-of-fit term but differ primarily in how they penalize the number of parameters.

### Which of the following best explains the concept of 'parsimony' in model selection?

- [ ] Use of the most data-intensive model
- [ ] Employing the most computational resources
- [x] Achieving simplicity in the model
- [ ] Avoidance of model validation

> **Explanation:** Parsimony refers to choosing simpler models that explain the data well, avoiding unnecessary complexity.

### Which of the following is not considered a component of information criteria?

- [x] Sample variance
- [ ] Likelihood of the model
- [ ] Penalty for additional parameters
- [ ] Goodness-of-fit measure

> **Explanation:** Sample variance is not a component of information criteria; they include a likelihood term and a penalty.

### Who developed the Akaike Information Criterion (AIC)?

- [ ] Gideon E. Schwarz
- [x] Hirotugu Akaike
- [ ] Karl Pearson
- [ ] Ronald Fisher

> **Explanation:** The AIC was developed by Hirotugu Akaike, who introduced it in 1974.

### What is the definition of 'goodness-of-fit' in the context of information criteria?

- [x] A measure of how well the model fits the data
- [ ] A measure of model complexity
- [ ] A measure of sample size
- [ ] A penalty for model parameters

> **Explanation:** Goodness-of-fit describes how well a statistical model explains or fits the observed data.

### What year was the BIC introduced?

- [ ] 1973
- [ ] 1974
- [ ] 1975
- [x] 1978

> **Explanation:** The Bayesian Information Criterion (BIC), also known as the Schwarz criterion, was introduced by Gideon E. Schwarz in 1978.

### True or False: Information criteria should be the sole basis for model selection.

- [ ] True
- [x] False

> **Explanation:** Information criteria are important, but additional considerations like theoretical coherence, interpretability, and domain knowledge should also guide model selection.