Statistical Learning Theory: VC Dimension, Structural Risk Minimization

Sometimes our models overfit, sometimes they underfit.

A model’s capacity is, informally, its ability to fit a wide variety of functions. As a simple example, a linear regression model with a single parameter has a much lower capacity than a linear regression model with many polynomial terms. Different datasets demand models of different capacity, and each time we apply a model to a dataset we run the risk of overfitting or underfitting our data.
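To make the idea of capacity a bit more concrete, here is a minimal sketch that fits polynomials of increasing degree to the same noisy data and compares their training error. The dataset (a noisy sine curve), the random seed, and the chosen degrees are all assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical toy data: a noisy sine curve sampled at 20 points.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)


def training_mse(degree):
    """Least-squares polynomial fit of the given degree; returns training MSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.mean(residuals ** 2))


# Higher degree means more coefficients, i.e. higher capacity.
mse_by_degree = {d: training_mse(d) for d in (1, 3, 9)}
for degree, mse in mse_by_degree.items():
    print(f"degree {degree}: training MSE = {mse:.4f}")
```

Training error can only go down as the degree grows, because each lower-degree model is a special case of the higher-degree one. That is exactly why training error alone cannot tell us whether the extra capacity is helping or simply fitting noise.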


The Box-Cox Transformation

The Box-Cox transformation is a family of power transform functions that are used to stabilize variance and make a dataset look more like a normal distribution. Lots of useful tools require normal-like data in order to be effective, so by using the Box-Cox transformation on your wonky-looking dataset you can then utilize some of these tools.

Here’s the transformation in its basic form. For a positive value x and parameter \lambda:

\displaystyle \frac{x^{\lambda}-1}{\lambda} \quad \text{if} \quad \lambda\neq 0

\displaystyle \log(x) \quad \text{if} \quad \lambda=0
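As a rough sketch, the transformation can be written in a few lines of NumPy. The function name `box_cox` and the sample array `skewed` are hypothetical, chosen only for this example. The sketch assumes strictly positive inputs and switches between the power form and the logarithm depending on whether \lambda is zero.

```python
import numpy as np


def box_cox(x, lam):
    """Apply the basic Box-Cox transform to strictly positive values."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The logarithm is the limit of the power form as lam -> 0.
        return np.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical right-skewed sample.
skewed = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

print(box_cox(skewed, 1.0))  # lam = 1 only shifts the data by 1
print(box_cox(skewed, 0.5))  # square-root-like compression
print(box_cox(skewed, 0.0))  # natural log
```

In practice \lambda is usually estimated from the data rather than picked by hand. SciPy's `scipy.stats.boxcox`, for instance, can choose it by maximum likelihood when no value is supplied.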
