Introduction to Regularization

What is regularization? Regularization, as the term is commonly used in machine learning, is an attempt to correct for model overfitting by adding a penalty term to the cost function. In this post we will review the logic and implementation of regularization and discuss a few of its most widespread forms: ridge, lasso, and elastic net. For simplicity, we’ll discuss regularization within the context of least squares linear regression, and I assume that you have some familiarity with linear regression. Onward!
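As a rough illustration of what modifying the cost function looks like, here is a minimal sketch of ridge regression with NumPy. It assumes the standard ridge cost, squared error plus `lam` times the squared L2 norm of the weights, and solves it in closed form. The function name `ridge_fit`, the parameter name `lam`, and the synthetic data are made up for this example, and the intercept is left out to keep things short.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||^2 in closed form (no intercept)."""
    n_features = X.shape[1]
    # Normal equations with the ridge penalty added to the diagonal.
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_w = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 is ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # lam > 0 shrinks the weights toward zero

print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

Setting `lam` to zero recovers ordinary least squares, and increasing it pulls the weight vector toward zero. Lasso and elastic net swap in different penalties and generally have no closed-form solution like this one.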


Short Introduction to PCA

In Principal Component Analysis (PCA), we would like to project our high-dimensional dataset onto a lower-dimensional space while keeping as much information as possible. Typically, this is done to avoid the curse of dimensionality or for the purposes of data visualization.

In broad strokes, PCA reduces the dimensionality of our dataset in a way that minimizes (certain aspects of) the amount of information we throw away by projecting our p-dimensional feature set onto a lower-dimensional subspace.
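To make the projection concrete, here is a minimal sketch of PCA via the singular value decomposition in NumPy. It assumes that "information" is measured as variance, which is the usual PCA criterion. The function name `pca_project` and the synthetic data are invented for this example.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    # PCA works on centered data: subtract each feature's mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]
    # Fraction of the total variance retained by the k kept directions.
    explained = (S ** 2)[:k].sum() / (S ** 2).sum()
    return X_centered @ components.T, components, explained

# Synthetic 3-dimensional data whose features have very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])

Z, components, explained = pca_project(X, k=2)
print(Z.shape, explained)
```

Here `Z` holds the 2-dimensional coordinates of each sample, and `explained` reports how much of the original variance survives the projection.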