This tutorial does not shy away from explaining the ideas informally, nor from the mathematics; it addresses both.

A toy example

Take for example a simple toy problem from physics diagrammed in Figure 1. Pretend we are studying the motion of the physicist’s ideal spring. This system consists of a ball of mass m attached to a massless, frictionless spring. The ball is released a small distance away from equilibrium (i.e. the spring is stretched). Because the spring is “ideal,” it oscillates indefinitely along the x-axis about its equilibrium at a set frequency.
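To make the toy system concrete, here is a minimal sketch of the ball's motion in Python. It assumes the standard textbook solution for an ideal spring released at rest, x(t) = A cos(ωt) with ω = sqrt(k/m). The spring constant k (called stiffness below), the amplitude A, the unit values, and the function name are all assumptions introduced for illustration; none of them appear in the text.

```python
import math

def spring_position(t, amplitude=1.0, mass=1.0, stiffness=1.0):
    """Position x(t) of an ideal spring released at rest from x = amplitude.

    The stiffness (spring constant) and the default values are
    illustrative assumptions, not quantities given in the tutorial.
    """
    omega = math.sqrt(stiffness / mass)  # angular frequency set by k and m
    return amplitude * math.cos(omega * t)

# Sample one full period at nine evenly spaced times.
# With stiffness = mass = 1, omega = 1 and the period is 2*pi.
period = 2 * math.pi
samples = [spring_position(period * i / 8) for i in range(9)]
```

The samples start at the release point, pass through equilibrium, reach the opposite extreme at half a period, and return. All of the motion lies along a single axis, which is the point of the example.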

We know a priori that if we were smart experimenters, we would have just measured the position along the x-axis with one camera. But this is not what happens in the real world. We often do not know what measurements best reflect the dynamics of the system in question. Furthermore, we sometimes record more dimensions than we actually need!

Framework: Change of Basis

PCA computes the most meaningful basis to re-express a noisy, garbled dataset.

PCA makes one stringent but powerful assumption: linearity. Under this assumption, PCA is limited to re-expressing the data as a linear combination of its basis vectors.
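A linear re-expression of this kind can be sketched as a matrix product, written here as Y = PX, where the rows of P are the new basis vectors. The notation, the helper change_basis, the data, and the hand-picked basis are hypothetical and chosen only for illustration. PCA would derive the basis from the data; this sketch only shows what the change of basis does once a basis is given.

```python
import math

def change_basis(basis_rows, data_vectors):
    """Re-express each data vector by projecting it onto every basis row.

    This is the product Y = P X computed one data vector at a time,
    where the rows of P are the new basis vectors.
    """
    return [
        [sum(p * x for p, x in zip(row, vector)) for row in basis_rows]
        for vector in data_vectors
    ]

# Hypothetical recording: a one-dimensional oscillation that a camera
# happens to see along the diagonal of its two pixel axes.
data = [(a, a) for a in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Hand-picked orthonormal basis: first row along the diagonal,
# second row perpendicular to it.
s = 1 / math.sqrt(2)
P = [[s, s], [-s, s]]

new_coords = change_basis(P, data)
```

In the new basis the second coordinate of every point is zero, so the two recorded dimensions collapse to one meaningful dimension.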