Mathematical Background
Topological and geometric data analysis build on ideas and results from algebraic topology. The main notions are simplicial complexes, homology, ...
Data is usually given as a point cloud in [math] \mathbb{R}^n [/math]. To apply these topological notions, the data must first be translated into a topological space. This step usually requires a notion of distance, i.e. a metric on [math] \mathbb{R}^n [/math] that is tailored to the system from which the data arises. This is where geometry enters: metric spaces, ...
The distance between points in [math]\mathbb{R}^n[/math] is usually not inherent to the data. Moreover, there is no distinguished distance threshold at which to build the topological space, so one should consider all thresholds at once. The algebraic theory then relates the spaces obtained at different values of this parameter.
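To make the dependence on the parameter concrete, here is a sketch under one common assumption that the text does not fix: the space at scale [math]r[/math] is the Vietoris-Rips complex [math]\mathrm{VR}_r(X)[/math] of the point cloud [math]X[/math], whose simplices are the finite subsets of [math]X[/math] with all pairwise distances at most [math]r[/math]. Growing the scale only adds simplices, and the inclusions induce linear maps on homology (coefficients in a field [math]\mathbb{F}[/math] are also an assumption):
[math] r \le s \;\Longrightarrow\; \mathrm{VR}_r(X) \subseteq \mathrm{VR}_s(X) \;\Longrightarrow\; H_k(\mathrm{VR}_r(X);\mathbb{F}) \longrightarrow H_k(\mathrm{VR}_s(X);\mathbb{F}) [/math]
These maps are the relations between different parameter values that the algebraic theory keeps track of.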
Therefore, applications of topological ideas to data analysis can almost always be divided into the following two steps:
[math] \text{Data} \stackrel{\text{geometry}}{\longrightarrow} \text{simplicial complex} \stackrel{\text{algebraic topology}}{\longrightarrow} \text{invariants inherent to the data set} [/math]
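The two steps can be illustrated with a minimal Python sketch. It assumes a particular construction that the text does not prescribe: the Euclidean metric, the Vietoris-Rips complex at a fixed scale r for the geometric step, and the number of connected components (the zeroth Betti number) as the simplest invariant for the algebraic step. The names rips_complex and betti_0 are hypothetical helpers written for this illustration.

```python
from itertools import combinations
import math


def rips_complex(points, r, max_dim=2):
    """Geometric step: Vietoris-Rips complex at scale r.

    A set of vertices spans a simplex when all pairwise
    Euclidean distances are at most r. Simplices are returned
    as tuples of vertex indices, up to dimension max_dim.
    """
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= r

    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k = number of vertices
        for s in combinations(range(n), k):
            if all(close(i, j) for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices


def betti_0(n_vertices, simplices):
    """Algebraic step: count connected components with union-find."""
    parent = list(range(n_vertices))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for s in simplices:
        if len(s) == 2:  # only edges affect connectivity
            parent[find(s[0])] = find(s[1])
    return len({find(i) for i in range(n_vertices)})


# Three nearby points and one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
for r in (0.5, 1.0, 1.5, 8.0):
    cx = rips_complex(points, r)
    print(r, len(cx), betti_0(len(points), cx))
```

Running the loop over several scales shows why no single value of r is distinguished: the component count drops as r grows, and it is the whole sequence, not one entry, that describes the data.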
Fundamental Definitions
Fundamental Theorems
There are three fundamental results that make the classical theory of persistent homology work:
Nerve Theorem
Let [math]X[/math] be a paracompact space and [math]\mathcal U=\{U_i\}[/math] a good open cover, that is, an open cover in which every nonempty intersection of finitely many sets in [math]\mathcal U[/math] is contractible.
Then the nerve [math]N\mathcal U[/math] is homotopy equivalent to [math]X[/math].
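As a small illustration of what the nerve is, the following Python sketch computes it for a cover given by finite sets. This is a discretized stand-in and not the setting of the theorem itself: the circle is replaced by 12 sample points, the three arcs are modelled as sets of sample indices, and the function name nerve is a hypothetical helper. For the corresponding three open arcs on an actual circle, all nonempty intersections are arcs and hence contractible, so the cover is good.

```python
from itertools import combinations


def nerve(cover):
    """Return the nerve of a finite cover.

    cover is a list of sets. An index tuple is a simplex of the
    nerve exactly when the corresponding sets share a point.
    """
    simplices = []
    for k in range(1, len(cover) + 1):
        for idx in combinations(range(len(cover)), k):
            if set.intersection(*(cover[i] for i in idx)):
                simplices.append(idx)
    return simplices


# Circle sampled at points 0..11, covered by three overlapping arcs.
cover = [set(range(0, 5)), set(range(4, 9)), {8, 9, 10, 11, 0}]
print(nerve(cover))
```

The result has three vertices and three edges but no 2-simplex, since the triple intersection is empty. This is the boundary of a triangle, which is homotopy equivalent to the circle, as the theorem predicts.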
Structure Theorem
Main article: Structure Theorem
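Over a field, the barcode is a complete invariant of the persistence module obtained from a 1-parameter filtration of a finite simplicial complex: two such persistence modules are isomorphic if and only if their barcodes agree.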
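Spelled out as a formula, one standard form of the statement reads as follows. The notation [math]K_\bullet[/math] for the filtration, the coefficient field [math]\mathbb{F}[/math] and the interval modules [math]I[b_j, d_j)[/math] are assumptions introduced here for illustration, and the complex is assumed to be finite:
[math] H_k(K_\bullet;\mathbb{F}) \;\cong\; \bigoplus_{j} I[b_j, d_j) [/math]
Here [math]I[b, d)[/math] equals [math]\mathbb{F}[/math] for parameter values in [math][b, d)[/math] and [math]0[/math] elsewhere, with identity maps inside the interval; [math]d = \infty[/math] is allowed. The multiset of intervals [math][b_j, d_j)[/math] is the barcode, and it is unique up to reordering.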
Stability Theorem
Small variations in the data or the metric result in only small variations of the barcode.
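One standard way to quantify this, given here as a sketch of a common formulation rather than the only one: let [math]f, g\colon X \to \mathbb{R}[/math] be two tame functions on a triangulable space, let [math]\mathrm{Dgm}(f)[/math] and [math]\mathrm{Dgm}(g)[/math] be the persistence diagrams of their sublevel-set filtrations, and let [math]d_B[/math] denote the bottleneck distance (all of this notation is assumed, not taken from the text above). Then
[math] d_B\big(\mathrm{Dgm}(f), \mathrm{Dgm}(g)\big) \;\le\; \|f - g\|_\infty [/math]
So a perturbation of size [math]\varepsilon[/math] moves the endpoints of each bar by at most [math]\varepsilon[/math], apart from bars of length at most [math]2\varepsilon[/math], which may appear or disappear.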