Mathematical Background

From STRUCTURES Wiki

Topological and geometric data analysis are built on ideas and results from algebraic topology. The main notions are simplicial complexes, homology, ...

Data is typically given as a point cloud in [math] \mathbb{R}^n [/math]. To apply these topological notions, the data must first be translated into a topological space. This step usually requires a notion of distance, i.e. a metric on [math] \mathbb{R}^n [/math] that is tailored to the system from which the data arises. This is where geometry enters: metric spaces, ...

The distance between points in [math]\mathbb{R}^n[/math] is usually not inherent to the data. Moreover, no single distance threshold is distinguished, so one should consider all values of this scale parameter. The algebraic theory then relates the topological features found at different values of the parameter.

Therefore, applications of topological ideas to data analysis can almost always be divided into the following two steps:

[math] \text{Data} \stackrel{\text{geometry}}{\longrightarrow} \text{simplicial complex} \stackrel{\text{algebraic topology}}{\longrightarrow} \text{invariants inherent to the data set} [/math]
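The first arrow can be illustrated with a minimal Python sketch. It assumes the Euclidean metric and the Vietoris-Rips construction, which is only one of several ways to turn a point cloud into a simplicial complex; the function name `vietoris_rips` and the example points are hypothetical, and the brute-force enumeration is only suitable for very small inputs.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, scale, max_dim=2):
    """Return the simplices (tuples of point indices) of the
    Vietoris-Rips complex of `points` at the given distance scale."""
    n = len(points)
    # Two points span an edge if their Euclidean distance is at most `scale`.
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= scale
    }
    simplices = [(i,) for i in range(n)]
    # A set of k points spans a simplex if all of its pairs are close.
    for k in range(2, max_dim + 2):
        for candidate in combinations(range(n), k):
            if all(pair in close for pair in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices


# Corners of the unit square (illustrative data only).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
small = vietoris_rips(square, scale=1.1)  # sides only: a hollow square
large = vietoris_rips(square, scale=1.5)  # diagonals included: filled in
```

At the smaller scale the complex is a cycle of four edges, while at the larger scale the diagonals and all triangles appear. This is a small instance of why no single scale is distinguished.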

Fundamental Definitions

Fundamental Theorems

There are three fundamental results that make the classical theory of persistent homology work:

Nerve Theorem

Let <math>X</math> be a paracompact space and <math>\mathcal U=\{U_i\}</math> a good open cover, that is, an open cover in which every nonempty intersection of finitely many sets in <math>\mathcal U</math> is contractible.

Then the nerve <math>N\mathcal U</math> is homotopy equivalent to <math>X</math>.
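As a small illustration, the nerve of a finite cover can be computed combinatorially. The sketch below is an assumption-laden toy: finite sets of sample indices stand in for open sets, the function name `nerve` is hypothetical, and the code does not check that the cover is good. The example covers a circle, sampled at 12 points, by three overlapping arcs.

```python
from itertools import combinations


def nerve(cover, max_dim=2):
    """Return the simplices of the nerve of `cover` up to `max_dim`.

    `cover` is a list of sets. The index tuple (i0, ..., ik) is a
    simplex exactly when the corresponding sets have a common element.
    """
    simplices = []
    for k in range(1, max_dim + 2):
        for idx in combinations(range(len(cover)), k):
            if set.intersection(*(cover[i] for i in idx)):
                simplices.append(idx)
    return simplices


# Three arcs on a circle sampled at the points 0, ..., 11 (illustrative).
arcs = [set(range(0, 6)), set(range(4, 10)), {8, 9, 10, 11, 0, 1}]
circle_nerve = nerve(arcs)

# Euler characteristic: alternating count of simplices by dimension.
euler_characteristic = sum((-1) ** (len(s) - 1) for s in circle_nerve)
```

The arcs overlap pairwise but have no common point, so the nerve is the boundary of a triangle. Its Euler characteristic is 0, the same as that of the circle, as one expects from the homotopy equivalence.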

Structure Theorem

Main article: Structure Theorem

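For a 1-parameter filtration of a finite simplicial complex, the persistent homology with coefficients in a field decomposes, uniquely up to reordering, into a direct sum of interval modules. The resulting barcode is therefore a complete invariant of the persistence module up to isomorphism.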
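To make the barcode concrete, the following sketch computes it for a small filtered complex using the standard column reduction of the boundary matrix over the field with two elements. It is a minimal illustration, not a reference implementation: the function name `persistence_barcode` and the example filtration are hypothetical, and the input is assumed to list every face of every simplex with a filtration value no larger than that of the simplex.

```python
from itertools import combinations
from math import inf


def persistence_barcode(filtration):
    """Return sorted bars (dimension, birth, death) of a filtered complex.

    `filtration` is a list of (simplex, value) pairs, where each simplex
    is a sorted tuple of vertex ids. Coefficients are taken in Z/2.
    """
    # Order by value, then dimension, so that faces precede cofaces.
    ordered = sorted(filtration, key=lambda sv: (sv[1], len(sv[0]), sv[0]))
    index = {simplex: i for i, (simplex, _) in enumerate(ordered)}

    # Column j holds the indices of the codimension-1 faces of simplex j.
    columns = [
        set() if len(s) == 1
        else {index[f] for f in combinations(s, len(s) - 1)}
        for s, _ in ordered
    ]

    pivot_owner = {}  # lowest nonzero row -> column owning that pivot
    pairs = []        # (birth index, death index)
    for j, col in enumerate(columns):
        # Add earlier columns (mod 2) until the pivot is new or col is zero.
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]
        if col:
            pivot_owner[max(col)] = j
            pairs.append((max(col), j))

    bars = []
    paired = set()
    for i, j in pairs:
        paired.update((i, j))
        birth, death = ordered[i][1], ordered[j][1]
        if birth < death:  # drop bars of length zero
            bars.append((len(ordered[i][0]) - 1, birth, death))
    # Unpaired simplices give classes that never die.
    for i, (simplex, value) in enumerate(ordered):
        if i not in paired:
            bars.append((len(simplex) - 1, value, inf))
    return sorted(bars)


# A triangle built up edge by edge and filled in last (illustrative).
triangle = [
    ((0,), 0.0), ((1,), 0.0), ((2,), 0.0),
    ((0, 1), 1.0), ((1, 2), 2.0), ((0, 2), 3.0),
    ((0, 1, 2), 4.0),
]
bars = persistence_barcode(triangle)
```

For this filtration the result consists of three bars in degree 0, namely [0, 1), [0, 2) and [0, ∞), and one bar [3, 4) in degree 1 for the loop that appears with the last edge and is filled by the triangle.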

Stability Theorem

Small perturbations of the data or of the metric result in small changes of the barcode, measured for example in the bottleneck distance.
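One common precise form of this statement, given here as an illustration and not necessarily the version intended on this page, concerns the sublevel set filtrations of two tame functions <math>f, g\colon X \to \mathbb{R}</math> on a triangulable space <math>X</math>:

<math>d_B\bigl(\mathrm{Dgm}(f), \mathrm{Dgm}(g)\bigr) \le \|f - g\|_\infty</math>

Here <math>d_B</math> denotes the bottleneck distance between the persistence diagrams (barcodes) <math>\mathrm{Dgm}(f)</math> and <math>\mathrm{Dgm}(g)</math>, and <math>\|\cdot\|_\infty</math> is the supremum norm. The notation is an assumption of this example and does not appear elsewhere on the page.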