from IPython.display import Image
Image("SlidesGudhi/GeneralPipeLine_Boot.png")
In this third part of the tutorial we introduce bootstrap procedures for persistent homology. We start with confidence regions for the persistent homology of filtrations of simplicial complexes defined directly on point clouds.
import pickle
import numpy as np
import pandas as pd
import gudhi as gd
from pylab import *
import seaborn as sns
from mpl_toolkits.mplot3d import Axes3D
from IPython.display import Image
from sklearn.model_selection import ShuffleSplit
# KernelDensity is imported from sklearn.neighbors; the sklearn.neighbors.kde path was removed in recent scikit-learn releases
from sklearn.neighbors import KDTree, KernelDensity
import ipyparallel as ipp
%matplotlib inline
We will need additional functionality for plotting confidence regions for persistent homology (coming in the next releases of Gudhi).
Download the Python file persistence_graphical_tools_Bertrand.py and save it in your working directory (or somewhere on your Python path).
from persistence_graphical_tools_Bertrand import *
We illustrate the bootstrap procedure on the crater dataset with a filtration of alpha complexes.
# Load the pickled crater point cloud; the with-block closes the file automatically
with open("crater_tuto", "rb") as f:
    crater = pickle.load(f)
sns.kdeplot(crater, shade = True, cmap = "PuBu",bw=.3)
We define a filtration of alpha complexes (this takes a few seconds).
Alpha_complex_crater = gd.AlphaComplex(points = crater)
Alpha_simplex_tree_crater = Alpha_complex_crater.create_simplex_tree(max_alpha_square=2)
diag_crater = Alpha_simplex_tree_crater.persistence()
In many applications of persistent homology, we observe many topological features close to the diagonal.
Since they correspond to topological structures that die very soon after they appear in the filtration, these points are generally considered noise. We will see that confidence regions for persistence diagrams provide a rigorous framework for this idea.
Representing all the topological features in the diagram is not useful, since most of them have very short persistence. Moreover, plotting all the points takes too much time. We want to select only the most persistent features of the filtration.
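Selecting the most persistent features can be sketched as follows. This is a minimal illustration, not part of Gudhi: the helper name `most_persistent` and the toy diagram are hypothetical, and the diagram is assumed to be a list of `(dimension, (birth, death))` pairs, which is the format returned by `persistence()` on a simplex tree.

```python
def most_persistent(diag, n_keep):
    """Return the n_keep intervals with the largest lifetime (death - birth).

    diag is assumed to be a list of (dimension, (birth, death)) pairs.
    Intervals with an infinite death value have infinite lifetime and
    therefore come first.
    """
    return sorted(diag, key=lambda p: p[1][1] - p[1][0], reverse=True)[:n_keep]


# Toy diagram with made-up values, for illustration only
toy_diag = [
    (0, (0.0, float("inf"))),
    (1, (0.2, 1.5)),
    (0, (0.0, 0.05)),
    (1, (0.4, 0.45)),
    (0, (0.0, 0.3)),
]

top = most_persistent(toy_diag, 3)
print(top)
```

On the crater data, the same helper could be applied to `diag_crater` before plotting, so that only a few hundred intervals are drawn instead of the whole diagram.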
Confidence regions for persistence diagrams provide a rigorous framework for selecting the significant topological features in a persistence diagram.
We use the bottleneck distance $d_b$ to define confidence regions.
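To make the link between the bottleneck distance and feature selection concrete, here is a minimal sketch. It assumes that a radius `c` is already available such that, with the chosen confidence level, the true diagram lies within bottleneck distance `c` of the estimated one. Because the bottleneck distance is based on the sup norm, matching a point $(b, d)$ to the diagonal costs $(d - b)/2$, so a point with $(d - b)/2 > c$ cannot be matched to the diagonal and is treated as significant. The function name `split_by_confidence_band`, the value of `c`, and the toy diagram are hypothetical.

```python
def split_by_confidence_band(diag, c):
    """Split a diagram into (signal, noise) given a bottleneck radius c.

    diag is assumed to be a list of (dimension, (birth, death)) pairs.
    A point is kept as signal when its sup-norm distance to the diagonal,
    (death - birth) / 2, is strictly larger than c.
    """
    signal, noise = [], []
    for dim, (birth, death) in diag:
        if (death - birth) / 2 > c:
            signal.append((dim, (birth, death)))
        else:
            noise.append((dim, (birth, death)))
    return signal, noise


# Toy diagram and an arbitrary radius, for illustration only
toy_diag = [
    (0, (0.0, float("inf"))),
    (1, (0.2, 1.5)),
    (0, (0.0, 0.05)),
    (0, (0.0, 0.3)),
]
signal, noise = split_by_confidence_band(toy_diag, c=0.1)
print("signal:", signal)
print("noise:", noise)
```

In practice `c` would come from the bootstrap procedure rather than being fixed by hand; the value used here only serves to show how the band separates points.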