Clustering of Big Data

With Riccardo Cristoferi, Heriot-Watt University

Clustering of Big Data: consistency of a nonlocal Ginzburg-Landau type model

The analysis of Big Data is one of the most important challenges of the modern era. A first step in extracting information from a data set is to partition it according to some notion of similarity. When that notion is defined using only geometric features and no a priori knowledge of the data is available, the task is known as the clustering problem.

Typically this labelling task is carried out via a minimization procedure. A key criterion for evaluating a clustering method is whether it is consistent; namely, it is desirable that the discrete minimization problem approaches a limiting minimization problem as the number of elements in the data set goes to infinity.

In this talk the consistency of a nonlocal anisotropic Ginzburg-Landau type functional for clustering is presented. In particular, it is proved that the discrete model converges, in the sense of Gamma-convergence, to a weighted anisotropic perimeter.
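The abstract does not write out the functional. As a rough sketch only, a nonlocal Ginzburg-Landau type energy on a point cloud and its expected limit can take the following schematic form. All notation here is assumed for illustration and is not taken from the talk: the data points $x_1,\dots,x_n$ are samples from a density $\rho$ on a domain $\Omega \subset \mathbb{R}^d$, $u$ is the labelling function, $\eta$ is an interaction kernel (possibly anisotropic), $V$ is a double-well potential vanishing exactly at the two labels $\pm 1$, $p \ge 1$ is an exponent, and $\varepsilon_n \to 0$ is the interaction length scale.

```latex
% Schematic discrete energy (assumed notation, not the speaker's exact model)
\[
  G_n(u) \;=\; \frac{1}{\varepsilon_n n^2} \sum_{i,j=1}^{n}
      \eta_{\varepsilon_n}(x_i - x_j)\,\bigl|u(x_i) - u(x_j)\bigr|^{p}
  \;+\; \frac{1}{\varepsilon_n n} \sum_{i=1}^{n} V\bigl(u(x_i)\bigr),
  \qquad
  \eta_{\varepsilon}(x) \;=\; \frac{1}{\varepsilon^{d}}\,\eta\!\left(\frac{x}{\varepsilon}\right).
\]

% Schematic limit: a weighted anisotropic perimeter of the set {u = 1}
\[
  G_\infty(u) \;=\; \int_{\partial^{*}\{u = 1\}} \sigma(\nu_u)\,\rho^{2}\,
      \mathrm{d}\mathcal{H}^{d-1},
  \qquad u \in BV(\Omega; \{\pm 1\}).
\]
```

In this sketch $\nu_u$ denotes the normal to the interface between the two clusters and $\sigma$ is a surface tension determined by $\eta$ and $V$; its dependence on $\nu_u$ is what makes the perimeter anisotropic, and the factor $\rho^{2}$ is the weight, so cuts through regions of low data density cost less.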

The talk is based on joint work with Matthew Thorpe (Cambridge University).
