Statistics Seminar

With Nicolas Verzelen (INRA)

Variable clustering: optimal bounds and a convex approach

The problem of variable clustering is that of grouping similar components of a p-dimensional vector X = (X_1, …, X_p) into a partition G, and estimating this partition from n independent copies of X. Although K-means is a natural strategy for this problem, I will explain why it cannot lead to perfect cluster recovery. Then, I will introduce a correction that can be viewed as a penalized convex relaxation of K-means. The clusters estimated by this method are shown to recover the partition G at a minimax optimal cluster separation rate.

Speaker: Nicolas Verzelen (INRA)
Friday 09 June 2017, 15:00–16:00
Venue: MR12, Centre for Mathematical Sciences, Wilberforce Road, Cambridge.
Series: Statistics; organiser: Quentin Berthet.