A K-Means++ Clustering Implementation for VTK

K-Means clustering is an excellent technique for clustering points when the number of clusters is known. We present an implementation (vtkKMeanClustering) of the algorithm written in a VTK context. We also implement the K-Means++ initialization method, which finds the global optimum much more frequently than a naive/random initialization.
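
To illustrate the K-Means++ initialization mentioned above, here is a minimal sketch in plain C++ without VTK types. It is not taken from vtkKMeanClustering: the names `Point`, `SquaredDistance`, and `KMeansPlusPlusInit` are hypothetical and chosen only for this example. The first center is drawn uniformly at random, and each further center is drawn with probability proportional to its squared distance from the nearest center chosen so far.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical point type for this sketch; the actual filter works on VTK data.
using Point = std::array<double, 3>;

// Squared Euclidean distance between two 3D points.
double SquaredDistance(const Point& a, const Point& b)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < 3; ++i)
  {
    const double d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Hypothetical helper: pick k initial centers with K-Means++ seeding.
std::vector<Point> KMeansPlusPlusInit(const std::vector<Point>& points,
                                      std::size_t k, std::mt19937& rng)
{
  assert(k >= 1 && k <= points.size());

  std::vector<Point> centers;
  centers.reserve(k);

  // The first center is chosen uniformly at random.
  std::uniform_int_distribution<std::size_t> uniform(0, points.size() - 1);
  centers.push_back(points[uniform(rng)]);

  // minDist2[i] holds the squared distance from points[i] to its nearest center.
  std::vector<double> minDist2(points.size(),
                               std::numeric_limits<double>::max());

  while (centers.size() < k)
  {
    // Only the most recently added center can lower a point's distance.
    double total = 0.0;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
      const double d2 = SquaredDistance(points[i], centers.back());
      if (d2 < minDist2[i])
      {
        minDist2[i] = d2;
      }
      total += minDist2[i];
    }

    if (total == 0.0)
    {
      // Every point coincides with a chosen center; fall back to uniform.
      centers.push_back(points[uniform(rng)]);
      continue;
    }

    // Sample the next center with probability proportional to minDist2.
    std::discrete_distribution<std::size_t> weighted(minDist2.begin(),
                                                     minDist2.end());
    centers.push_back(points[weighted(rng)]);
  }
  return centers;
}
```

Points that already coincide with a chosen center get weight zero, so they cannot be picked again unless every point does. This spreads the initial centers out, which is the property K-Means++ relies on.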

The code is currently hosted at http://github.com/daviddoria/KMeansClustering .

Excellent Contribution! by Arnaud Gelas on 2010-09-28 13:39:32 for revision #1
Rating: 5 of 5 stars; expertise: 5; sensitivity: 5

A Must Have!!!

Free comment:
Once again: excellent work!

I have to admit that I have not read the paper yet, but I had a quick look at the code.

Using a kd-tree to find closest points will significantly speed up the implementation whenever the number of clusters is quite large.

I would also recommend that you abstract the metric and the way the centroid is computed, so that someone who wants to use L_1 or another metric (for example, one that makes use of normals) won't need to duplicate too much code.
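
One possible reading of this suggestion, sketched in plain C++ rather than against the actual vtkKMeanClustering code: the metric and the centroid rule become small function objects passed in as template parameters. All names here (`SquaredEuclidean`, `Manhattan`, `MeanCentroid`, `ClosestCenter`) are hypothetical and not part of the submitted implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch.
using Point = std::array<double, 3>;

// Squared L_2 distance, the usual K-Means choice.
struct SquaredEuclidean
{
  double operator()(const Point& a, const Point& b) const
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < 3; ++i)
    {
      sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return sum;
  }
};

// L_1 (Manhattan) distance as an alternative metric.
struct Manhattan
{
  double operator()(const Point& a, const Point& b) const
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < 3; ++i)
    {
      sum += std::abs(a[i] - b[i]);
    }
    return sum;
  }
};

// Centroid rule: arithmetic mean of the cluster members.
struct MeanCentroid
{
  Point operator()(const std::vector<Point>& members) const
  {
    assert(!members.empty());
    Point c = {0.0, 0.0, 0.0};
    for (const Point& p : members)
    {
      for (std::size_t i = 0; i < 3; ++i)
      {
        c[i] += p[i];
      }
    }
    for (std::size_t i = 0; i < 3; ++i)
    {
      c[i] /= static_cast<double>(members.size());
    }
    return c;
  }
};

// Index of the center closest to p under the supplied metric (linear scan).
template <typename Metric>
std::size_t ClosestCenter(const Point& p, const std::vector<Point>& centers,
                          Metric metric)
{
  assert(!centers.empty());
  std::size_t best = 0;
  double bestDist = metric(p, centers[0]);
  for (std::size_t i = 1; i < centers.size(); ++i)
  {
    const double d = metric(p, centers[i]);
    if (d < bestDist)
    {
      bestDist = d;
      best = i;
    }
  }
  return best;
}
```

With this layout, switching metrics means passing a different function object to `ClosestCenter`, and the assignment loop itself stays unchanged. A kd-tree lookup could later replace the linear scan for the Euclidean case without affecting callers.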





Categories: Iterative clustering, PointSet
Keywords: clustering, Kmeans, kmeans++
