Monday, November 15, 2010

KMeans cluster testing - part the second

Now we come to the ulterior motive for this project: Locality-Sensitive Hashing. Here, that means creating a set of regularly spaced "corners" in an N-dimensional space and mapping every point to one of those corners. Think of it as slicing N-dimensional space into boxes; N-dimensional cubes are the easiest to imagine.
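To make the "corners" idea concrete, here is a minimal sketch of that kind of grid hashing in Python. It is not the code used for these experiments: the function name `lower_left_corner`, the `cell_size` parameter, and the sample points are all made up for illustration, and it assumes cubes of equal side length aligned to the origin.

```python
import math

def lower_left_corner(point, cell_size=1.0):
    """Snap a point to the lower-left corner of the grid cube that contains it.

    Hypothetical helper: assumes axis-aligned cubes of side `cell_size`
    with one corner at the origin.
    """
    return tuple(math.floor(x / cell_size) * cell_size for x in point)

# Made-up 2-D sample data; the same function works for any number of dimensions.
points = [(0.2, 0.9), (0.7, 0.1), (1.4, 2.6), (-0.3, 0.5)]

# One corner per input point, so the hashed set is the same size as the original.
corners = [lower_left_corner(p) for p in points]

# Points that fall in the same cube share a corner, so there are fewer distinct corners.
distinct_corners = set(corners)
```

Nearby points collapse onto the same corner, which is what makes the hash locality-sensitive. A smaller `cell_size` gives finer boxes and fewer collisions.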

When I "hash" all of my training set against the "lower left corners" of the N-dimensional boxes, I get the same number of points as the original. I then clustered these "corners" via KMeans. The output looked as I would expect:


(I had to use KNime's KMeans for this chart.)



But now the fun begins. When I cluster the corner points using the Canopies from the training points, I get this bizarre item:



It clearly means something.
