Now we come to the ulterior motive for this project: Locality-Sensitive Hashing. Here that means creating a set of regularly spaced "corners" in an N-dimensional space and mapping every point to one of those corners. Think of it as slicing N-dimensional space into boxes; N-dimensional cubes are the easiest to imagine.
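To make the "corners" idea concrete, here is a minimal Python sketch of one way to do it, assuming cubic boxes of equal side length aligned to the origin. The function name `corner_hash` and the `cell_size` parameter are made up for illustration; this is not the code used for the charts in this post.

```python
import math
from collections import Counter

def corner_hash(point, cell_size=1.0):
    """Snap a point to the "lower left corner" of the cube that contains it.

    Assumes axis-aligned cubes of side `cell_size` (a hypothetical parameter).
    """
    # floor() rounds toward negative infinity, so negative coordinates
    # also land on the lower corner of their cube.
    return tuple(math.floor(x / cell_size) * cell_size for x in point)

# A few made-up 2-D points; the same function works for any dimension.
points = [(0.2, 0.7), (0.9, 0.1), (1.4, 2.6), (-0.3, 0.5)]

# One corner per input point, so the hashed set is the same size as the input.
corners = [corner_hash(p) for p in points]

# Nearby points share a corner, which is what makes the hash "locality-sensitive".
counts = Counter(corners)
```

Here the first two points fall in the same unit cube and hash to the same corner, while the other two land in cubes of their own. A smaller `cell_size` gives finer boxes and fewer collisions.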
When I "hash" all of my training set against the "lower left corners" of the N-dimensional boxes, I get the same number of points as in the original set. I then cluster these "corners" via KMeans. The output looks as I would expect:
(I had to use KNIME's KMeans for this chart.)
But now the fun begins. When I cluster the corner points using the Canopies from the training points, I get this bizarre item:
It clearly means something.