Sunday, July 10, 2011

Dimensional Reduction via Random Projection, Part 4: Better Living Through Rotation!

Thanks to a tip from Ted Dunning, I have rotated the four projected distributions by factorizing them.
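My best reading of the factorization step, which the post doesn't show, is that it centers each 2D cloud and aligns it with its principal axes via SVD. A minimal numpy sketch under that assumption (a guess at the method, not necessarily Ted's actual suggestion):

    import numpy as np

    def rotate_to_principal_axes(points_2d):
        # Assumed reading of "rotation via factorization": center the
        # cloud, factor it with SVD, and rotate with the right singular
        # vectors so the principal axes line up with X and Y.
        centered = points_2d - points_2d.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return centered @ vt.T

    # Demo: an elongated cloud, rotated arbitrarily, snaps back into place.
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(200, 2)) * [3.0, 0.5]
    theta = 0.7
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    aligned = rotate_to_principal_axes(cloud @ rot.T)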



[Figures: the four rotated projections, labeled Gaussian, +1/-1, Sqrt(3), and Linear]

These look mighty funky. I've taken this as far as I can without being able to browse the item metadata (movie names).

Friday, July 1, 2011

Dimensional Reduction via Random Projection, Part 1: Huh?

This recounts a short research project in using random projection for visualization.

I started with vectors that represent items in a data model. (See my Semantic Vectors posts.) They come from 100k entries in the GroupLens film ratings dataset, representing ratings by 669 users of 3274 items. The 200 vectors are semi-randomly chosen from the item pool; all are dense, with 200 dimensions. Euclidean distance signifies nearness for item-item recommendation. The item-item distances give good ratings in the 150-200 dimension range, so it is reasonable to treat 200 dimensions as having rich support. The chosen subset of item vectors has strong user-user bonds, so the recommendation structure is somewhat visible. This was a mistake, but I like it.

The project uses simple 2D visualization to decide whether random projection is as good as other, older methods of dimensional reduction. I used KNime's MDS implementation. MDS (Multi-Dimensional Scaling) is an older algorithm for creating a projection from a high-dimensional space to a low-dimensional one while retaining pairwise distances. MDS is tunable for the amount of information dropped.
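For readers who want to try this outside KNime, here is a minimal sketch of the same reduction in Python using scikit-learn's MDS; the library choice and the stand-in data are my assumptions, since the post used KNime's node.

    import numpy as np
    from sklearn.manifold import MDS

    # Stand-in for the 200 dense item vectors (200 dimensions each);
    # the real data came from the GroupLens-based data model.
    rng = np.random.default_rng(42)
    vectors = rng.normal(size=(200, 200))

    # MDS searches for a 2D layout that preserves the pairwise
    # Euclidean distances of the input as well as it can.
    coords_2d = MDS(n_components=2, random_state=42).fit_transform(vectors)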

Each test run uses the same 200 vectors in the same order. Each vector has the same color each time (from vivid green to black to vivid red).

The main axis of investigation is the comparison of three projection methods: complete Random Projection, partial Random Projection, and complete MDS. The following scatter plots show the vectors reduced to 2D (X and Y are the generated dimensions; color identifies the vector). A rough Python sketch of the three pipelines appears after the figures.
  1. Random projection from 200 dimensions to 2 dimensions.
  2. Random projection from 200 dimensions to 20 dimensions, then use MDS to boil down to 2 dimensions.
  3. Use MDS to convert from 200 dimensions to 2 dimensions.
1. Full Random Projection from 200 dimensions to 2 dimensions (Gaussian)

2. Random Projection from 200D to 20D, then MDS to 2D (Gaussian, KNime MDS implementation)

3. Full MDS from 200D to 2D (KNime MDS implementation)
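For anyone reproducing the three pipelines outside KNime, here is a minimal sketch, assuming numpy, scikit-learn, and stand-in data; the post itself used KNime nodes throughout.

    import numpy as np
    from sklearn.manifold import MDS

    rng = np.random.default_rng(0)
    vectors = rng.normal(size=(200, 200))   # stand-in for the item vectors

    def random_projection(data, out_dims):
        # Project down by multiplying with a Gaussian random matrix.
        proj = rng.normal(size=(data.shape[1], out_dims)) / np.sqrt(out_dims)
        return data @ proj

    # 1. Full random projection: 200 dimensions straight down to 2.
    rp_2d = random_projection(vectors, 2)

    # 2. Two-stage: random projection to 20 dimensions, then MDS to 2.
    mds_2d = MDS(n_components=2, random_state=0).fit_transform(
        random_projection(vectors, 20))

    # 3. Full MDS from 200 dimensions to 2: slowest, cleanest picture.
    full_mds_2d = MDS(n_components=2, random_state=0).fit_transform(vectors)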

Conclusions:
  • Using full random projection does not completely trash the data, but the result is hard to interpret.
  • Downsizing the data to 20 dimensions and then using MDS the rest of the way is somewhat cleaner, and good enough to work with. It's also faster.
  • Full MDS shows a strikingly clear picture of the item-item structure.
Later work made it clear that these are all the same projection, rotated. Eventually I would like to reduce to 3D and animate rotations of these vector sets.

Onward to Part 2: there is more than one random projection.

I did these diagrams with the KNime visual programming app for data mining.  All Hail KNime!

Dimensional Reduction via Random Projection, Part 2: Distributions

There is more than one random distribution, and there are some surprises in store.

Achlioptas (2001) shows that random projection does not need fully Gaussian-distributed values: +1/-1 chosen uniformly at random suffices. Even better, a uniform choice from [0, 0, 0, 0, sqrt(3), -sqrt(3)] also works; it throws away 4 out of every 6 input values.
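Here is a minimal sketch of the four distributions as projection-matrix generators, assuming numpy; reading "Linear" as uniform on [-1, 1] is my interpretation, not something the post states outright.

    import numpy as np

    rng = np.random.default_rng(7)

    def gaussian_matrix(d_in, d_out):
        # The classic choice: i.i.d. standard normal entries.
        return rng.normal(size=(d_in, d_out))

    def plus_minus_one_matrix(d_in, d_out):
        # Achlioptas: +1 or -1, each with probability 1/2.
        return rng.choice([-1.0, 1.0], size=(d_in, d_out))

    def sqrt3_matrix(d_in, d_out):
        # Achlioptas' sparse variant: sqrt(3) or -sqrt(3) with probability
        # 1/6 each, zero with probability 2/3, so 4 of every 6 inputs are
        # never touched when computing an output coordinate.
        s = np.sqrt(3.0)
        return rng.choice([0.0, 0.0, 0.0, 0.0, s, -s], size=(d_in, d_out))

    def linear_matrix(d_in, d_out):
        # Assumed meaning of "Linear": uniform on [-1, 1].
        return rng.uniform(-1.0, 1.0, size=(d_in, d_out))

    # Project the same 200-dimensional vectors to 2D with each distribution.
    vectors = rng.normal(size=(200, 200))
    projections = {
        "Gaussian": vectors @ gaussian_matrix(200, 2),
        "+1/-1": vectors @ plus_minus_one_matrix(200, 2),
        "Sqrt(3)": vectors @ sqrt3_matrix(200, 2),
        "Linear": vectors @ linear_matrix(200, 2),
    }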

This post explores driving the same random projection with these four distributions: Gaussian, +1/-1, Sqrt(3), and Linear (uniform). To recap, here is the best result I've got: a full MDS projection from 200 dimensions to 2 dimensions.


Here are four versions of the same dataset, reducing 200 dimensions to 2 dimensions via random projection:



[Figures: the same dataset projected to 2D with the Gaussian, +1/-1, Sqrt(3), and Linear distributions]

The full MDS version is certainly the most pleasant to look at. The Gaussian and the two distributions from Achlioptas all seem to be different rotations of a cylinder. The Linear distribution is useless in this situation.

Given these results, for a quick visualization of your data I would try all four random distributions; you may get lucky, as I did, with somewhat similar rotations. I also recommend the colorizing trick; it really helps show what's going on. After that, I would get a good dimensional reduction algorithm and run the two-stage process:

        hi-dimensional -> RP -> low-d -> formal dimensional reduction -> 2d or 3d.

On to part 3 for a discussion of noise.

Achlioptas, D. (2001). Database-friendly random projections: Johnson-Lindenstrauss with binary coins. PDF available online at various places.
I did these diagrams with the KNime visual programming app for data mining.  All Hail KNime!

Dimensional Reduction via Random Projection, Part 3: Noise

Now for a third axis of investigation: distances.

Random Projection preserves pairwise distances; this is its claim to fame. To measure this, I created matrices of pairwise distances before and after "full RP" (200D -> RP -> 2D). The following charts plot the spread of distances: each color is one vector's distances, the X axis is the 200D distance, and the Y axis is the 2D distance.
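A minimal sketch of that measurement, assuming numpy and scipy with stand-in data (the post did this in KNime):

    import numpy as np
    from scipy.spatial.distance import pdist

    rng = np.random.default_rng(1)
    vectors = rng.normal(size=(200, 200))   # stand-in for the item vectors

    # Gaussian full RP: 200 dimensions straight down to 2.
    low = vectors @ (rng.normal(size=(200, 2)) / np.sqrt(2))

    # Condensed vectors of all pairwise distances, before and after.
    # A perfect projector would put every (before, after) pair on a line.
    d_hi = pdist(vectors)   # distances in 200 dimensions
    d_lo = pdist(low)       # distances in 2 dimensions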


[Figures: 200D vs. 2D distance spreads for the Gaussian, +1/-1, Sqrt(3), and Linear distributions]

From tightest to loosest spreads, it's Linear, +1/-1, Sqrt(3), and Gaussian. Linear is so clean because it is overconstrained in this example. +1/-1 and Sqrt(3) are usable, and Gaussian looks like a bomb with smallpox. The +1/-1 and Sqrt(3) projectors look best for this case. If I wanted to work harder, I would compare the distance matrices directly: compute matrix norms, take standard deviations of the distances, and so on. But for visualization, these plots worked well.
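For the record, a minimal sketch of those harder checks, again with numpy/scipy and stand-in data:

    import numpy as np
    from scipy.spatial.distance import pdist

    rng = np.random.default_rng(1)
    vectors = rng.normal(size=(200, 200))
    low = vectors @ (rng.normal(size=(200, 2)) / np.sqrt(2))
    d_hi, d_lo = pdist(vectors), pdist(low)

    # Least-squares scale factor between the two sets of distances,
    # then the spread of what the projection failed to preserve.
    scale = (d_hi @ d_lo) / (d_lo @ d_lo)
    residual = d_hi - scale * d_lo
    print("residual std:", residual.std())
    print("residual L2 norm:", np.linalg.norm(residual))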

I did these diagrams with the KNime visual programming app for data mining. All Hail KNime!