Achlioptas (2001) claims that random projection does not need Gaussian-distributed values: +1/-1 chosen with equal probability suffices. Even better, drawing uniformly from [0, 0, 0, 0, sqrt(3), -sqrt(3)] also works. This throws away 4 out of every 6 input values.
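To make the three distributions concrete, here is a minimal sketch in plain Python. It is not the KNIME workflow behind the diagrams in this post; the function names (`gaussian_entry`, `sign_entry`, `sparse_entry`, `projection_matrix`, `project`) are made up for illustration, and the 1/sqrt(k) scaling is the usual Johnson-Lindenstrauss normalization, which does not change the shape of a plot. The "Linear" distribution is left out because this post does not pin down its exact definition.

```python
import math
import random

def gaussian_entry(rng):
    # Classic choice: standard normal entries.
    return rng.gauss(0.0, 1.0)

def sign_entry(rng):
    # +1 or -1 with equal probability.
    return rng.choice((1.0, -1.0))

def sparse_entry(rng):
    # Uniform draw from [0, 0, 0, 0, sqrt(3), -sqrt(3)]:
    # zero with probability 2/3, +/-sqrt(3) with probability 1/6 each.
    s = math.sqrt(3.0)
    return rng.choice((0.0, 0.0, 0.0, 0.0, s, -s))

def projection_matrix(n_in, n_out, entry, seed=0):
    # n_in x n_out matrix whose entries are drawn independently from `entry`.
    rng = random.Random(seed)
    return [[entry(rng) for _ in range(n_out)] for _ in range(n_in)]

def project(rows, matrix):
    # Multiply each data row by the matrix and scale by 1/sqrt(n_out).
    n_out = len(matrix[0])
    scale = 1.0 / math.sqrt(n_out)
    return [
        [scale * sum(x * m[j] for x, m in zip(row, matrix)) for j in range(n_out)]
        for row in rows
    ]
```

For example, `project(data, projection_matrix(200, 2, sparse_entry))` takes a list of 200-dimensional rows down to 2 dimensions. A real implementation would skip the zero entries instead of multiplying by them, which is where the savings come from.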

This post explores applying four different distributions (Gaussian, +1/-1, sqrt(3), and linear) to the same random projection. To recap, here is the best I've got: a full MDS projection from 200 dimensions to 2 dimensions.

Here are four versions of the same dataset, reducing 200 dimensions to 2 dimensions via random projection:

[Figure panels, left to right and top to bottom: Gaussian, +1/-1, Sqrt(3), Linear]

The full MDS version is certainly the most pleasant to look at. The Gaussian projection and the two distributions from Achlioptas all seem to be different rotations of a cylinder. The Linear distribution is useless in this situation.

Given these results, for a quick visualization of your data I would try all four random distributions; you may get lucky, as I did, with somewhat similar rotations. I also recommend the colorizing trick; it really helps show what's going on here. After that, I would get a good dimensionality reduction algorithm and do the 2-stage process:

hi-dimensional -> RP -> low-d -> formal dimensional reduction -> 2d or 3d.
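Here is a rough sketch of that 2-stage pipeline in plain Python, under a few loud assumptions: the first stage uses the +1/-1 projection, the intermediate dimension `mid_dim=20` is an arbitrary pick, and the "formal" second stage is PCA done by power iteration, standing in for whatever algorithm you actually prefer (it is not the MDS used for the plot above). All function names are hypothetical.

```python
import math
import random

def random_project(rows, n_out, seed=0):
    # Stage 1: +1/-1 random projection down to n_out dimensions.
    rng = random.Random(seed)
    n_in = len(rows[0])
    R = [[rng.choice((1.0, -1.0)) for _ in range(n_out)] for _ in range(n_in)]
    scale = 1.0 / math.sqrt(n_out)
    return [[scale * sum(x * r[j] for x, r in zip(row, R)) for j in range(n_out)]
            for row in rows]

def pca(rows, n_components=2, iters=200, seed=0):
    # Stage 2 stand-in: PCA via power iteration with deflation.
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    X = [[x - m for x, m in zip(row, means)] for row in rows]  # centered data
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            # w = X^T (X v): the covariance matrix times v, up to a constant.
            scores = [sum(a * b for a, b in zip(row, v)) for row in X]
            w = [sum(s * row[j] for s, row in zip(scores, X)) for j in range(d)]
            # Keep w orthogonal to the components already found.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0.0:
                v = [0.0] * d  # no variance left in this direction
                break
            v = [a / norm for a in w]
        components.append(v)
    # Coordinates of each centered row along each component.
    return [[sum(a * b for a, b in zip(row, c)) for c in components] for row in X]

def two_stage(rows, mid_dim=20, out_dim=2, seed=0):
    # hi-dimensional -> RP -> low-d -> PCA -> 2d or 3d
    return pca(random_project(rows, mid_dim, seed), out_dim, seed=seed)
```

Calling `two_stage(data)` on a list of 200-dimensional rows returns one 2d point per row, ready to plot. The point of the first stage is only to make the second stage cheaper, so `mid_dim` is worth experimenting with.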

On to part 3 for a discussion of noise.

**Achlioptas, 2001** *Database-friendly random projections: Johnson-Lindenstrauss with binary coins*

PDF available online at various places:

I did these diagrams with the KNIME visual programming app for data mining. All Hail KNIME!
