[mlpack] Rotate new data with Kernel PCA

Wed Apr 2 19:38:31 EDT 2014

Hi Ryan,

Thanks for your answer.  I figured it should be possible.  I'm not very
experienced with Support Vector Machines, but wouldn't they have to do
this sort of thing to predict for new data?

P.S. Good quote in your .sig from "Dr. Strangelove", a classic film. 

-- Dave Slate

On Wed, 2 Apr 2014 19:05:08 -0400 Ryan Curtin <gth671b at mail.gatech.edu> wrote:

> On Wed, Apr 02, 2014 at 10:45:38PM +0000, dslate at speakeasy.net wrote:
> > Hi,
> > 
> > I am quite experienced with predictive analytics and machine learning, but
> > I'm a newbie to mlpack and this mailing list, so please forgive me if I'm
> > not posting my question to the right place. 
> > 
> > I would like to call mlpack's kpca::KernelPCA facility from a C++ program
> > to perform kernel PCA analysis on the feature matrix for some "training"
> > data for the purposes of dimensionality reduction, and then apply the
> > resulting rotation on some new "test" data.  I've done this kind of thing
> > with regular (linear) PCA in R, OpenCV, etc., using a "predict" or
> > "project" method, but I have been unable to figure out how to do the
> > equivalent operation using mlpack, and I can't seem to find any examples
> > of this.  The documentation for the various versions of the Apply method
> > seems to cover only running the KernelPCA analysis on some data and then
> > transforming that same data; I see no way to apply the results to new
> > data.
> > 
> > Can anyone give an example of how to do this?
> 
> Hi Dave,
> 
> What you're trying to do is apply the nonlinear mappings of kernel PCA
> to data other than what the kernel matrix was calculated on.  For
> regular PCA, this is easily possible because the eigenvectors are
> calculated in the input space that your points live in.  Then you just
> use those eigenvectors and project your new data onto them.  This is
> pretty straightforward in mlpack. 
> 
> However, for kernel PCA, this is less simple.  Kernel PCA
> eigendecomposes a matrix that is built in the kernel space, not the
> input space.  So, in general, you can't take the eigenvectors produced
> by that eigendecomposition and multiply them by your new data to get
> nonlinearly mapped test data. 
> 
> I think that what you are trying to do is possible, and detailed in this
> paper:
> 
> https://papers.nips.cc/paper/2461-out-of-sample-extensions-for-lle-isomap-mds-eigenmaps-and-spectral-clustering.pdf
> 
> Unfortunately, mlpack doesn't have that support implemented.  From the
> looks of it, kernlab in R does support this functionality.  I am not
> sure exactly what they have implemented to map new points, but it's
> probably the same thing as the paper above (or a variant thereof).  I'll
> probably open a bug in the next few days to implement that support, but
> it almost certainly won't be implemented in the short-term... 
> 
> Sorry that I don't have a better answer or solution to your problem. :-\
> 
> Thanks,
> 
> Ryan
> 
> -- 
> Ryan Curtin    | "Gentlemen, you can't fight in here!  This is the
> ryan at ratml.org | War Room!" - President Muffley
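
For readers of the archive, here is a minimal sketch of the out-of-sample
projection discussed above.  It is not mlpack's API and not necessarily what
kernlab implements.  It uses only the C++ standard library and the standard
kernel PCA projection formula: evaluate the kernel between the new point and
every training point, center that vector with the training kernel means, and
take its dot product with the eigenvector scaled by 1/sqrt(eigenvalue).  The
names KpcaModel, FitKpca and Project are hypothetical, the Gaussian kernel and
its bandwidth are assumptions, and only the leading component is computed (by
power iteration) to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Gaussian (RBF) kernel; the default bandwidth is an illustrative assumption.
double GaussianKernel(const Vec& a, const Vec& b, double bandwidth = 1.0)
{
  double sq = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sq += (a[d] - b[d]) * (a[d] - b[d]);
  return std::exp(-sq / (2.0 * bandwidth * bandwidth));
}

// Hypothetical container for everything that must be kept from training.
struct KpcaModel
{
  Mat train;         // training points, one per row
  Vec rowMean;       // mean of each row of the uncentered kernel matrix
  double totalMean;  // mean of all entries of the uncentered kernel matrix
  Vec alpha;         // leading eigenvector of centered K, times 1/sqrt(lambda)
  double lambda;     // leading eigenvalue of centered K
};

// Fit the leading kernel principal component with power iteration.
KpcaModel FitKpca(const Mat& train, std::size_t iterations = 500)
{
  const std::size_t n = train.size();
  assert(n > 1);
  KpcaModel m;
  m.train = train;
  m.rowMean.assign(n, 0.0);
  m.totalMean = 0.0;

  // Build the kernel matrix and record the means needed for centering.
  Mat K(n, Vec(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
    {
      K[i][j] = GaussianKernel(train[i], train[j]);
      m.rowMean[i] += K[i][j] / n;
    }
  for (std::size_t i = 0; i < n; ++i)
    m.totalMean += m.rowMean[i] / n;

  // Double-center the kernel matrix (centering in feature space).
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      K[i][j] += m.totalMean - m.rowMean[i] - m.rowMean[j];

  // Power iteration; centered K is positive semidefinite, so the norm of
  // K * v (with unit v) converges to the leading eigenvalue.
  Vec v(n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = 1.0 + i;  // arbitrary non-constant starting vector
  m.lambda = 0.0;
  for (std::size_t it = 0; it < iterations; ++it)
  {
    Vec w(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j)
        w[i] += K[i][j] * v[j];
    double norm = 0.0;
    for (std::size_t i = 0; i < n; ++i)
      norm += w[i] * w[i];
    norm = std::sqrt(norm);
    for (std::size_t i = 0; i < n; ++i)
      v[i] = w[i] / norm;
    m.lambda = norm;
  }

  // Scale so the corresponding feature-space direction has unit length.
  m.alpha.resize(n);
  for (std::size_t i = 0; i < n; ++i)
    m.alpha[i] = v[i] / std::sqrt(m.lambda);
  return m;
}

// Project a point (training or new) onto the leading kernel component.
double Project(const KpcaModel& m, const Vec& x)
{
  const std::size_t n = m.train.size();
  Vec kx(n);
  double kxMean = 0.0;
  for (std::size_t j = 0; j < n; ++j)
  {
    kx[j] = GaussianKernel(x, m.train[j]);
    kxMean += kx[j] / n;
  }
  // Center the new kernel vector with the *training* means, then project.
  double y = 0.0;
  for (std::size_t j = 0; j < n; ++j)
    y += m.alpha[j] * (kx[j] - kxMean - m.rowMean[j] + m.totalMean);
  return y;
}
```

A caller would run FitKpca once on the training matrix and then call Project
for each test point.  Calling Project on a training point reproduces that
point's in-sample coordinate, which is a handy sanity check.  The cost is that
the training points (or at least the kernel evaluations against them) must be
kept around, which is the same reason a kernel SVM keeps its support vectors
for prediction.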