PCA

If I use all of the principal components, I would expect the matrix product

[X] = [Scores].[Loadings]

to be identical to my original data set.

On the two-column x 110-row data set I am trying, it is not remotely close.

How are the Loadings and Scores being calculated?

Are they being normalised with respect to the standard deviations of the zero-mean data set, or something else?

Can anyone help?

Comments

  • Hi

    The case scores are indeed calculated from the standardised raw data (standardised to mean = 0 and, for analyses using the correlation matrix, to SD = 1). These are then multiplied by the eigenvectors (the Component Score Coefficients matrix in the statistiXL output).

    Alan
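A minimal NumPy sketch of the calculation Alan describes, for anyone who wants to check it outside statistiXL. The data set here is random and purely hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption about how the standardisation is done, not something confirmed in this thread.

```python
import numpy as np

# Hypothetical 110 x 2 data set standing in for the one in the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(110, 2)) @ np.array([[2.0, 0.8], [0.0, 1.5]]) + [10.0, -3.0]

# Standardise each column to mean 0 and SD 1.
# Assumption: sample SD (ddof=1); the package may use a different divisor.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix, sorted by decreasing eigenvalue.
R = np.corrcoef(X, rowvar=False)
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Case scores: standardised data post-multiplied by the eigenvectors.
scores = Z @ V

print("eigenvalues:", eigvals)
print("score variances:", scores.var(axis=0, ddof=1))
```

With this standardisation the variance of each score column equals the corresponding eigenvalue, which is a quick way to confirm the scores were computed from the standardised data and not the raw data.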
  • Thanks Alan, that helped a lot, but I am still not quite there.

    I am assuming an SVD: A = U.S.V' with S the matrix of singular values and V the eigenvector matrix.

    When I normalise the data as you say, and post-multiply it by the Eigenvector matrix, calculating A.V, I do indeed get the scores matrix, exactly as calculated by StatistiXL. Also the same eigenvalues and eigenvectors.

    And the eigenvector matrix V is indeed orthonormal, so I can get from the scores back to the Normalised data by calculating A = Scores.V'.

    But where do the loadings come in? I was under the impression that I should have:

    A = Scores.Loadings.

    Or is this not the case? The loadings matrix calculated by StatistiXL appears to be: Loadings = V.S, not Loadings = V'.

    willfoscue
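The SVD relations described in the comment above can be checked with a short NumPy sketch. The data are random and hypothetical, and `ddof=1` is again an assumption. Note that the singular values `s` and the eigenvalues of the correlation matrix are related by eigenvalue = s² / (n − 1) under that assumption, so "S" and "square root of the eigenvalues" differ by a constant factor.

```python
import numpy as np

# Hypothetical 110 x 2 data set (not the poster's data).
rng = np.random.default_rng(1)
X = rng.normal(size=(110, 2)) @ np.array([[2.0, 0.8], [0.0, 1.5]])
n = X.shape[0]

# A is the normalised data matrix (mean 0, SD 1; ddof=1 is an assumption).
A = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Thin SVD: A = U . diag(s) . V'
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# Scores are A.V, which is the same thing as U.diag(s).
scores = A @ V

# V is orthonormal, so the normalised data come back from Scores.V'.
A_back = scores @ V.T

# Eigenvalues of the correlation matrix recovered from the singular values.
eigvals = s**2 / (n - 1)

print("max reconstruction error:", np.abs(A_back - A).max())
print("eigenvalues from SVD:", eigvals)
```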
  • Hi

    The factor loadings are calculated as the eigenvectors, each multiplied by the square root of its eigenvalue.

    Hope this helps.

    Phil
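Putting the replies together, here is a hedged NumPy sketch of why Scores.Loadings does not reproduce the data and what does. It assumes loadings = eigenvectors × sqrt(eigenvalues) as Phil states, scores = standardised data × eigenvectors as Alan states, and a sample SD (`ddof=1`); the data and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical 110 x 2 data set (not the poster's data).
rng = np.random.default_rng(2)
X = rng.normal(size=(110, 2)) @ np.array([[2.0, 0.8], [0.0, 1.5]]) + [10.0, -3.0]
mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)  # ddof=1 is an assumption
Z = (X - mean) / sd

# Eigenvectors and eigenvalues of the correlation matrix, largest first.
eigvals, V = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

scores = Z @ V                    # case scores
loadings = V * np.sqrt(eigvals)   # each eigenvector scaled by sqrt(eigenvalue)

# Scores times (transposed) loadings does NOT give back the standardised data.
naive = scores @ loadings.T

# Two products that do give it back:
Z_from_V = scores @ V.T                                      # use eigenvectors
Z_from_loadings = (scores / np.sqrt(eigvals)) @ loadings.T   # unit-variance scores

# Undo the standardisation to return to the original data.
X_back = Z_from_V * sd + mean

print("naive error:        ", np.abs(naive - Z).max())
print("eigenvector error:  ", np.abs(Z_from_V - Z).max())
print("loadings error:     ", np.abs(Z_from_loadings - Z).max())
print("original-data error:", np.abs(X_back - X).max())
```

In this sketch the loadings reproduce the standardised data only when paired with scores rescaled to unit variance. To get back to the original data set, the mean and standard deviation then have to be put back as well.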