I'm using R to get the principal components for several datasets.
An example result from prcomp looks like:

          PC1        PC2
X1 -0.7071068 -0.7071068
X2 -0.7071068  0.7071068
I then plan to use the first principal component from each dataset as an input to another model.
The problem is that, for interpretability, the direction of the first principal component matters: I'll be multiplying my data by the loadings, so flipping the direction flips the sign of the final result (which I also need to report).
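To make the dependence on direction concrete, here is a small sketch (the data here is random and just for illustration): prcomp's scores are the centered data multiplied by the rotation matrix, so negating a loading column negates the corresponding scores.

```r
set.seed(42)
dat <- matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("X1", "X2")))
p   <- prcomp(dat)

# prcomp's scores are the centered data times the loadings
sc <- sweep(dat, 2, p$center) %*% p$rotation
all.equal(sc, p$x)  # TRUE

# negating the first loading column negates the PC1 scores
flipped <- sweep(dat, 2, p$center) %*% (-p$rotation[, 1])
all.equal(drop(flipped), -p$x[, 1])  # TRUE
```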
If I get new data and rerun PCA, it's possible that the prcomp result could be:

          PC1        PC2
X1  0.7071068 -0.7071068
X2  0.7071068  0.7071068
And if I multiply by the loadings now, the result will be the additive inverse of before. I could just take the absolute value of the components of the first eigenvector, but this feels wrong, and I'm not sure how to extend it to cases with more than two components.
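One convention that avoids the absolute-value problem (this is my own assumption about a reasonable rule, not something prcomp does) is to flip each component so that its largest-magnitude loading is positive; that generalizes to any number of components. A sketch, where fix_signs is a hypothetical helper:

```r
# Flip each principal component so that its largest-magnitude loading
# is positive, and flip the scores the same way so that the projected
# data stays consistent with the loadings.
# (fix_signs is a hypothetical helper, not part of prcomp.)
fix_signs <- function(p) {
  flips <- apply(p$rotation, 2, function(v) sign(v[which.max(abs(v))]))
  p$rotation <- sweep(p$rotation, 2, flips, `*`)
  p$x        <- sweep(p$x,        2, flips, `*`)
  p
}

p <- fix_signs(prcomp(USArrests, scale. = TRUE))
p$rotation  # every column now has a positive largest-magnitude entry
```

Note that ties in abs(v) are broken by which.max taking the first maximum, so the rule is deterministic for a given loading matrix.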
(I know that if the first principal component were (0.7071068, 0.7071068), the result would be mathematically equivalent: the most variance is still explained along that same line, since an eigenvector multiplied by a nonzero scalar is still an eigenvector.)
I'm curious whether anyone has come across a similar problem, or whether there is a standard technique for dealing with this.
For now, I am manually flipping the sign of the first eigenvector on a case-by-case basis, when needed.
Any suggestions appreciated :)