Factor Analysis

## Here's a little tutorial on how to use the `factanal` function for Factor Analysis.

## Let's simulate a data frame driven by two known latent factors, F1 and F2:
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
X1 <- F1 + F2 + rnorm(100)/5
X2 <- 4*F1 * rnorm(100)
X3 <- -6*F1 * rnorm(100)/4
X4 <- F2 + rnorm(100)
X5 <- 4*F2 - 0.5*F1 * rnorm(100)
dat <- data.frame(X1=X1, X2=X2, X3=X3, X4=X4, X5=X5)
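
## Before fitting anything, it can help to see how strongly each simulated
##  variable tracks the true factors. The sketch below repeats the simulation
##  above so that it runs on its own; the name `true_cor` is just an
##  illustrative choice. Keep in mind that X2, X3 and X5 multiply F1 by noise
##  rather than adding noise to it, so their linear correlation with F1 may
##  be weak.

```r
# Repeat the simulation so this block is self-contained.
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
dat <- data.frame(
  X1 = F1 + F2 + rnorm(100)/5,
  X2 = 4*F1 * rnorm(100),
  X3 = -6*F1 * rnorm(100)/4,
  X4 = F2 + rnorm(100),
  X5 = 4*F2 - 0.5*F1 * rnorm(100)
)

# Correlation of each observed variable (rows) with each true factor (columns).
true_cor <- cor(dat, cbind(F1 = F1, F2 = F2))
round(true_cor, 2)
```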

## Let's do a factor analysis with k=2 factors. There are two main ways you can do this.
k <- 2

## Method 1: via the data.
## Just indicate the data in the first argument, and specify the number of factors
##  in the `factors` argument.
## Note that small loadings (absolute value below 0.1, the default print
##  cutoff) are shown as blanks.
factanal(dat, factors=k)
## 
## Call:
## factanal(x = dat, factors = k)
## 
## Uniquenesses:
##    X1    X2    X3    X4    X5 
## 0.375 0.005 0.947 0.487 0.035 
## 
## Loadings:
##    Factor1 Factor2
## X1  0.790         
## X2  0.160   0.985 
## X3  0.119  -0.198 
## X4  0.715         
## X5  0.976  -0.110 
## 
##                Factor1 Factor2
## SS loadings      2.128   1.024
## Proportion Var   0.426   0.205
## Cumulative Var   0.426   0.630
## 
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 0.93 on 1 degree of freedom.
## The p-value is 0.336
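
## The numbers in the printout can also be pulled out of the fitted object
##  directly. A minimal sketch, assuming the same simulated data as above
##  (repeated here so the block runs on its own); `fit` is just an
##  illustrative name.

```r
# Repeat the simulation so this block is self-contained.
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
dat <- data.frame(
  X1 = F1 + F2 + rnorm(100)/5,
  X2 = 4*F1 * rnorm(100),
  X3 = -6*F1 * rnorm(100)/4,
  X4 = F2 + rnorm(100),
  X5 = 4*F2 - 0.5*F1 * rnorm(100)
)

fit <- factanal(dat, factors = 2)

fit$uniquenesses  # named vector, one uniqueness per variable
fit$STATISTIC     # chi-square statistic of the sufficiency test
fit$dof           # degrees of freedom of the model
fit$PVAL          # p-value of the sufficiency test
```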

## Method 2: via the covariance matrix.
## Indicate the covariance matrix in the `covmat` argument.
## Note: a bare covariance matrix carries no sample size, so the hypothesis
##  test is not reported unless you also supply `n.obs`. The estimated
##  uniquenesses and loadings are the same.
factanal(factors=k, covmat=cov(dat))
## 
## Call:
## factanal(factors = k, covmat = cov(dat))
## 
## Uniquenesses:
##    X1    X2    X3    X4    X5 
## 0.375 0.005 0.947 0.487 0.035 
## 
## Loadings:
##    Factor1 Factor2
## X1  0.790         
## X2  0.160   0.985 
## X3  0.119  -0.198 
## X4  0.715         
## X5  0.976  -0.110 
## 
##                Factor1 Factor2
## SS loadings      2.128   1.024
## Proportion Var   0.426   0.205
## Cumulative Var   0.426   0.630
## 
## The degrees of freedom for the model is 1 and the fit was 0.0097
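
## If the number of observations is known, passing it through the `n.obs`
##  argument of `factanal` brings the hypothesis test back. The sketch below
##  repeats the simulation above so that it runs on its own; `fit_data` and
##  `fit_cov` are illustrative names.

```r
# Repeat the simulation so this block is self-contained.
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
dat <- data.frame(
  X1 = F1 + F2 + rnorm(100)/5,
  X2 = 4*F1 * rnorm(100),
  X3 = -6*F1 * rnorm(100)/4,
  X4 = F2 + rnorm(100),
  X5 = 4*F2 - 0.5*F1 * rnorm(100)
)

# Fit from the raw data, and from the covariance matrix plus the sample size.
fit_data <- factanal(dat, factors = 2)
fit_cov  <- factanal(factors = 2, covmat = cov(dat), n.obs = nrow(dat))

# With n.obs supplied, the test statistic and p-value are available again.
fit_cov$STATISTIC
fit_cov$PVAL

# The loadings from the two routes should agree up to numerical tolerance.
all.equal(unclass(loadings(fit_data)), unclass(loadings(fit_cov)),
          tolerance = 1e-4)
```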

## Another important thing to note is how to do *rotations*. These are
##  chosen via the `rotation` argument: the built-in options are "varimax"
##  (default), "promax", and "none". So the plain factor analysis results
##  without rotation are:
factanal(dat, factors=k, rotation = "none")
## 
## Call:
## factanal(x = dat, factors = k, rotation = "none")
## 
## Uniquenesses:
##    X1    X2    X3    X4    X5 
## 0.375 0.005 0.947 0.487 0.035 
## 
## Loadings:
##    Factor1 Factor2
## X1  0.784   0.101 
## X2          0.997 
## X3  0.151  -0.175 
## X4  0.712         
## X5  0.981         
## 
##                Factor1 Factor2
## SS loadings      2.106   1.046
## Proportion Var   0.421   0.209
## Cumulative Var   0.421   0.630
## 
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 0.93 on 1 degree of freedom.
## The p-value is 0.336
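
## An orthogonal rotation such as varimax only re-expresses the loadings; it
##  leaves each variable's communality (row sum of squared loadings) and
##  uniqueness unchanged. A quick check of this, repeating the simulation
##  above so the block runs on its own (the `fit_*` and `h2_*` names are
##  illustrative). "promax" is an oblique rotation, so the row sums of its
##  squared loadings are not communalities and are left out of the check.

```r
# Repeat the simulation so this block is self-contained.
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
dat <- data.frame(
  X1 = F1 + F2 + rnorm(100)/5,
  X2 = 4*F1 * rnorm(100),
  X3 = -6*F1 * rnorm(100)/4,
  X4 = F2 + rnorm(100),
  X5 = 4*F2 - 0.5*F1 * rnorm(100)
)

fit_none    <- factanal(dat, factors = 2, rotation = "none")
fit_varimax <- factanal(dat, factors = 2, rotation = "varimax")
fit_promax  <- factanal(dat, factors = 2, rotation = "promax")

# Communalities: row sums of squared loadings (valid for orthogonal solutions).
h2_none    <- rowSums(unclass(loadings(fit_none))^2)
h2_varimax <- rowSums(unclass(loadings(fit_varimax))^2)

all.equal(h2_none, h2_varimax)                              # same communalities
all.equal(fit_none$uniquenesses, fit_varimax$uniquenesses)  # same uniquenesses

# The oblique solution, for comparison.
loadings(fit_promax)
```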

## You can extract the loadings matrix through the list component named
##  `loadings`. The object has class "loadings" with its own print method
##  (which blanks out small values), but underneath it's really just a
##  matrix, as the last call indicates.
fit <- factanal(dat, factors=k, rotation = "none")
fit$loadings
## 
## Loadings:
##    Factor1 Factor2
## X1  0.784   0.101 
## X2          0.997 
## X3  0.151  -0.175 
## X4  0.712         
## X5  0.981         
## 
##                Factor1 Factor2
## SS loadings      2.106   1.046
## Proportion Var   0.421   0.209
## Cumulative Var   0.421   0.630
fit$loadings[1:5, 1:2]
##         Factor1     Factor2
## X1  0.783885737  0.10136519
## X2 -0.009679362  0.99744995
## X3  0.151115775 -0.17503806
## X4  0.711852604  0.08077992
## X5  0.980910146  0.05733956
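
## Two other ways to get at the full set of loadings, sketched under the same
##  simulated data (repeated so the block runs on its own): lower the print
##  cutoff so nothing is blanked out, or drop the "loadings" class with
##  `unclass()`. The name `L` is just illustrative.

```r
# Repeat the simulation so this block is self-contained.
set.seed(3847)
F1 <- rnorm(100)
F2 <- rnorm(100)
dat <- data.frame(
  X1 = F1 + F2 + rnorm(100)/5,
  X2 = 4*F1 * rnorm(100),
  X3 = -6*F1 * rnorm(100)/4,
  X4 = F2 + rnorm(100),
  X5 = 4*F2 - 0.5*F1 * rnorm(100)
)

fit <- factanal(dat, factors = 2, rotation = "none")

class(fit$loadings)              # the class that triggers the special print

# Print every loading, including those below the default cutoff of 0.1.
print(fit$loadings, cutoff = 0)

# Strip the class to work with an ordinary numeric matrix.
L <- unclass(fit$loadings)
is.matrix(L)
dim(L)
```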
Vincenzo Coia
he/him/his 🌈 👨

I’m a data scientist at the University of British Columbia, Vancouver.