/* -------------------------------------------------------------------
   File:    factot.nmt.sas
   Purpose: Illustrate factor analysis
   ------------------------------------------------------------------- */
OPTIONS NOCENTER NODATE PAGENO=1;

* - NOTE: You must assign the LIBNAME p7291 to the directory/folder
          containing the data set NMTwins;
* LIBNAME p7291 '';

TITLE NATIONAL MERIT TWIN STUDY;
TITLE2 Factor Analysis of the National Merit Scholarship Test ;

DATA temp;
   SET p7291.nmtwins;
RUN;

/* --- standardize by sex to remove sex differences
 *     first we must sort the data by sex */
PROC SORT DATA=temp;
   BY sex;
RUN;

/* next we use PROC STANDARD to standardize the data.
 * the following are the arguments to the PROC STANDARD command:
 *   DATA = name of the input data set.
 *   OUT  = name of the output data set. You should always
 *          specify a name different from the input data set.
 *          If you use the same name, then PROC STANDARD will
 *          replace the existing variables with the standardized
 *          ones.
 *   MEAN = value of the mean for the standardized variables
 *   STD  = value of the standard deviation for the standardized
 *          variables.
 * If you want Z scores, then specify MEAN=0 and STD=1 */
PROC STANDARD DATA=temp OUT=temp2 MEAN=50 STD=10;
   BY sex;
   VAR english--vocab;
RUN;

/* ---------------------------------------------------------------
   The first step in a factor analysis is usually to determine
   the number of factors. Here we will run the factor analysis
   using the SCREE option to plot the eigenvalues. We do not need
   to rotate or compute factor scores in this step.
   The other options are:
      DATA    Name of the input data set
      CORR    Print the correlation matrix
      EV      Print the eigenvectors
      SCREE   Give a scree plot
-------------------------------------------------------------------*/;
PROC FACTOR DATA=temp2 CORR EV SCREE;
   VAR english--vocab;
RUN;

/* ---------------------------------------------------------------
   The previous analysis suggests a single general factor.
   Could the second factor still be meaningful?
   Let's extract and rotate the second factor to see if it is
   interpretable.
   New options are:
      N=2             Extract two factors
      METHOD=prin     Perform a principal components analysis
      ROTATE=promax   Perform a promax rotation
-------------------------------------------------------------------*/;
PROC FACTOR DATA=temp2 N=2 METHOD=prin ROTATE=promax;
   VAR english--vocab;
RUN;

/* ---------------------------------------------------------------
   We accept the two factor solution and now want to create a
   new data set with the factor scores.
   New options are:
      SCORE       Print the factor scoring coefficients
      OUT=temp3   Write the factor scores (Factor1, Factor2) to
                  the output data set temp3 (requires the N= option)
-------------------------------------------------------------------*/;
PROC FACTOR DATA=temp2 N=2 METHOD=prin ROTATE=promax SCORE OUT=temp3;
   VAR english--vocab;
RUN;

* - Rename the two factors;
DATA temp3 (RENAME=(Factor1=VerbalFac Factor2=MathFac));
   SET temp3;
RUN;
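
/* ---------------------------------------------------------------
   Optional check (a sketch, not part of the original analysis).
   Promax is an oblique rotation, so the two factors are allowed
   to correlate. Assuming the rename step above has run, PROC CORR
   on the renamed scores shows how strongly VerbalFac and MathFac
   are related in this sample. How to read the size of that
   correlation is left to the analyst.
-------------------------------------------------------------------*/;
PROC CORR DATA=temp3;
   VAR VerbalFac MathFac;
RUN;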