Genetic modeling in SAS
How do I create a SASPairs Project?
A SASPairs project is a collection of SAS data sets (technically, a SAS catalog). It includes the observed data to be analyzed, the SASPairs source code for all the models fitted to those data, and the results of all model fits. Because all of this material is in one place, there is no need to figure out whether the "revised general factor model" was in file "genfac.2xout" or "genfac.6aout3". You can recall a SASPairs project, look at the titles of all the models fitted to the data, select the model you want to examine, and then peruse the fit statistics, parameter estimates, and matrices for that model.
A SASPairs project can be defined using the point-and-click interface. There is also an extensive help system to assist in the creation of a project.
- Start SASPairs (how do I do this?).
- From the SASPairs Main Menu, select Run Interactive SASPairs.
- In the next frame, select Create a New SASPairs Project and click Ok. The subsequent frame will ask you to give a name and a short description to the project. Then, you will select a folder to contain the project. (Depending on the interface that you selected when you started SASPairs, you may be asked to provide a short "nickname" (i.e., SAS Libname) for the project.)
- After the name and location of the folder are defined, you will be asked whether the data set is a Phenotypic Data Set (i.e., it contains the raw scores for individuals) or a TYPE=CORR data set (i.e., it contains covariance matrices). See the SASPairs Reference Manual for more information.
- You will then encounter the frame shown below that gives the Dataset Definitions for the Project (i.e., the observed data and variables for the project). Begin by clicking on Browse to navigate to and select the SAS data set containing the observed data.
- After selecting the SAS data set, click on the Browse buttons to select the variables from the data set required to define the project. These are: Family ID Variable, Relationship Variable, Phenotypes for Analysis, and (optionally) Covariates. Click the Help button to view definitions of these variables. The frame below illustrates the effect of clicking the Browse button for the Family ID Variable.
- After selecting all of the above-named variables, click on the Browse button for the Relationship Data Set. (For more information about a Relationship Data Set, refer to the SASPairs Reference Manual or click on the Help button in the Dataset Definitions frame described above.) The screen below appears. Here, you can select one of the "canned" SASPairs Relationship Data Sets, select one that you have previously created and saved, or create a new one.
- The screen below illustrates some of the canned Relationship Data Sets in SASPairs.
- One of the biggest choices you may have to make at this stage is whether you want to treat pairs as being in an intraclass or interclass relationship. Generally, intraclass relationships are suitable for collateral relatives (siblings, cousins) while interclass relationships are appropriate for vertical relatives (e.g., parent-offspring). The decision should also be influenced by hypotheses about sex differences in variances and covariances (or the lack of them). The frame given below illustrates a SASPairs Relationship Data Set that treats same-sex twins as being in an intraclass relationship while opposite-sex twins are in an interclass relationship.
- Once you are satisfied with the Dataset Definitions for the SASPairs Project, click the Submit button. The frame below shows the Dataset Definitions for a project from the National Merit Twins. The phenotypes for analysis are the first five empirical scales (Dominance through Self-Acceptance) of the California Psychological Inventory (CPI).
- The very last selection is the type of data to which the model will be fitted. (The frame is given below.) The decision to fit models to covariance matrices or to raw data should depend on the extent of missing data in the data set. Covariance matrices are appropriate for complete data or data with only a few, sporadic missing values. Otherwise, fit models to the raw data.
- After this selection, SASPairs will read in the data and rearrange it in an efficient manner for subsequent processing. You should examine the SAS Log to make certain there were no errors or significant warnings issued.
- The SASPairs Project has now been created. (At least, the Dataset Definitions have been created and saved.) The next step is to fit models to the data. In the frame shown below, select Fit A New Model, then click Ok.
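The intraclass versus interclass choice described in the step on Relationship Data Sets can be made concrete with a small numerical sketch. The Python below is not SASPairs code and does not reproduce its internals; the function names and the toy scores are hypothetical. It illustrates one common way to think about the distinction: in an interclass arrangement the two members keep fixed roles (for example, male twin first, female twin second), so each role has its own variance; in an intraclass arrangement the order within a pair is arbitrary, which double entry makes explicit by entering every pair in both orders.

```python
from statistics import mean

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def interclass_summary(pairs):
    """Members keep fixed roles, so each role gets its own variance."""
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    return {
        "var_first": covariance(first, first),
        "var_second": covariance(second, second),
        "cov": covariance(first, second),
    }

def intraclass_summary(pairs):
    """Order within a pair is arbitrary: enter each pair in both orders
    (double entry), which yields one common variance for both members."""
    doubled = pairs + [(b, a) for a, b in pairs]
    first = [a for a, _ in doubled]
    second = [b for _, b in doubled]
    return {
        "var_common": covariance(first, first),
        "cov": covariance(first, second),
    }

# Hypothetical scores for four pairs of relatives.
toy_pairs = [(10.0, 12.0), (14.0, 13.0), (9.0, 11.0), (16.0, 18.0)]
swapped = [(b, a) for a, b in toy_pairs]

print(interclass_summary(toy_pairs))
print(intraclass_summary(toy_pairs))
```

Swapping the order of the members within every pair exchanges the two variances in the interclass summary but leaves the intraclass summary unchanged (up to floating-point rounding), which is why the intraclass form suits relatives with no natural ordering. The double-entry figures are descriptive only; they are not the estimates a model-fitting program would produce.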
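The rule of thumb in the step on choosing the data type (covariance matrices for complete or nearly complete data, raw data otherwise) depends on how much is missing. A rough sketch of that check, in Python rather than SAS, under stated assumptions: each record is a dictionary, a missing score is `None`, the variable names are made up, and the 5% cutoff is a hypothetical threshold chosen only for illustration, since no numeric cutoff is given here.

```python
def missing_fraction(records, phenotypes):
    """Fraction of phenotype cells that are missing (None or absent)."""
    total = len(records) * len(phenotypes)
    if total == 0:
        return 0.0
    missing = sum(1 for r in records for p in phenotypes if r.get(p) is None)
    return missing / total

def suggest_data_type(records, phenotypes, cutoff=0.05):
    """Suggest 'covariance matrices' when little is missing, else 'raw data'.
    The cutoff is an illustrative assumption, not a SASPairs setting."""
    frac = missing_fraction(records, phenotypes)
    return "covariance matrices" if frac <= cutoff else "raw data"

# Hypothetical records with two made-up phenotype names.
complete = [{"scale1": 20, "scale2": 31}, {"scale1": 25, "scale2": 28}]
patchy = [{"scale1": 20, "scale2": None}, {"scale1": None, "scale2": 28}]

print(suggest_data_type(complete, ["scale1", "scale2"]))
print(suggest_data_type(patchy, ["scale1", "scale2"]))
```

An overall fraction is a crude summary. Whether the missing values are few and sporadic or concentrated in particular variables or families also matters for the decision.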
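The check of the SAS Log recommended after the data are read can be partly automated if the log has been saved to a text file. SAS begins log messages with prefixes such as `NOTE:`, `WARNING:`, and `ERROR:` (sometimes with a message number before the colon). The Python sketch below collects the warning and error lines. The function name is hypothetical, and the sample log text is a made-up placeholder, not real SASPairs output.

```python
import re

# Matches "ERROR:" / "WARNING:" and numbered forms such as "ERROR 22-322:".
_PROBLEM = re.compile(r"^(ERROR|WARNING)(?: \d+-\d+)?:")

def find_log_problems(log_text):
    """Return (line_number, line) for log lines starting with ERROR or WARNING."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if _PROBLEM.match(line):
            hits.append((number, line.rstrip()))
    return hits

# Placeholder log text for illustration only.
sample_log = (
    "NOTE: hypothetical informational message\n"
    "WARNING: hypothetical warning message\n"
    "NOTE: another hypothetical note\n"
    "ERROR: hypothetical error message\n"
)

for number, line in find_log_problems(sample_log):
    print(number, line)
```

A scan like this only flags candidate lines. Deciding whether a warning is significant still requires reading it in context.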