* -----------------------------------------------------------------
* file: alzheimers.sas
*
* This is an example of a within subjects (or repeated measures)
* ANOVA that is somewhat complicated in design. The problem
* is whether a test drug improves the memories of Alzheimer
* patients. Thirty patients were randomly assigned to three
* treatment groups: (1) placebo or 0 milligrams (mg) of
* active drug, (2) 5 mg of drug, and (3) 10 mg. The memory
* task consisted of two modes of memory (Recall vs Recognition)
* on two types of things remembered (Names vs Objects)
* with two frequencies (Rare vs Common). That is, the patients
* were given a list consisting of Rare Names (Waldo),
* Rare Objects (14th century map), Common Names (Bill), and
* Common Objects (fork) to memorize. The ordering of these
* four categories within a list was randomized from patient to
* patient. Half the patients within each drug group were asked
* to recall as many things on the list as they could. The
* other half were asked to recognize as many items as possible
* from a larger list. A new list was then presented to the
* patients, and the opposite task was performed. That is, those
* who had first been given the recall task were then given the
* recognition task, and those initially given the recognition
* task were then given the recall task.
*    Assume that previous research with this paradigm has shown
* that there are no order effects for the presentation.
* (This assumption is for simplicity only--order effects could
* easily be modelled.)
*    The dependent variables are "memory scores." High scores
* indicate more items remembered.
*
* On the raw data file, the data are organized in the following
* way:

                     Recall                Recognition
                Names     Objects        Names     Objects
   Obs  Drug  Rare Comm  Rare Comm     Rare Comm  Rare Comm
     1    1    63   81    93   96       92  106    99  102
     2    1    54   53    60   70       69   90    64   83
     .
    30    3    72   85    86   85       91   91    94   99

* Drug condition is a between subjects factor with three levels.
* There are three within subjects factors (or, if you prefer,
* three repeated measures factors), each with two levels.
*
* The following SAS program shows one way of dealing with these
* data using GLM with the CONTRAST statement to code the
* three drug groups and the REPEATED statement to deal with the
* within subjects (or repeated measures) factors.
* ------------------------------------------------------------;

options pagesize=54 linesize=76;
title 'Psychology 7291: Within Subjects Design: Alzheimer example';

* ------------------------------------------------------------
* Set up formats for output. This is really bells and whistles
* and is not required. It just makes for easy reading of some
* of the output.
* ------------------------------------------------------------;
proc format;
   value drugfmt 1=' 0 mg' 2=' 5 mg' 3='10 mg';
   value memmfmt 1='Recall' 2='Recognition';
   value tyfmt 1='Names' 2='Objects';
   value frfmt 1='Rare' 2='Common';
run;

* ------------------------------------------------------------
* Read the raw data into data set alzheimer. The dependent
* variable names are mnemonic: r=rare, c=common, n=name,
* o=object, rl=recall, and rg=recognition. Thus, r_n_rl = the
* score of a person for rare_name_recall and c_o_rg = the score
* for common_object_recognition. It is often helpful to use
* mnemonics like this so you do not have to spend time
* looking at variable labels. Note how the order of the
* variables corresponds to the design given above.
* ------------------------------------------------------------;
data alzheimer;
   length subject drug r_n_rl c_n_rl r_o_rl c_o_rl
          r_n_rg c_n_rg r_o_rg c_o_rg 3;
   input subject drug r_n_rl c_n_rl r_o_rl c_o_rl
         r_n_rg c_n_rg r_o_rg c_o_rg;
   format drug drugfmt.;
DATALINES;
 1 1 62  72 65  82  75  82  80  99
 2 1 53  72 64  89  78  84  92  98
 3 1 57  74 59  85  65  94  86  96
 4 1 40  56 55  68  70  80  69  86
 5 1 59  59 50  70  74  77  79  95
 6 1 64  66 59  83  86  86  67  87
 7 1 65  68 78  98  78  99  87  96
 8 1 31  49 38  44  50  60  51  68
 9 1 55  57 71  58  68  86  70  94
10 1 68  81 77  96  97 100  94  92
11 2 71  88 80  89  92  93  99 107
12 2 58  55 57  85  67  79  76  87
13 2 64  75 88  79  82  85  67  99
14 2 68  75 68  78  81  99  73  91
15 2 80  90 69  97  94 101 111 104
16 2 60  79 70  76  83  96  80  87
17 2 70  73 71  72  82  84  79  93
18 2 54  54 50  71  58  84  85  87
19 2 65  75 54  66  82  84  75  84
20 2 67  65 48  67  63  63  66  75
21 3 74  99 81 106 105 107  98 108
22 3 75  70 70  84  86  90  77  90
23 3 84  81 71  96  98 111  96 112
24 3 88  89 95  83 107 105 103  93
25 3 88 106 88 103  94 120  93 106
26 3 93  81 78  90 102 117 105  99
27 3 84  90 74  96 109 110  95 119
28 3 57  62 63  61  64  90  82  89
29 3 67  94 71  90  93 110  77 109
30 3 72  86 61  68  94  91  90  90
;
run;

* ------------------------------------------------------------
* Create another data set in order to compute all the means
* and standard deviations for the various combinations of
* the within subjects factors. Note that we are converting
* the SAS data set from one in which the rows are all
* independent observations to one in which subjects are
* "entered" multiple times. This is done because it is
* often easier to get the marginal means this way. Pay
* attention to the ordering of the DO statements. They
* must be ordered in the way in which the dependent
* variables are input.
* ------------------------------------------------------------;
data descrip;
   set alzheimer;
   array dv [8] r_n_rl--c_o_rg;
   n=0;
   do memmode=1 to 2;
      do type=1 to 2;
         do frequcy=1 to 2;
            n=n+1;
            score = dv[n];
            output;
         end;
      end;
   end;
   keep subject drug memmode type frequcy score;
   label memmode = 'Memory mode: Recall or Recognition'
         type    = 'Type: Names or Objects'
         frequcy = 'Frequency: Rare or Common'
         score   = 'Number correct';
   format memmode memmfmt. type tyfmt. frequcy frfmt.;
run;

* ------------------------------------------------------------
* PROC SUMMARY is a quick way to get a lot of summary stats
* for different groups. It also gives marginal means for
* every type of combination for the within subjects factors.
* It is a tedious procedure to learn, but in the long run, it
* can be worth the effort. The end result is an output data
* set called STATS that contains the means (variable MEAN)
* and the standard deviations (variable STDDEV) for the
* different drug conditions and within subjects factors. The
* CLASS statement is meant to break down the mean into all
* possible combinations of MEMMODE, TYPE, FREQUCY, and DRUG.
* DRUG is placed last so that the output is arranged in an
* order that allows easy comparison of placebo, 5 mg, and
* 10 mg groups.
* ------------------------------------------------------------;
proc summary data=descrip;
   class memmode type frequcy drug;
   var score;
   output out=stats mean=mean std=stddev;
run;

* ------------------------------------------------------------
* Now print the results from PROC SUMMARY.
* In the output a period (.) for a
* variable means "collapsed over all categories of the
* variable." Thus, the first row, having all periods,
* gives the mean and standard dev for the memory scores collapsed
* over all values of MEMMODE, TYPE, FREQUCY, and DRUG. This is
* the grand mean and standard deviation for the whole sample.
* The next row is the mean for the control group over all the
* repeated measures variables, etc.
* There are a lot of means!
* ------------------------------------------------------------;
proc print data=stats;
   var memmode type frequcy drug mean stddev;
run;

* ------------------------------------------------------------
* Finally, GLM is used to test hypotheses about the
* drug effect. There are two contrasts. The first compares
* the placebo group with the mean of the two active drug
* groups. This is a test for the efficacy of the drug. The
* second contrast compares the 5 mg dose with the 10 mg dose
* to see if there is a dose-response effect. The REPEATED
* statement is used to test for the within subjects (or
* repeated measures) effects. Of particular interest here
* is whether there is any interaction of repeated measures
* effects with the drug. That is, does the drug work to
* improve some aspects of memory more than others?
*    The results for the univariate ANOVAs for the
* raw data are suppressed by specifying the NOUNI option on
* the MODEL statement. Also, all the multivariate results for
* the repeated measures analysis are suppressed by specifying
* the NOM option on the REPEATED statement. Since no
* transformation is requested, SAS will use a CONTRAST
* transform (which is NOT a contrast code). There is no need
* to perform a test of sphericity in the data. Can you figure
* out why? (PS. It is not because I have suppressed the
* multivariate analysis.)
*
* Note the order of the effects on the REPEATED statement.
* The effect that changes the slowest is first, and the
* effect that changes the fastest is last.
*
* NOTE WELL: If the variables in a CLASS statement have
* formats associated with them, the ordering of the levels
* may NOT be the same as the numeric ordering. Hence, if you
* use CONTRAST or ESTIMATE statements in SAS, always check
* the first page of the GLM output to make certain the levels
* of an ANOVA factor are ordered consistently with the
* levels specified in the CONTRAST or ESTIMATE statement.
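* ------------------------------------------------------------;
proc glm data=alzheimer;
   class drug;
   model r_n_rl--c_o_rg = drug / nouni;
   * Placebo vs the mean of the 5 mg and 10 mg groups;
   contrast 'placebo vs drug' drug -2 1 1;
   * 5 mg vs 10 mg (dose-response);
   contrast 'dose of drug' drug 0 -1 1;
   * Slowest-changing factor first, fastest-changing factor last;
   repeated memmode 2, type 2, frequcy 2 / nom printe printm summary;
run;
quit;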
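
* ------------------------------------------------------------
* A minimal sketch (not part of the original analysis) of one
* way to guard against the level-ordering problem described in
* the NOTE WELL paragraph. PROC FREQ with ORDER=FORMATTED lists
* the drug levels sorted by their formatted values, which is
* the order GLM uses by default for a CLASS variable that has
* a format attached.
*    The second step assumes you would rather sort the levels
* by their unformatted numeric codes (1, 2, 3) and requests
* that with ORDER=INTERNAL on the PROC GLM statement. Any
* CONTRAST or ESTIMATE statements would go after the MODEL
* statement and should list their coefficients in the level
* order that these steps report.
* ------------------------------------------------------------;
proc freq data=alzheimer order=formatted;
   tables drug;
run;

proc glm data=alzheimer order=internal;
   class drug;
   model r_n_rl--c_o_rg = drug / nouni;
   repeated memmode 2, type 2, frequcy 2;
run;
quit;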