www.seeingstatistics.com

Multiple Regression

The appropriate statistical procedure is multiple regression, in which more than one independent variable is used to predict the dependent (outcome) variable. Multiple regression estimates coefficients for the intercept and each predictor variable in this equation:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$
and provides tests of whether those coefficients are significantly different from zero.
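
To make the idea of estimating an intercept plus one coefficient per predictor concrete, here is a minimal sketch in R. It uses simulated data rather than the candy bar data; the variable names (x1, x2, y) and the "true" coefficient values are assumptions chosen purely for illustration.

```r
# Simulated data only: the "true" coefficients below are arbitrary
# illustrative values, not estimates from the candy bar data.
set.seed(1)
n  <- 500
x1 <- runif(n, min = 0, max = 10)
x2 <- runif(n, min = 0, max = 10)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n, mean = 0, sd = 1)

# Estimate the intercept and one coefficient per predictor by least squares.
fit <- lm(y ~ x1 + x2)
b   <- coef(fit)
print(b)

# Estimates, standard errors, t values, and p-values: the tests of
# whether each coefficient differs from zero.
print(coef(summary(fit)))
```

With this much data and little noise, the estimates should land close to the values used to generate y (2, 3, and -1.5), which is the sense in which the regression "recovers" the equation.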

Example

We continue with the example of the candy bar data from the simple regression example. We want to know whether Total Fat, Carbohydrates, and Sugars (all measured in grams) predict Calories.

Summary

As a group, total fat, carbohydrates, and sugars (all measured in grams) significantly predict calories in candy bars (F(3, 71) = 1280, p < .0001), accounting for almost all of the variance (R^2 = 0.98). Holding constant (controlling for) differences among the candy bars in carbohydrates and sugars, each additional gram of total fat predicts a significant increase of 10.1 calories (t(71) = 55.6, p < .0001). Similarly, controlling for total fat and sugars, each additional gram of carbohydrate predicts a significant increase of 4.3 calories (t(71) = 25.0, p < .0001). Finally, and perhaps surprisingly, when controlling for total fat and carbohydrates, each additional gram of sugar predicts a significant decrease of about half a calorie (t(71) = -2.5, p = .014). Because sugars are a type of carbohydrate, this suggests that non-sugar carbohydrates contribute more to calories than sugars themselves.
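
The interpretation above can be checked with a little arithmetic. The sketch below plugs the coefficient estimates from the fitted model (as printed in the R output for this example) into the regression equation. The candy bar described in it is hypothetical, not a row of the dataset, and the per-gram comparison assumes that sugar grams are counted within the carbohydrate grams.

```r
# Coefficient estimates copied from the fitted model in this example.
b <- c(intercept = 1.5695, totalFat = 10.1216, carbo = 4.3278, sugars = -0.5290)

# Hypothetical candy bar (illustrative values, not from the dataset):
# 12 g total fat, 30 g carbohydrates, of which 25 g are sugars.
bar <- c(totalFat = 12, carbo = 30, sugars = 25)

# Predicted calories = intercept + sum of (coefficient * grams).
predicted <- unname(b["intercept"] + sum(b[names(bar)] * bar))
print(round(predicted, 1))  # about 239.6 calories

# Assuming sugars are included in the carbohydrate count, a gram of
# non-sugar carbohydrate adds only the carbo coefficient, while a gram
# of sugar adds the carbo coefficient plus the (negative) sugars coefficient.
per_gram_nonsugar_carb <- unname(b["carbo"])
per_gram_sugar         <- unname(b["carbo"] + b["sugars"])
print(c(nonsugar = per_gram_nonsugar_carb, sugar = per_gram_sugar))
```

Under that assumption, a gram of sugar is associated with roughly 3.8 calories versus roughly 4.3 for a gram of non-sugar carbohydrate, which is what the negative sugars coefficient expresses.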

Computer Examples

R

In this example, we assume that the candy dataset has already been loaded as a data frame and, if necessary, attached with attach() so that its columns (calories, totalFat, carbo, sugars) can be referenced by name.

> candyMR <- lm(calories ~ totalFat + carbo + sugars)
> summary(candyMR)

Call:
lm(formula = calories ~ totalFat + carbo + sugars)

Residuals:
     Min       1Q   Median       3Q      Max 
-22.8271  -4.1305   0.3382   4.2962  19.6092 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.5695     4.4972   0.349   0.7281    
totalFat     10.1216     0.1820  55.628   <2e-16 ***
carbo         4.3278     0.1734  24.954   <2e-16 ***
sugars       -0.5290     0.2102  -2.516   0.0141 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 8.526 on 71 degrees of freedom
Multiple R-Squared: 0.9819,	Adjusted R-squared: 0.9811 
F-statistic:  1280 on 3 and 71 DF,  p-value: < 2.2e-16 
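
As a rough guide to reading this output, the following sketch recomputes several of the printed quantities from one another. It uses only the rounded numbers shown above, so small discrepancies from the printed values are expected. The variable names are illustrative, not part of the original analysis.

```r
# Estimates and standard errors copied from the coefficient table above.
est <- c(intercept = 1.5695, totalFat = 10.1216, carbo = 4.3278, sugars = -0.5290)
se  <- c(intercept = 4.4972, totalFat = 0.1820,  carbo = 0.1734, sugars = 0.2102)

df_resid <- 71                 # residual degrees of freedom
k        <- 3                  # number of predictors
n        <- df_resid + k + 1   # implied number of candy bars (75)

# Each t value is the estimate divided by its standard error;
# the two-sided p-value comes from the t distribution with 71 df.
t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df = df_resid)
print(round(cbind(t_val, p_val), 4))

# Adjusted R-squared penalizes R-squared for the number of predictors.
r2     <- 0.9819
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
print(round(adj_r2, 4))

# The overall F statistic can be approximated from R-squared. Because the
# inputs are rounded, this is close to, but not exactly, the printed 1280.
f_from_r2 <- (r2 / k) / ((1 - r2) / df_resid)
p_f       <- pf(1280, df1 = k, df2 = df_resid, lower.tail = FALSE)
print(c(F = f_from_r2, p = p_f))
```

The recomputed t values and the p-value for sugars should agree with the table to about two decimal places, and the adjusted R-squared should match the printed 0.9811.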


© 2002, Gary McClelland