linear?
2. Calculate $r^2$, the square of the correlation coefficient
3. Examine residual plot

$r^2$: The Variation Accounted For
- The square of the correlation coefficient $r$ gives important information about the usefulness of the least squares line.

$r^2$: important information for evaluating the usefulness of the least squares line
- The square of the correlation coefficient, $r^2$, is the fraction of the variation in y that is explained by the least squares regression of y on x.
- $-1 \le r \le 1$ implies $0 \le r^2 \le 1$.
- Put another way, $r^2$ is the fraction of the variation in y that is explained by the variation in x.

Example: car weight, fuel consumption
- x = car weight, y = fuel consumption
- $r^2 = (0.9766)^2 \approx 0.95$
- About 95% of the variation in fuel consumption (y) is explained by the linear relationship between car weight (x) and fuel consumption (y).
- What else affects fuel consumption?
  - Driver, size of engine, tires, road, etc.

Example: SAT scores
[Figure: scatterplot "SAT Mean per State vs % Seniors Taking Test"; x-axis: % of Seniors Taking Test (0 to 80); y-axis: Mean SAT Score (820 to 1120); fitted line $y = -2.2375x + 1023.4$, $R^2 = 0.7542$]

SAT scores: calculations
$\bar{x} = 33.882$, $s_x = 24.103$, $\bar{y} = 947.549$, $s_y = 62.1$, $r = -0.868$
slope: $b = r \frac{s_y}{s_x} = -0.868 \cdot \frac{62.1}{24.103} = -2.23635$
intercept: $\bar{y} - b\bar{x} = 947.549 - (-2.236)(33.882) = 1023.309$
least squares prediction line: $\hat{y} = 1023.309 - 2.236x$

SAT scores: result
- $r^2 = (-0.868)^2 = 0.7534$
- If 57% of NC seniors take the SAT, the predicted mean score is $\hat{y} = 1023.309 - 2.23635(57) = 895.84$.

Avoid GIGO! Evaluating the least squares line
1. Create scatterplot. Approximately linear?
2. Calculate $r^2$, the square of the correlation coefficient
3. Examine residual plot

Residuals
- residual = observed y - predicted y = $y - \hat{y}$
- Properties of residuals
  1. The residuals always sum to 0 (therefore the mean of the residuals is 0)
  2. The least squares line always goes through the point $(\bar{x}, \bar{y})$

Graphically
[Figure: the residual $e_i = y_i - \hat{y}_i$ shown as the vertical gap between the observed point $(x_i, y_i)$ and the fitted value $\hat{y}_i$ on the least squares line]

Residual Plot
- Residuals help us determine if fitting a least squares line to the data makes sense
- When a least squares line is appropriate, it should model the underlying relations