[Article content summary]
One-year change in IQ score in the treatment group vs. one-year change in IQ score in the control group.

                        "Academic bloomers" (n=18)    Controls (n=72)
Change in IQ score:     … (…)                         … (…)
Results:                … points                      … points

Difference = 4 points

The standard deviation of change scores was 2.0 in both groups. This affects statistical significance…

What does a 4-point difference mean?

- Before we perform any formal statistical analysis on these data, we already have a lot of information.
- Look at the basic numbers first; THEN consider statistical significance as a secondary guide.

Is the association statistically significant?

- This 4-point difference could reflect a true effect or it could be a fluke.
- The question: is a 4-point difference bigger or smaller than the expected sampling variability?

Hypothesis testing

Step 1: Assume the null hypothesis.

Null hypothesis: There is no difference between "academic bloomers" and normal students (= the difference is 0).

Step 2: Predict the sampling variability assuming the null hypothesis is true.

- These predictions can be made by mathematical theory or by computer simulation.

Step 2, math theory:

$$\bar{X}_{\text{gifted}} - \bar{X}_{\text{control}} \sim T_{88}\left(0,\ \text{s.e.} = \sqrt{\frac{4}{18} + \frac{4}{72}}\right)$$

Step 2, computer simulation:

- In computer simulation, you simulate taking repeated samples of the same size from the same population and observe the sampling variability.
- I used computer simulation to take 1000 samples of 18 treated and 72 controls.

Computer simulation results: standard error is about 0.52.

Step 3: Empirical data

Observed difference in our experiment = 4.

Step 4: P-value

A t-curve with 88 df has slightly wider cutoffs for 95% area (t = 1.99) than a normal curve (Z = 1.96).

$$t_{88} = \frac{4}{0.52} \approx 7.7, \qquad p < .0001$$

If we ran this study 1000 times, we wouldn't expect to get 1 result as big as a difference of 4 (under the null hypothesis).
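The simulation step and the comparison with the observed difference can be sketched in a few lines of Python. This is only an illustration, not the lecturer's actual simulation: it assumes change scores are normally distributed with a standard deviation of 2.0 in both groups (a value chosen to be consistent with the standard error of about .52 used in this example), and the names `simulate_null_differences` and `theoretical_standard_error` are made up for the sketch.

```python
import math
import random
import statistics

N_TREATED, N_CONTROL = 18, 72   # group sizes from the example
OBSERVED_DIFF = 4.0             # observed difference in mean IQ change
ASSUMED_SD = 2.0                # assumption: SD of change scores in both groups
ASSUMED_MEAN = 0.0              # assumption: arbitrary, cancels out in the difference


def simulate_null_differences(n_sims=1000, seed=0):
    """Differences in group means when both groups come from one population."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        treated = [rng.gauss(ASSUMED_MEAN, ASSUMED_SD) for _ in range(N_TREATED)]
        control = [rng.gauss(ASSUMED_MEAN, ASSUMED_SD) for _ in range(N_CONTROL)]
        diffs.append(statistics.fmean(treated) - statistics.fmean(control))
    return diffs


def theoretical_standard_error(sd=ASSUMED_SD):
    """Standard error of the difference in means from math theory."""
    return math.sqrt(sd**2 / N_TREATED + sd**2 / N_CONTROL)


diffs = simulate_null_differences()
empirical_se = statistics.stdev(diffs)      # spread of the simulated differences
theory_se = theoretical_standard_error()
# Two-sided count: simulated differences at least as far from 0 as the observed one.
extreme = sum(abs(d) >= OBSERVED_DIFF for d in diffs)

print(f"empirical standard error:   {empirical_se:.2f}")
print(f"theoretical standard error: {theory_se:.2f}")
print(f"t statistic:                {OBSERVED_DIFF / theory_se:.1f}")
print(f"simulated |difference| >= 4: {extreme} of {len(diffs)}")
```

The empirical and theoretical standard errors should land close to each other, which is the point of the slide: either route predicts how much the difference would bounce around if the null hypothesis were true, and the observed 4 points is then judged against that spread.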
Step 5: Reject the null!

- Conclusion: I.Q. scores can bias expectancies in the teachers' minds and cause them to unintentionally treat "bright" students differently from those seen as less bright.

Confidence interval (more information!!)

95% CI for the difference: 4 ± 1.99(.52) = (3.0, 5.0)

A t-curve with 88 df has slightly wider cutoffs for 95% area (t = 1.99) than a normal curve (Z = 1.96).

What if our standard deviation had been higher?

- The standard deviation for change scores in treatment and control were each 2.0. What if change scores had been much more variable, say a standard deviation of … (for both)?

[Figure: simulated sampling distributions for the two cases, each panel labeled with the standard deviation in change scores and the resulting standard error]

With the larger standard deviation: LESS STATISTICAL POWER!

If we ran this study 1000 times, we would expect to get a difference of +4 or more, or of −4 or less, 12% of the time. P-value = .12

Don't forget: The paired t-test

- Did the control group in the previous experiment improve at all during the year?
- Do not apply a two-sample t-test to answer this question!
- After − Before yields a single sample of differences…
- A "within-group" rather than "between-group" comparison…

Continuous outcome (means)

Table columns: Outcome variable | Are the observations independent or correlated? (independent, correlated) | Alternatives if the normality assumption is violated (and small sample size)

Outcome variable: Continuous (e.g.,
pain scale, cognitive function)

Independent observations:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate regression technique used when the outcome is continuous; gives slopes

Correlated observations:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups; gives rate of change over time

Alternatives if the normality assumption is violated (nonparametric statistics):
- Wilcoxon signed-rank test: nonparametric alternative to the paired t-test
- Wilcoxon rank-sum test (= Mann-Whitney U test): nonparametric alternative to the t-test
- Kruskal-Wallis test: nonparametric alternative to ANOVA
- Spearman rank correlation coefficient: nonparametric alternative to Pearson's correlation coefficient

Data summary

                   n     Sample mean    Sample standard deviation
Group 1: Change    72    + …            …

Did the control group in the previous experiment improve at all during the year?

$$t_{71} = \frac{\text{mean change}}{\text{SD of change} / \sqrt{72}}, \qquad p < .0001$$

Normality assumption of the t-test

- If the distribution of the trait is normal, it is fine to use a t-test.
- But if the underlying distribution is not normal and the sample size is small, the Central Limit Theorem takes some time to kick in, and you cannot use a t-test. Rule of thumb for a large enough sample: n > 30 per group if not too skewed; n > 100 if the distribution is really skewed.
- Note: the t-test is very robust against the normality assumption!

Alternative tests when normality is violated: nonparametric tests

Continuous outcome (means)

Table columns: Outcome variable | Are the observations independent or correlated? (independent, correlated) | Alternatives if the normality assumption is violated (and small sample size)

Outcome variable: Continuous (e.g.,
pain scale, cognitive function)

Independent observations:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (