Cambridge University Press
052183550X - Statistics Explained - An Introductory Guide for Life Scientists - by Steve McKillup
Frontmatter/Prelims


Statistics Explained

Statistics Explained is a reader-friendly introduction to experimental design and statistics for undergraduate students in the life sciences, particularly those who do not have a strong mathematical background. Hypothesis testing and experimental design are discussed first. Statistical tests are then explained using pictorial examples and a minimum of formulae. This class-tested approach, along with a well-structured set of diagnostic tables, will give students the confidence to choose an appropriate test with which to analyse their own data sets. Presented in a lively and straightforward manner, Statistics Explained will give readers the depth and background necessary to proceed to more advanced texts and applications. It will therefore be essential reading for all bioscience undergraduates, and will serve as a useful refresher course for more advanced students.

Steve McKillup is an Associate Professor of Biology in the School of Biological and Environmental Sciences at Central Queensland University, Rockhampton.


Statistics Explained

An Introductory Guide for Life Scientists

STEVE McKILLUP


CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK

www.cambridge.org
Information on this title: www.cambridge.org/9780521835503

© S. McKillup 2005

This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without
the written permission of Cambridge University Press.

First published 2005

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

ISBN-13 978-0-521-83550-3 hardback

ISBN-10 0-521-83550-X hardback

ISBN-13 978-0-521-54316-9 paperback

ISBN-10 0-521-54316-9 paperback

Cambridge University Press has no responsibility for
the persistence or accuracy of URLs for external or
third-party internet websites referred to in this publication,
and does not guarantee that any content on such
websites is, or will remain, accurate or appropriate.


Contents

Preface   page xi

 1  Introduction   1
    1.1  Why do life scientists need to know about experimental design and statistics?   1
    1.2  What is this book designed to do?   5
 2  ‘Doing science’ – hypotheses, experiments, and disproof   7
    2.1  Introduction   7
    2.2  Basic scientific method   7
    2.3  Making a decision about an hypothesis   10
    2.4  Why can’t an hypothesis or theory ever be proven?   11
    2.5  ‘Negative’ outcomes   11
    2.6  Null and alternate hypotheses   12
    2.7  Conclusion   13
 3  Collecting and displaying data   14
    3.1  Introduction   14
    3.2  Variables, experimental units, and types of data   14
    3.3  Displaying data   16
    3.4  Displaying ordinal or nominal scale data   20
    3.5  Bivariate data   23
    3.6  Multivariate data   25
    3.7  Summary and conclusion   26
 4  Introductory concepts of experimental design   27
    4.1  Introduction   27
    4.2  Sampling – mensurative experiments   28
    4.3  Manipulative experiments   32
    4.4  Sometimes you can only do an unreplicated experiment   39
    4.5  Realism   40
    4.6  A bit of common sense   41
    4.7  Designing a ‘good’ experiment   41
    4.8  Conclusion   42
 5  Probability helps you make a decision about your results   44
    5.1  Introduction   44
    5.2  Statistical tests and significance levels   45
    5.3  What has this got to do with making a decision or statistical testing?   49
    5.4  Making the wrong decision   49
    5.5  Other probability levels   50
    5.6  How are probability values reported?   51
    5.7  All statistical tests do the same basic thing   52
    5.8  A very simple example – the chi-square test for goodness of fit   52
    5.9  What if you get a statistic with a probability of exactly 0.05?   55
    5.10  Statistical significance and biological significance   55
    5.11  Summary and conclusion   55
 6  Working from samples – data, populations, and statistics   57
    6.1  Using a sample to infer the characteristics of a population   57
    6.2  Statistical tests   57
    6.3  The normal distribution   57
    6.4  Samples and populations   63
    6.5  Your sample mean may not be an accurate estimate of the population mean   65
    6.6  What do you do when you only have data from one sample?   67
    6.7  Why are the statistics that describe the normal distribution so important?   71
    6.8  Distributions that are not normal   72
    6.9  Other distributions   73
    6.10  Other statistics that describe a distribution   74
    6.11  Conclusion   75
 7  Normal distributions – tests for comparing the means of one and two samples   77
    7.1  Introduction   77
    7.2  The 95% confidence interval and 95% confidence limits   77
    7.3  Using the Z statistic to compare a sample mean and population mean when population statistics are known   78
    7.4  Comparing a sample mean with an expected value   81
    7.5  Comparing the means of two related samples   88
    7.6  Comparing the means of two independent samples   90
    7.7  Are your data appropriate for a t test?   92
    7.8  Distinguishing between data that should be analysed by a paired sample test or a test for two independent samples   94
    7.9  Conclusion   95
 8  Type 1 and Type 2 errors, power, and sample size   96
    8.1  Introduction   96
    8.2  Type 1 error   96
    8.3  Type 2 error   97
    8.4  The power of a test   100
    8.5  What sample size do you need to ensure the risk of Type 2 error is not too high?   102
    8.6  Type 1 error, Type 2 error, and the concept of biological risk   104
    8.7  Conclusion   104
 9  Single factor analysis of variance   105
    9.1  Introduction   105
    9.2  Single factor analysis of variance   106
    9.3  An arithmetic/pictorial example   112
    9.4  Unequal sample sizes (unbalanced designs)   117
    9.5  An ANOVA does not tell you which particular treatments appear to be from different populations   117
    9.6  Fixed or random effects   118
10  Multiple comparisons after ANOVA   119
    10.1  Introduction   119
    10.2  Multiple comparison tests after a Model I ANOVA   119
    10.3  An a-posteriori Tukey comparison following a significant result for a single factor Model I ANOVA   122
    10.4  Other a-posteriori multiple comparison tests   123
    10.5  Planned comparisons   124
11  Two factor analysis of variance   127
    11.1  Introduction   127
    11.2  What does a two factor ANOVA do?   129
    11.3  How does a two factor ANOVA analyse these data?   131
    11.4  How does a two factor ANOVA separate out the effects of each factor and interaction?   136
    11.5  An example of a two factor analysis of variance   139
    11.6  Some essential cautions and important complications   140
    11.7  Unbalanced designs   149
    11.8  More complex designs   149
12  Important assumptions of analysis of variance: transformations and a test for equality of variances   151
    12.1  Introduction   151
    12.2  Homogeneity of variances   151
    12.3  Normally distributed data   152
    12.4  Independence   155
    12.5  Transformations   156
    12.6  Are transformations legitimate?   158
    12.7  Tests for heteroscedasticity   159
13  Two factor analysis of variance without replication, and nested analysis of variance   162
    13.1  Introduction   162
    13.2  Two factor ANOVA without replication   162
    13.3  A-posteriori comparison of means after a two factor ANOVA without replication   166
    13.4  Randomised blocks   167
    13.5  Nested ANOVA as a special case of a one factor ANOVA   168
    13.6  A pictorial explanation of a nested ANOVA   170
    13.7  A final comment on ANOVA – this book is only an introduction   175
14  Relationships between variables: linear correlation and linear regression   176
    14.1  Introduction   176
    14.2  Correlation contrasted with regression   177
    14.3  Linear correlation   177
    14.4  Calculation of the Pearson r statistic   178
    14.5  Is the value of r statistically significant?   184
    14.6  Assumptions of linear correlation   184
    14.7  Summary and conclusion   184
15  Simple linear regression   186
    15.1  Introduction   186
    15.2  Linear regression   186
    15.3  Calculation of the slope of the regression line   188
    15.4  Calculation of the intercept with the Y axis   192
    15.5  Testing the significance of the slope and the intercept of the regression line   193
    15.6  An example – mites that live in your hair follicles   199
    15.7  Predicting a value of Y from a value of X   201
    15.8  Predicting a value of X from a value of Y   201
    15.9  The danger of extrapolating beyond the range of data available   202
    15.10  Assumptions of linear regression analysis   202
    15.11  Further topics in regression   204
16  Non-parametric statistics   205
    16.1  Introduction   205
    16.2  The danger of assuming normality when a population is grossly non-normal   205
    16.3  The value of making a preliminary inspection of the data   207
17  Non-parametric tests for nominal scale data   208
    17.1  Introduction   208
    17.2  Comparing observed and expected frequencies – the chi-square test for goodness of fit   209
    17.3  Comparing proportions among two or more independent samples   212
    17.4  Bias when there is one degree of freedom   215
    17.5  Three-dimensional contingency tables   219
    17.6  Inappropriate use of tests for goodness of fit and heterogeneity   220
    17.7  Recommended tests for categorical data   221
    17.8  Comparing proportions among two or more related samples of nominal scale data   222
18  Non-parametric tests for ratio, interval, or ordinal scale data   224
    18.1  Introduction   224
    18.2  A non-parametric comparison between one sample and an expected distribution   225
    18.3  Non-parametric comparisons between two independent samples   227
    18.4  Non-parametric comparisons among more than two independent samples   232
    18.5  Non-parametric comparisons of two related samples   236
    18.6  Non-parametric comparisons among three or more related samples   238
    18.7  Analysing ratio, interval, or ordinal data that show gross differences in variance among treatments and cannot be satisfactorily transformed   241
    18.8  Non-parametric correlation analysis   243
    18.9  Other non-parametric tests   245
19  Choosing a test   246
    19.1  Introduction   246
20  Doing science responsibly and ethically   255
    20.1  Introduction   255
    20.2  Dealing fairly with other people’s work   255
    20.3  Doing the experiment   257
    20.4  Evaluating and reporting results   258
    20.5  Quality control in science   260

References   261
Index   263