Introductory Statistics with R, 2E, Peter
1 Basics
1.1 First steps . 1.1.1 An overgrown calculator . 1.1.2 Assignments . 1.1.3 Vectorized arithmetic . 1.1.4 Standard procedures . 1.1.5 Graphics .
1.2 R language essentials . 1.2.1 Expressions and objects . 1.2.2 Functions and arguments . 1.2.3 Vectors . 1.2.4 Quoting and escape sequences . 1.2.5 Missing values . 1.2.6 Functions that create vectors . 1.2.7 Matrices and arrays . 1.2.8 Factors . 1.2.9 Lists . 1.2.10 Data frames . 1.2.11 Indexing . 1.2.12 Conditional selection . 1.2.13 Indexing of data frames . 1.2.14 Grouped data and data frames .xii 1.2.15 Implicit loops . 1.2.16 Sorting . 1.3 Exercises .26
2 The R environment
2.1 Session management . 2.1.1 The workspace . 2.1.2 Textual output . 2.1.3 Scripting . 2.1.4 Getting help . 2.1.5 Packages . 2.1.6 Built-in data . 2.1.7 attach and detach . 2.1.8 subset, transform, and within .
2.2 The graphics subsystem . 2.2.1 Plot layout . 2.2.2 Building a plot from pieces . 2.2.3 Using par . 2.2.4 Combining plots .
2.3 R programming . 2.3.1 Flow control . 2.3.2 Classes and generic functions .
2.4 Data entry . 2.4.1 Reading from a text file . 2.4.2 Further details on read.table . 2.4.3 The data editor . 2.4.4 Interfacing to other programs .
2.5 Exercises .31
3 Probability and distributions
3.1 Random sampling . 3.2 Probability calculations and combinatorics . 3.3 Discrete distributions . 3.4 Continuous distributions .
3.5 The built-in distributions in R . 3.5.1 Densities . 3.5.2 Cumulative distribution functions . 3.5.3 Quantiles . 3.5.4 Random numbers .
3.6 Exercises .55
4 Descriptive statistics and graphics
4.1 Summary statistics for a single group .
4.2 Graphical display of distributions . 4.2.1 Histograms .67 4.2.2 Empirical cumulative distribution . 4.2.3 Q–Q plots . 4.2.4 Boxplots .
4.3 Summary statistics by groups . 4.4 Graphics for grouped data . 4.4.1 Histograms . 4.4.2 Parallel boxplots . 4.4.3 Stripcharts .
4.5 Tables . 4.5.1 Generating tables . 4.5.2 Marginal tables and relative frequency .
4.6 Graphical display of tables . 4.6.1 Barplots . 4.6.2 Dotcharts . 4.6.3 Piecharts . Exercises .73
5 One- and two-sample tests 5.1 One-sample t test . 5.2 Wilcoxon signed-rank test . 5.3 Two-sample t test . 5.4 Comparison of variances . 5.5 Two-sample Wilcoxon test . 5.6 The paired t test . 5.7 The matched-pairs Wilcoxon test . 5.8 Exercises .95
6 Regression and correlation 6.1 Simple linear regression . 6.2 Residuals and fitted values . 6.3 Prediction and confidence bands .
6.4 Correlation . 6.4.1 Pearson correlation . 6.4.2 Spearman’s ρ . 6.4.3 Kendall’s τ . 6.5 Exercises .109
7 Analysis of variance and the Kruskal–Wallis test 7.1 One-way analysis of variance . 7.1.1 Pairwise comparisons and multiple testing . 7.1.2 Relaxing the variance assumption . 7.1.3 Graphical presentation . 7.1.4 Bartlett’s test .
7.2 Kruskal–Wallis test .
7.3 Two-way analysis of variance .127 7.3.1 Graphics for repeated measurements . 7.4 The Friedman test . 7.5 The ANOVA table in regression analysis . 7.6 Exercises .140
8 Tabular data 8.1 Single proportions . 8.2 Two independent proportions . 8.3 k proportions, test for trend . 8.4 r × c tables . 8.5 Exercises .145
9 Power and the computation of sample size 9.1 The principles of power calculations . 9.1.1 Power of one-sample and paired t tests . 9.1.2 Power of two-sample t test . 9.1.3 Approximate methods . 9.1.4 Power of comparisons of proportions .
9.2 Two-sample problems . 9.3 One-sample problems and paired tests . 9.4 Comparison of proportions . 9.5 Exercises .155
10 Advanced data handling
10.1 Recoding variables . 10.1.1 The cut function . 10.1.2 Manipulating factor levels . 10.1.3 Working with dates . 10.1.4 Recoding multiple variables .
10.2 Conditional calculations .
10.3 Combining and restructuring data frames . 10.3.1 Appending frames . 10.3.2 Merging data frames . 10.3.3 Reshaping data frames .
10.4 Per-group and per-case procedures . 10.5 Time splitting . 10.6 Exercises .163
11 Multiple regression
11.1 Plotting multivariate data . 11.2 Model specification and output . 11.3 Model search . 11.4 Exercises .185
12 Linear models
12.1 Polynomial regression . 12.2 Regression through the origin . 12.3 Design matrices and dummy variables . 12.4 Linearity over groups . 12.5 Interactions . 12.6 Two-way ANOVA with replication .
12.7 Analysis of covariance . 12.7.1 Graphical description . 12.7.2 Comparison of regression lines . 12.8 Diagnostics . 12.9 Exercises .195
13 Logistic regression 13.1 Generalized linear models .
13.2 Logistic regression on tabular data . 13.2.1 The analysis of deviance table . 13.2.2 Connection to test for trend .
13.3 Likelihood profiling . 13.4 Presentation as odds-ratio estimates . 13.5 Logistic regression using raw data . 13.6 Prediction . 13.7 Model checking . 13.8 Exercises .227
14 Survival analysis
14.1 Essential concepts . 14.2 Survival objects . 14.3 Kaplan–Meier estimates . 14.4 The log-rank test . 14.5 The Cox proportional hazards model . 14.6 Exercises .249
15 Rates and Poisson regression
15.1 Basic ideas . 15.1.1 The Poisson distribution . 15.1.2 Survival analysis with constant hazard . 15.2 Fitting Poisson models . 15.3 Computing rates . 15.4 Models with piecewise constant intensities . 15.5 Exercises .259
16 Nonlinear curve fitting
16.1 Basic usage . 16.2 Finding starting values .275 16.3 Self-starting models . 16.4 Profiling . 16.5 Finer control of the fitting algorithm . 16.6 Exercises .
A Obtaining and installing R and the ISwR package289 B Data sets in the ISwR package293 C Compendium325 D Answers to exercises337 Bibliography355 Index3571