Learn R for Applied Statistics, Eric
====================================

Table of Contents
-----------------

- About the Author
- About the Technical Reviewer
- Acknowledgments
- Introduction

Chapter 1: Introduction
-----------------------

- What Is R?
- High-Level and Low-Level Languages
- What Is Statistics?
- What Is Data Science?
- What Is Data Mining?
  - Business Understanding
  - Data Understanding
  - Data Preparation
  - Modeling
  - Evaluation
  - Deployment
- What Is Text Mining?
  - Data Acquisition
  - Text Preprocessing
  - Modeling
  - Evaluation/Validation
  - Applications
- Natural Language Processing
- Three Types of Analytics
  - Descriptive Analytics
  - Predictive Analytics
  - Prescriptive Analytics
- Big Data
  - Volume
  - Velocity
  - Variety
- Why R?
- Conclusion
- References

Chapter 2: Getting Started
--------------------------

- What Is R?
- The Integrated Development Environment
- RStudio: The IDE for R
- Installation of R and RStudio
- Writing Scripts in R and RStudio
- Conclusion
- References

Chapter 3: Basic Syntax
-----------------------

- Writing in R Console
- Using the Code Editor
- Adding Comments to the Code
- Variables
- Data Types
- Vectors
- Lists
- Matrix
- Data Frame
- Logical Statements
- Loops
  - For Loop
  - While Loop
  - Break and Next Keywords
  - Repeat Loop
- Functions
- Create Your Own Calculator
- Conclusion
- References

Chapter 4: Descriptive Statistics
---------------------------------

- What Is Descriptive Statistics?
- Reading Data Files
  - Reading a CSV File
  - Writing a CSV File
  - Reading an Excel File
  - Writing an Excel File
  - Reading an SPSS File
  - Writing an SPSS File
  - Reading a JSON File
- Basic Data Processing
  - Selecting Data
  - Sorting
  - Filtering
  - Removing Missing Values
  - Removing Duplicates
- Some Basic Statistics Terms
  - Types of Data
  - Mode, Median, Mean
  - Interquartile Range, Variance, Standard Deviation
  - Normal Distribution
  - Binomial Distribution
- Conclusion
- References

Chapter 5: Data Visualizations
------------------------------

- What Are Data Visualizations?
- Bar Chart and Histogram
- Line Chart and Pie Chart
- Scatterplot and Boxplot
- Scatterplot Matrix
- Social Network Analysis Graph Basics
- Using ggplot2
  - What Is the Grammar of Graphics?
  - The Setup for ggplot2
  - Aesthetic Mapping in ggplot2
  - Geometry in ggplot2
  - Labels in ggplot2
  - Themes in ggplot2
- ggplot2 Common Charts
  - Bar Chart
  - Histogram
  - Density Plot
  - Scatterplot
  - Line chart
  - Boxplot
- Interactive Charts with Plotly and ggplot2
- Conclusion
- References

Chapter 6: Inferential Statistics and Regressions
-------------------------------------------------

- What Are Inferential Statistics and Regressions?
- apply(), lapply(), sapply()
- Sampling
  - Simple Random Sampling
  - Stratified Sampling
  - Cluster Sampling
- Correlations
- Covariance
- Hypothesis Testing and P-Value
- T-Test
  - Types of T-Tests
  - Assumptions of T-Tests
  - Type I and Type II Errors
  - One-Sample T-Test
  - Two-Sample Independent T-Test
  - Two-Sample Dependent T-Test
- Chi-Square Test
  - Goodness of Fit Test
  - Contingency Test
- ANOVA
  - Grand Mean
  - Hypothesis
  - Assumptions
  - Between Group Variability
  - Within Group Variability
  - One-Way ANOVA
  - Two-Way ANOVA
  - MANOVA
- Nonparametric Test
  - Wilcoxon Signed Rank Test
  - Wilcoxon-Mann-Whitney Test
  - Kruskal-Wallis Test
- Linear Regressions
- Multiple Linear Regressions
- Conclusion
- References

Index