Lecture Notes for AU DATA 413-613 Data Science


Richard Ressler [<rressler@american.edu>]


May 17, 2024


data science, DATA-413, DATA-613, R Programming


These lecture notes are a work in progress for the American University course DATA-413-613 Data Science. The course builds on a student’s previous courses in statistics and programming with R to examine topics across the data science life cycle in more depth.

The notes are based on the R statistical programming language (R Core Team 2018) and the {tidyverse} package (Wickham et al. 2019). The notes also use many other R packages, as well as Python and several Python libraries.

  • See Appendix D — Environment and R Packages for a more detailed description of the environment and the packages with their versions.
  • This work was produced using Quarto (Posit 2022). You can read the notes in “dark mode” by using the toggle under the title in the left margin.

Please send any corrections or recommendations to rressler@american.edu.


This work is Copyright Richard Ressler 2024.

This work is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).


Portions of these notes incorporate and build on works by others as noted in the appropriate sections and the references.

Sections of these notes are based on excellent work by Professor David Gerard of American University. His work is licensed under Creative Commons CC BY-NC 4.0 and visible at his DATA 413/613 Teaching Website.

However, any errors are my responsibility.