BIOSTAT 620: Introduction to Health Data Science
Dylan Cable
Canvas for announcements and grading:
Official class website, containing syllabus, reading materials, slides, labs, and assignments:
This course is an introduction to the world of data science with a focus on applications in the health sciences.
The course will teach data science skills that are easily transferable, with examples done in R.
In this class, we will be using R and RStudio.
This is not a formal statistics class. You will not be expected to know or use:
Data does not exist in a vacuum. In order to gain new insights from data, you must start with a baseline understanding of the subject. “Domain knowledge” or “subject matter expertise” is critical, but it is not the purpose of this class.
This course will focus on applications in Public Health, but the skills you learn will be widely transferable.
Before computers had graphics and mice, there were only text-based interfaces, called command lines, that let you interact with the directories and files on the computer.
The modern “Desktop” is actually just a directory on your computer!
macOS: /Users/<username>/Desktop
Windows: C:\Users\<username>\Desktop
The route from the root directory to any specific file or directory is called the “path”.
Whenever you run a program on your computer, you are running it in a specific location (directory). If you want to access another file on your computer, you’ll need to know the path to that file. Paths can be either relative or absolute.
How to get from my Documents directory to my Desktop directory via:
Absolute path: /Users/dmcable/Desktop
Relative path: ../Desktop/
Special symbols:
.   Current directory
..  Parent directory (one step up the hierarchy)
~   Home directory
We won’t have to use the command line too much in this class, but understanding file paths will be very important!
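The same ideas can be tried from within R. The sketch below is only an illustration: it builds a temporary directory containing stand-in "Desktop" and "Documents" folders (hypothetical names chosen to mirror the example above, not your real folders), moves into "Documents", and checks that an absolute path and a relative path reach the same place. It uses only base R functions.

```r
# Build a throwaway directory tree so the example does not touch real files.
# "Desktop" and "Documents" here are illustrative stand-ins.
root <- tempfile("paths_demo_")
dir.create(file.path(root, "Desktop"), recursive = TRUE)
dir.create(file.path(root, "Documents"))
root <- normalizePath(root, winslash = "/")

# Pretend we are working inside Documents; setwd() returns the previous
# working directory so it can be restored at the end.
old_wd <- setwd(file.path(root, "Documents"))

# Absolute path: spelled out in full from the root of the file system.
abs_desktop <- file.path(root, "Desktop")

# Relative path: interpreted starting from the current working directory.
rel_desktop <- "../Desktop"

# Both should resolve to the same directory.
same <- normalizePath(abs_desktop, winslash = "/") ==
  normalizePath(rel_desktop, winslash = "/")

# "." is the current directory, ".." is its parent, "~" is the home directory.
here   <- normalizePath(".", winslash = "/")
parent <- normalizePath("..", winslash = "/")
home   <- path.expand("~")

# Restore the original working directory.
setwd(old_wd)
```

Printing `same`, `here`, `parent`, and `home` after running the block shows how each symbol was resolved on your own machine. The relative path only works because of where the working directory was set, which is why knowing your current location matters.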
R is a language and environment for statistical computing and graphics: https://r-project.org
Created by statisticians for statisticians.
Over 16,000 packages added to CRAN
RStudio is an integrated development environment (IDE) for R: https://www.rstudio.com/products/rstudio/
https://forms.gle/rbPyera5dDvxwzPW7
We will run our first lab (Lab 1).
The lab exercises can be found on the Schedule page of the course website: