Introduction

2025-01-09

General Information

  • BIOSTAT 620 Introduction to Health Data Science
  • Instructor: Dylan Cable
  • GSI: Yize Hao
  • For each module, we will have lectures, followed by a problem set.
  • Time permitting, we will work on problem sets together in class, as a lab.
  • Bring your own computer to class, and come with the problem set already set up.
  • Class attendance is highly recommended.

Course Description

Lecture notes: https://dmcable.github.io/BIOSTAT620/

Please read the syllabus!

Important details

  • Complete readings before class.
  • Midterms are in person. There are no makeups.
  • Make sure you read messages sent via Canvas.
  • You can select your own final project, but it needs approval.
  • You should start your final project by February 27.
  • Help us pick office hours: we will send out a survey!

What’s coming

  • UNIX/Linux shell
  • Reproducible document preparation
  • Version control with git and GitHub
  • R programming
  • Data wrangling with dplyr and data.table
  • Data visualization with ggplot2
  • Probability theory, inference and modeling
  • High-dimensional data techniques
  • Machine learning

Let’s get started

  • Install R.
  • Install RStudio.
  • Make sure you have access to a terminal.
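
Once you have a terminal open, a quick way to confirm the setup is to ask the shell whether each tool is on your PATH. The following is a minimal sketch, not an official course script: the helper name `check_tool` is hypothetical, and it assumes a POSIX-style shell (for example Terminal on macOS, or Git Bash or WSL on Windows). It only reports whether a command can be found; it does not check versions.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# R is needed from the start; git comes up in the version control module.
check_tool R
check_tool git
```

If a tool is reported as not found, it is either not installed yet or its install location is not on your PATH. RStudio is a desktop application, so launching it directly is the simplest check for that one.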