Syllabus
General Information
- BIOSTAT 620 Introduction to Health Data Science W25
- SPH I 1655
- Tue & Thu 8:30 - 10:00am
- Lecture notes: https://dmcable.github.io/BIOSTAT620/
- Canvas: https://umich.instructure.com/courses/727331
- Instructor: Dylan Cable [dmcable@umich.edu]
- Graduate student instructor: Yize Hao [yizehao@umich.edu]
Prerequisites
We assume students have taken or are taking a probability and statistics course and have basic programming skills.
Textbooks
Course Description
This course introduces the following:
- UNIX/Linux shell
- Reproducible document preparation with RStudio, knitr, and markdown
- Version control with git and GitHub
- R programming
- Data wrangling with dplyr and data.table
- Data visualization with ggplot2
We also demonstrate how the following concepts are applied in data analysis:
- Probability theory
- Statistical inference and modeling
- High-dimensional data techniques
- Machine learning
We do not cover the theory and details of these methods as they are covered in other courses. As an applied course, the goal is to learn the art and science of applying these methods in practice.
Throughout the course, we use motivating case studies and data analysis problem sets based on challenges similar to those you encounter in scientific research.
Course Structure
For each (approximately weekly) module, we will first teach in lecture the concepts, methods, and skills needed for problem sets. Then, we will start working together on problem sets in in-class labs. It is very important to attend class for this learning experience. On Sundays, the problem sets will be due (see Key Dates and Problem Sets).
Please ensure that you read the chapters listed in the syllabus before each lecture. The lectures are designed with the assumption that you have completed the readings, enabling us to dive deeper into the nuances of data analysis and coding.
Lectures will not be recorded.
Grade Distribution
Component | Weight |
---|---|
10 problem sets | 50% |
Midterm 1 | 15% |
Midterm 2 | 15% |
Final project | 20% |
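To see how the weights above combine, here is a minimal sketch of the arithmetic. It assumes each component is scored on a 0-100 scale and that the course grade is a plain weighted average. The function name `course_grade` and the example scores are hypothetical, not part of any official grading tool.

```python
# Weights copied from the Grade Distribution table.
WEIGHTS = {
    "problem_sets": 0.50,
    "midterm_1": 0.15,
    "midterm_2": 0.15,
    "final_project": 0.20,
}


def course_grade(scores):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "problem_sets": 90,
    "midterm_1": 80,
    "midterm_2": 70,
    "final_project": 85,
}
print(round(course_grade(example), 2))  # 45 + 12 + 10.5 + 17 = 84.5
```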
Problem Sets
Problem sets will be due every week or every other week, depending on difficulty. They will be due at 11:59 PM on the day denoted on the Problem Sets page.
Some problem sets include open-ended questions that will be difficult to answer on your own. We will be working on these together during labs. We also offer office hours where you can get help with unanswered questions. Students are encouraged to discuss problem sets and ideas with each other, but the final submission should be your own work.
Problem sets must be submitted via GitHub. Students are required to have a GitHub account and create a repository for the course. We will be providing further instructions during the first lab.
10% of the total points for the problem sets will be deducted for every late day. Students can have a total of 4 late days without penalty during the entire semester. No need to provide a written excuse. Providing an excuse does not give you more days unless an accommodation is requested and approved by the Office of Student Affairs.
Problem set submissions need to be completely reproducible Quarto documents. If your Quarto file does not render, it will count as a late day; you will be notified and will need to resubmit a Quarto file that does render. You will be charged a further late day for every day it takes you to turn in a Quarto file that renders. You are required to check emails that come through the Canvas system, as this is the only way we will communicate problems with your problem sets.
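The late-day arithmetic can be sketched as follows. This is one possible reading of the policy, not the official grading procedure. It assumes that the 4 free late days form a semester-wide pool used up in submission order, that each penalized day costs 10% of that problem set's total points, and that a score cannot drop below zero. The function name `apply_late_policy` and the example submissions are hypothetical.

```python
FREE_LATE_DAYS = 4      # semester-wide allowance stated in the policy
PENALTY_PER_DAY = 0.10  # assumed: fraction of a problem set's total points per penalized day


def apply_late_policy(submissions):
    """Adjust scores for lateness.

    submissions: list of (points_earned, points_possible, days_late),
    in the order the problem sets were submitted.
    Returns the list of adjusted points.
    """
    remaining = FREE_LATE_DAYS
    adjusted = []
    for earned, possible, days_late in submissions:
        covered = min(days_late, remaining)  # late days absorbed by the free pool
        remaining -= covered
        penalized_days = days_late - covered
        deduction = PENALTY_PER_DAY * possible * penalized_days
        adjusted.append(max(0.0, earned - deduction))  # assumed floor at zero
    return adjusted


# Hypothetical semester: on time, 3 days late, then 2 days late.
print(apply_late_policy([(95, 100, 0), (90, 100, 3), (88, 100, 2)]))
```

In this example the second submission uses three of the free days, the third uses the last free day, and only the one remaining late day is penalized.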
Midterm Policy
Both midterms are closed book, no internet, and in-class. You are expected to complete them in 1 hour. A one-page, double-sided, handwritten formula sheet is allowed.
Questions will be drawn mostly or entirely from the problem sets.
Please make sure you can come to class on the midterm dates provided in the Key Dates table below. If you miss an exam, you will need approval from the Office of Student Affairs to receive a make-up exam. All make-up exams will be completely different from the in-class ones.
Final Project
For your final project, we ask that you turn in a 4-6 page report using data to answer a public-health-related question. You can choose one of the following:
- Based on state-level data on vaccination rates, how effective were vaccines against reported SARS-CoV-2 cases and COVID-19 hospitalizations and deaths?
- What was the excess mortality after Hurricane María in Puerto Rico? Were different age groups affected differently?
The final project will contain a presentation component.
Optionally, you can select a question that aligns with your ongoing research, so that the project directly benefits your work. This requires prior approval from the instructor by February 27th.
Note: You should start working on your project after the first midterm. Do not wait until the last week. Teaching staff will be available during office hours.
AI tools policy
You can use ChatGPT and other AI tools to assist with questions you may have about code etc. Ultimately, any answers and code that you submit on assignments must be prepared organically by you. Do remember you won’t be able to use these tools during the midterms.
Key Dates
Date | Event |
---|---|
Jan 19 | Pset 1 due |
Jan 26 | Pset 2 due |
Feb 2 | Pset 3 due |
Feb 9 | Pset 4 due |
Feb 18 | Midterm 1: covers material from Jan 09-Feb 17 |
Feb 23 | Pset 5 due |
Feb 27 | Project proposals due. Obtain approval if you want to do a personal project instead. |
Mar 02 | Pset 6 due |
Mar 04 | No class: Spring break |
Mar 06 | No class: Spring break |
Mar 16 | Pset 7 due |
Mar 25 | Pset 8 due |
Apr 01 | Midterm 2: covers material from Jan 09-Mar 27 |
Apr 17 | Final project presentations day 1 |
Apr 22 | Final project presentations day 2 |
Apr 22 | Last day of class |
Apr 24 | Final Project due |
Course Goals
After taking this class, students are expected to have a practical understanding of important statistical issues for health data analysis in different areas, along with basic software and programming skills for data cleaning, data sharing, code sharing, exploratory data analysis, results visualization, as well as making the analyses sharable and reproducible.
Competencies
This is the list of competencies covered by Biostatistics 620:
- Understand the roles and principles biostatistics serves in the analysis of public health data.
- Apply preferred methodological alternatives to commonly used statistical methods when assumptions are not met.
- Distinguish among the different measurement scales and data quality, as well as their implications for selection of statistical methods to be used based on these distinctions.
- Apply quantitative techniques commonly used to summarize and display big public health data.
- Apply descriptive and inferential methodologies according to the type of study design or sampling technique for answering a particular public health question.
- Apply basic informatics and computational techniques in the analysis of big health data, and interpret results of statistical analyses.
Classroom Expectations/Etiquette
Students in BIOSTAT 620 are expected to attend lectures in person. Students are expected to bring a computer to class, as some in-class exercises will require the use of a computer. Some in-class exercises may be done in groups, so students are expected to respect and allow for the opinions of all members of their groups, and all members of the entire class. Much of the learning in class will be done through active participation, so students should be prepared to speak during class or be called upon to provide answers to questions.
Academic Integrity
The faculty and staff of the School of Public Health believe that the conduct of a student registered or taking courses in the School should be consistent with that of a professional person. Courtesy, honesty, and respect should be shown by students toward faculty members, guest lecturers, administrative support staff, community partners, and fellow students. Similarly, students should expect faculty to treat them fairly, showing respect for their ideas and opinions and striving to help them achieve maximum benefits from their experience in the School. Student academic misconduct refers to behavior that may include plagiarism, cheating, fabrication, falsification of records or official documents, intentional misuse of equipment or materials (including library materials), and aiding and abetting the perpetration of such acts. Please visit Policies and Procedures for MPH & MHSA Students for the full Policy on Student Academic Conduct Standards and Procedures.
SPH Writing Lab
The SPH Writing Lab is located in 5025 SPH II and offers writing support to all SPH students for course papers, manuscripts, grant proposals, dissertations, personal statements, and all other academic writing tasks. The Lab can also help answer questions on academic integrity. To learn more or make an appointment, please visit the SPH writing lab website.
Student Well-Being
SPH faculty and staff believe it is important to support the physical and emotional well-being of our students. If you have a physical or mental health issue that is affecting your performance or participation in any course, and/or if you need help connecting with University services, please contact the instructor or the Office for Student Affairs. Please visit our Wellness Resources page for information on wellness resources available to you.
Student Accommodations
Students should speak with their instructors before or during the first week of classes regarding any special needs. Students can also visit the Office for Student Affairs for assistance in coordinating communications around accommodations.
Students seeking academic accommodations should register with Services for Students with Disabilities (SSD). SSD arranges reasonable and appropriate academic accommodations for students with disabilities. Please visit the Services for Students with Disabilities website for more information on student accommodations.
Students who expect to miss classes, examinations, or other assignments as a consequence of their religious observance shall be provided with a reasonable alternative opportunity to complete such academic responsibilities. It is the obligation of students to provide faculty with reasonable notice of the dates of religious holidays on which they will be absent. Please visit the Office of the Provost website for the complete University policy.
Students who are feeling ill should not come to class in person. Your grade will not be negatively impacted by not attending class due to illness. If you need flexibility due to being ill, having caregiving responsibilities, childcare responsibilities, etc., please email me as soon as possible so that we can come up with a plan to allow flexibility in assignment due dates.