Applied Machine Learning for Health Data

Course Number: CHL5230H
Series: 5200 (Biostatistics)
Format: Lecture
Course Instructor(s): Nikolaos Mitsakakis

Course Description

Data science is a recently emerged field that is increasingly used across industry, academia and government. Health data science is the application of data science methods and principles to large, complex, real-world data and problems in health. Examples include health administrative data, electronic health records and clinical registries, which can also be linked with patient-reported outcomes, genomic data and laboratory data, among others. This course provides an introduction to data science and to how it can be useful for applications in population health and public health outcomes. The focus is on data science analytics methods, such as applied machine learning and statistical learning, using the R statistical software system. Some theoretical background is presented, but the emphasis is on hands-on practical application using large health data.

Course Objectives

By the end of the course students will be able to:

  • understand what we mean by machine learning and data science
  • understand the different types of machine learning based on the way they work and
    the tasks they accomplish
  • perform simple operations and data analysis using R
  • fit simple machine learning models to data, obtain and interpret the results
  • determine the appropriate type of machine learning methodology to be used for an
    applied problem in health services research
  • critically appraise, to some degree, the appropriateness of the use of machine
    learning methodology in published research

Methods of Assessment

Assignments (3 @ 20% each) 60%
Final Exam 30%
Participation 10%

General Requirements

Students should have taken a graduate-level course in statistics, be familiar with the basic concepts of statistics and probability, and have a good understanding of regression. Previous experience writing scripts for data analysis (using R, SAS or similar software), or any other programming experience, is preferred but not required.