Data Science Intermediate

Regression models
pre-recorded lectures available from 11th October 2022

Paola Zaninotto, Giorgio Di Gessa, Andrea Aparicio Castro and Meredith Martyn

This self-paced online course will give you an overview of commonly used regression methods for examining the relationship between an outcome of interest and one or more explanatory variables. You will be introduced to classical linear regression and to generalised linear models (e.g. logistic, Poisson, ordinal/multinomial models), with the choice of model depending on the distribution of the outcome. The course covers the basic concepts, formulation, interpretation, and validation of the models. Real-world data will be used to demonstrate the practical applications of these models. You will find all the required materials on this webpage.

Participants will need a prior understanding of basic statistical concepts (e.g. descriptive statistics, mean, standard deviation, confidence intervals), quantitative data structures and types of variables, and should be familiar with the software they intend to use (either Stata or R).

  • Selection of models depending on the outcome of interest
  • Understanding the principles and assumptions behind each model
  • Practising regression methods to determine the relationship between an outcome and one or more explanatory variables

By the end of this course you will be able to:

  • Understand the basics of different regression models and when they are applicable to your research question;
  • Specify and perform regression analyses;
  • Propose, select and evaluate models;
  • Recognise confounding in statistical analysis;
  • Interpret and communicate results.

There are four pre-recorded lectures, each accompanied by a computer practical task in R and Stata with solutions.

Lecture 1

Linear Regression for continuous outcomes

Lecture 2

Logistic Regression for binary outcomes

Lecture 3

Poisson Regression for count data

Lecture 4

Ordinal/Multinomial Regression for categorical outcomes

An understanding of basic statistical concepts (e.g. descriptive statistics, mean, standard deviation, confidence intervals), quantitative data structures and types of variables.