Data Science Intermediate

Regression models
13 – 16 FEBRUARY 2023
pre-recorded lectures available from 11th October 2022

Paola Zaninotto, Giorgio Di Gessa, Andrea Aparicio-Castro and Meredith Martyn

This online course gives you an overview of commonly used regression methods for examining the relationship between an outcome of interest and one or more explanatory variables. You will be introduced to classical linear regression and to generalised linear models (e.g. logistic, Poisson, ordinal/multinomial models), with the choice of model depending on the distribution of the outcome. The course covers the basic concepts, formulation, interpretation, and validation of the models. Real-world data will be used to demonstrate the practical applications of these models.

  • Selecting models according to the outcome of interest
  • Understanding the principles and assumptions behind each model
  • Practising regression methods to determine the relationship between an outcome and one or more explanatory variables
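
As a rough illustration of the first and last points above, the sketch below pairs each outcome type with the model family named in the lecture list, and fits a least-squares line for a single explanatory variable. It is written in plain Python purely for illustration (the practicals themselves use R or Stata); the names `MODEL_BY_OUTCOME`, `suggest_model` and `fit_simple_linear` are hypothetical, and the four data points are made up.

```python
# Hypothetical lookup: outcome type -> model family covered in the course.
MODEL_BY_OUTCOME = {
    "continuous": "linear regression",
    "binary": "logistic regression",
    "count": "Poisson regression",
    "ordered categorical": "ordinal regression",
    "unordered categorical": "multinomial regression",
}


def suggest_model(outcome_type):
    """Return the model family usually considered first for an outcome type."""
    key = outcome_type.strip().lower()
    if key not in MODEL_BY_OUTCOME:
        raise ValueError(f"unknown outcome type: {outcome_type!r}")
    return MODEL_BY_OUTCOME[key]


def fit_simple_linear(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-products of x and y about their means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must vary to estimate a slope")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_simple_linear(x, y)
print(suggest_model("continuous"), intercept, slope)
```

The lookup is only a starting point: in practice the choice also depends on checking each model's assumptions against the data.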

By the end of this course you will be able to:

  • Understand the main regression models and when each is applicable to your research question;
  • Specify and perform regression analyses;
  • Propose, select and evaluate models;
  • Recognize confounding in statistical analysis;
  • Interpret and communicate results.

There will be four sessions, each consisting of a pre-recorded lecture (length varies) and a 1.5-hour live computer practical session in R or Stata.

Lecture 1

Linear Regression for continuous outcomes

Lecture 2

Logistic Regression for binary outcomes

Lecture 3

Poisson Regression for count data

Lecture 4

Ordinal/Multinomial Regression for categorical outcomes



Monday 13th February


Lecture 1 followed by

Practical 1 in either Stata or R

Tuesday 14th February


Lecture 2 followed by

Practical 2 in either Stata or R

Wednesday 15th February


Lecture 3 followed by

Practical 3 in either Stata or R

Thursday 16th February


Lecture 4 followed by

Practical 4 in either Stata or R

An understanding of basic statistical concepts (e.g. descriptive statistics, mean, standard deviation, confidence intervals), quantitative data structures and types of variables.


This is a UKRI-funded project offering rigorous training in longitudinal data science. Please note that this training is NOT available to undergraduate or master's students.