Data Science Intermediate

Longitudinal Data Preparation and Visualisation for Epidemiological and Social Research

Giorgio Di Gessa

This self-paced online course is for anyone who needs to prepare longitudinal data for analysis. It covers the main procedures needed to turn raw longitudinal data into cleaned data that is ready for analysis.

The course consists of two sessions: one on data preparation, the other on data description and visualisation. Both sessions work with real-world longitudinal data. The course is taught in Stata through videos and practical exercises.

The course covers:

  • Importing and merging data
  • Selecting cases and variables
  • Reshaping data
  • Recoding variables (using loops)
  • Describing data using summary statistics
  • Creating transition tables
  • Using graphs for exploratory analysis

By the end of this course you will be able to:

  • Understand how complex large-scale datasets are structured
  • Use syntax files to prepare complex datasets for appropriate statistical analysis by:
    • Combining multiple datasets, and aggregating/disaggregating data from different files in a relational database
    • Manipulating, recoding, and computing derived variables
  • Provide descriptive summary statistics and graphical representations of the data

There are two pre-recorded lectures, each accompanied by a computer practical task in Stata with solutions.

Prerequisite: an understanding of quantitative data structures and types of variables.