Modelling Experimental Data

The details
Mathematical Sciences
Colchester Campus
Postgraduate: Level 7
Sunday 17 January 2021
Friday 26 March 2021
16 July 2020


Requisites for this module



Key module for

MSC G30412 Data Science,
MSC G30424 Data Science,
MSC G304PP Data Science with Professional Placement,
DIP G30009 Statistics,
MSC G30012 Statistics,
MSC G306JS Data Science and its Applications,
MPHD G30048 Statistics,
PHD G30048 Statistics,
MPHD G30448 Data Science,
PHD G30448 Data Science,
MSCI N399 Actuarial Science and Data Science,
MSCI G199 Mathematics and Data Science

Module description

This module is concerned with the application of linear models to the analysis of data. The underlying assumptions are discussed and general results are obtained using matrices. The standard approach to the analysis of normally distributed data using ANOVA is introduced. Methods for the design and analysis of efficient experiments are introduced. The general methodology is extended to logistic regression and the analysis of multidimensional contingency tables.

Module aims

The aim of this module is to provide the essential foundations of linear models by studying important topics in statistical modelling. This is achieved through an in-depth study of the main methods for analysing experimental data.

Module learning outcomes

On completion of the module students should be able to:

- calculate confidence intervals for parameters and prediction intervals for future observations;
- understand how to represent a linear model in matrix form;
- check model assumptions and identify influential observations;
- identify simple designed experiments;
- construct factorial experiments in blocks;
- adapt linear models to fit growth curves;
- carry out logistic regression;
- analyse cross-tabulated data using loglinear models;
- analyse linear models using R.

Module information


Simple linear regression

1. Link between maximum likelihood and least squares. OLS for linear regression.
2. Pythagoras and the ANOVA table. The estimation of $\sigma^2$.
3. Confidence intervals for parameters and prediction intervals for future observations.

General results using matrices

4. Matrix formulation. Normal equations. Solution. Moments of estimators.
5. Gauss-Markov theorem. Estimability.
6. H, Q, V.
7. Generalised and weighted least squares.

Multiple regression

8. Multiple regression. Subdividing the regression sum of squares. Lack of fit and pure error.
9. Regression diagnostics. Leverage. Residual plots. Multicollinearity. Serial correlation.
10. Model selection. Stepwise methods. Cp plots.
11. Curvilinear regression. Orthogonal polynomials.

Designed experiments

13. Completely randomised experiment. Replication. ANOVA. Contrasts.
14. Randomised blocks. Latin squares. Multiple comparison tests.
15. ANOVA with random effects.
16. Balanced incomplete blocks. ANOVA (relation to bivariate regression).
17. Factorial experiments: notation. ANOVA. Model selection.
18. Factorials and blocks: confounding and partial confounding.
19. Fractional replicates. Aliases.

Non-linear models

20. The Newton-Raphson procedure. Application to growth curves.
21. Estimation, confidence intervals, tests.

Logit and loglinear models

22. Logistic regression.
23. Loglinear models. Birch's result. Hierarchy principle. Iterative proportional fitting.
24. Independence. Conditional independence. Multistage analysis.
25. Simpson's paradox. Incomplete tables. Square tables.

Learning and teaching methods

Teaching will blend face-to-face classes, for students who can be present on campus, with a range of online lectures, teaching, learning and collaborative support.


  • Faraway, Julian James (2015). Linear Models with R. Boca Raton, FL: CRC Press. Chapman & Hall/CRC Texts in Statistical Science series.

The above list is indicative of the essential reading for the course. The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students. Further reading can be obtained from this module's reading list.

Assessment items, weightings and deadlines

Coursework / exam   Description                                      Deadline     Weighting
Coursework          Assignment 1                                     19/02/2021
Coursework          Assignment 2                                     26/03/2021
Exam                240 minutes during Summer (Main Period) (Main)

Overall assessment

Coursework Exam
20% 80%


Module supervisor and teaching staff
Module supervisor: Dr Stella Hadjiantoni
Teaching staff: Dr Stella Hadjiantoni and Dr Joseph Bailey



External examiner

Prof Fionn Murtagh
University of Huddersfield
Professor of Data Science
Available via Moodle


Further information
Mathematical Sciences

Disclaimer: The University makes every effort to ensure that the information in its Module Directory is accurate and up to date. Exceptionally, it can be necessary to make changes, for example to programmes, modules, facilities or fees. Reasons for such changes might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, change in government policy, or withdrawal or reduction of funding. Changes to modules may, for example, consist of variations to the content and method of delivery or assessment of modules and other services, the discontinuation of modules and other services, and the merging or combining of modules. The University will endeavour to keep such changes to a minimum, and will keep students appropriately informed by updating its programme specifications and module directory.

The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.