Applied Statistics

The details
Mathematical Sciences
Colchester Campus
Undergraduate: Level 6
Monday 13 January 2020
Friday 20 March 2020
01 October 2019


Requisites for this module



Key module for

BSC 5B43 Statistics (Including Year Abroad),
BSC 9K12 Statistics,
BSC 9K13 Statistics (Including Placement Year),
BSC 9K18 Statistics (Including Foundation Year),
BSC I1G3 Data Science and Analytics,
BSC I1GB Data Science and Analytics (Including Placement Year),
BSC I1GC Data Science and Analytics (Including Year Abroad),
BSC I1GF Data Science and Analytics (Including Foundation Year)

Module description

This module covers three application areas of statistics: multivariate methods, demography and epidemiology, and sampling.

Module aims


Multivariate methods
Vectors of expected values. Covariance and correlation matrices. Discriminant analysis, choice between two populations, calculation of discriminant function, and probability of misclassification, test and training samples, leave-one-out and k-fold cross-validation, idea of extension to several populations. Principal components; definition, interpretation of calculated components, use in regression. Cluster analysis, similarity measures, single-link and other hierarchical methods, k-means. Informal approaches to checking for multivariate Normality. Tests and confidence regions for multivariate means.
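
As an informal illustration of the covariance matrix, correlation and principal components topics listed above, the following minimal sketch uses made-up data for two variables. The function names and the data are assumptions for illustration and are not module material; the closed-form eigenvalue formula applies only to the 2x2 case.

```python
import math


def covariance_matrix(x, y):
    """Sample covariance matrix (divisor n - 1) of two equal-length variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def correlation(cov):
    """Correlation coefficient derived from a 2x2 covariance matrix."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])


def principal_component_variances(cov):
    """Eigenvalues of a symmetric 2x2 matrix, largest first.

    These are the variances of the first and second principal components.
    """
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    centre = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return centre + radius, centre - radius


# Hypothetical measurements on five units (illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

cov = covariance_matrix(x, y)
r = correlation(cov)
lam1, lam2 = principal_component_variances(cov)
share = lam1 / (lam1 + lam2)  # proportion of total variance on the first component

print(f"correlation = {r:.4f}")
print(f"first component explains {share:.1%} of the total variance")
```

Because the two variables are almost perfectly correlated, nearly all of the variance falls on the first component, which is the situation in which principal components are most useful for reducing dimension.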

Demography and epidemiology
Population pyramids. Life tables. Standardised rates (e.g. mortality). Incidence and prevalence. Design and analysis of cohort (prospective) studies. Design and analysis of case-control (retrospective) studies. Confounding and interaction. Matched case-control design and analysis, using McNemar's test. Causation. Relative risk. Odds ratio. Estimation and confidence intervals for 2x2 tables. Mantel-Haenszel procedure. Sensitivity, specificity, ROC curves, positive predictive value, negative predictive value.
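
To illustrate the relative risk, odds ratio and 2x2 confidence interval topics listed above, here is a minimal sketch under stated assumptions: the counts are invented, the function name is hypothetical, and the intervals use the standard large-sample approximation on the log scale (one common choice, not necessarily the one taught in the module).

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Relative risk and odds ratio with approximate 95% confidence intervals.

    Layout assumed here:
        a = exposed with disease,    b = exposed without disease
        c = unexposed with disease,  d = unexposed without disease
    All four counts must be positive for the log-scale standard errors.
    """
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)

    # Large-sample standard errors of log(RR) and log(OR).
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def interval(estimate, se):
        log_est = math.log(estimate)
        return math.exp(log_est - z * se), math.exp(log_est + z * se)

    return {
        "relative_risk": relative_risk,
        "rr_ci": interval(relative_risk, se_log_rr),
        "odds_ratio": odds_ratio,
        "or_ci": interval(odds_ratio, se_log_or),
    }


# Hypothetical cohort: 100 exposed (30 cases) and 100 unexposed (10 cases).
result = two_by_two_summary(30, 70, 10, 90)
print(f"RR = {result['relative_risk']:.2f}, CI = {result['rr_ci']}")
print(f"OR = {result['odds_ratio']:.2f}, CI = {result['or_ci']}")
```

In this invented example the odds ratio is larger than the relative risk, as expected when the outcome is not rare.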

Sampling
Census and sample survey design. Target and study populations, uses and limitations of non-probability sampling methods, sampling frames, sampling fraction.
Simple random sampling. Estimators of totals, means and proportions; bias. Estimated standard errors, confidence intervals and precision. Sampling fraction and finite population correction. Ratio and regression estimators. Stratified random sampling. Estimators of totals, means and proportions; bias. Estimated standard errors, confidence intervals and precision. Cost functions. Proportional and optimal allocations. Limitations of stratified sampling. One-stage cluster sampling. Estimators for totals, means and proportions with equal cluster sizes and with different cluster sizes. Estimated standard errors, confidence intervals and precision. Link with systematic sampling. Description of two-stage sampling and of multi-stage sampling. Limitations.
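
The simple random sampling and allocation topics listed above can be sketched as follows. This is a minimal illustration with invented numbers and hypothetical function names. The second allocation shown is Neyman allocation, which is the optimal allocation only in the special case of equal sampling costs across strata.

```python
import math
import statistics


def srs_mean_estimate(sample, population_size, z=1.96):
    """Estimate a population mean from a simple random sample without replacement.

    Returns the estimate, its estimated standard error (including the finite
    population correction 1 - n/N) and an approximate 95% confidence interval.
    """
    n = len(sample)
    ybar = statistics.mean(sample)
    s2 = statistics.variance(sample)  # sample variance, divisor n - 1
    fpc = 1 - n / population_size     # finite population correction
    se = math.sqrt(fpc * s2 / n)
    return ybar, se, (ybar - z * se, ybar + z * se)


def proportional_allocation(stratum_sizes, n):
    """Sample sizes proportional to the stratum sizes N_h."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]


def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Sample sizes proportional to N_h * S_h (equal cost per unit assumed)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical sample of 8 units from a population of 100.
sample = [12, 15, 11, 14, 18, 13, 16, 13]
ybar, se, ci = srs_mean_estimate(sample, population_size=100)
print(f"mean = {ybar}, se = {se:.3f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")

# Hypothetical strata: sizes and within-stratum standard deviations.
sizes = [500, 300, 200]
sds = [10.0, 20.0, 40.0]
prop = proportional_allocation(sizes, n=100)
neyman = neyman_allocation(sizes, sds, n=100)
print("proportional:", prop)
print("Neyman:", [round(v, 1) for v in neyman])
```

Allocations are left unrounded here; in practice they would be rounded to whole units. Neyman allocation shifts sample towards the smallest stratum in this example because that stratum is the most variable.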

Module learning outcomes

On completion of the course, students should be able to:
Understand and apply multivariate methods;
Assess the results of discriminant analysis, principal components, cluster analysis and multivariate analysis of variance;
Understand and apply demographical and epidemiological methods;
Understand and apply sampling methods.

Module information

No additional information available.

Learning and teaching methods

The module consists of 20 lectures, 5 classes and 5 labs, delivered as 3 hours per week (2 lectures and 1 class or lab) in the spring term. Three revision lectures are given in the summer term. A project is undertaken in groups. Coursework consists of problem sheets, a project report and a presentation.


  • James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert. (2013) An Introduction to Statistical Learning: with Applications in R, New York: Springer (Springer Texts in Statistics).

The above list is indicative of the essential reading for the course. The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students. Further reading can be obtained from this module's reading list.

Assessment items, weightings and deadlines

Coursework / exam   Description                                       Deadline   Weighting
Coursework          Initial Task                                                 25%
Coursework          Group Presentation and Group Project                         75%
Exam                1440 minutes during Summer (Main Period) (Main)

Overall assessment

Coursework Exam
20% 80%


Coursework Exam
0% 100%
Module supervisor and teaching staff
Prof Berthold Lausen
Professor Berthold Lausen, Dr Fanlin Meng, Dr Stella Hadjianto



External examiner

Prof Fionn Murtagh
University of Huddersfield
Professor of Data Science

Available via Moodle
Of 103 hours, 38 hours (36.9%) are available to students;
65 hours were not recorded due to service coverage or fault;
0 hours were not recorded due to opt-out by lecturer(s).


Further information
Mathematical Sciences

Disclaimer: The University makes every effort to ensure that this information on its Module Directory is accurate and up-to-date. Exceptionally it can be necessary to make changes, for example to programmes, modules, facilities or fees. Examples of such reasons might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, change in government policy, or withdrawal/reduction of funding. Changes to modules may for example consist of variations to the content and method of delivery or assessment of modules and other services, to discontinue modules and other services and to merge or combine modules. The University will endeavour to keep such changes to a minimum, and will also keep students informed appropriately by updating our programme specifications and module directory.

The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.