Data analysis and statistics with R
Mathematics, Statistics and Actuarial Science (School of)
Postgraduate: Level 7
Monday 18 January 2021
Friday 26 March 2021
10 September 2020
Requisites for this module
MSC G305JS Applied Data Science
The module will introduce concepts from data analysis and statistics and show how they can be applied effectively via the R language. It will provide a broad introduction to statistics and practical experience with real-world examples of how statistics is used to gain insights.
Throughout these examples, we will teach programming techniques that will enable students to apply statistical approaches to real-world applications.
This module assumes no previous exposure to statistics.
The purpose of this module is to introduce the use of R for data analysis and statistics.
A. A systematic, extensive and comparative knowledge and understanding of the use of R for carrying out statistical analysis
B. A systematic, extensive and comparative knowledge and understanding of data analysis methods
C. A systematic, extensive and comparative knowledge and understanding of statistical methods
Basic ideas of probability and statistical distributions
Random variables, means, covariance and variance
Variance of a sample mean and confidence intervals for means, variances and differences between means
Conditional probability and independence
Probability distribution theory
Standard distributions and their use in modelling, including the Bernoulli, Binomial and Poisson distributions
Estimation and Maximum Likelihood estimators
Hypothesis tests concerning means and variances
Null and alternative hypotheses
Type I and type II errors
Test statistic and critical region
Probability value and level of significance
Using tables of the t, F and chi-squared distributions
Introduction to linear regression
The least squares estimates of the intercept and the slope of a simple linear regression
Confidence intervals for the slope parameter and prediction intervals for response
Coefficient of determination and the sample correlation coefficient
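As an indicative illustration of the regression topics listed above, the following is a minimal R sketch and not module material. The data are simulated, and the variable names (x, y, fit, slope, intercept, manual_ci) are assumptions chosen for this example. It computes the least squares estimates from the closed-form formulas, compares them with lm(), and shows how qt() can stand in for a printed t table when building a confidence interval for the slope.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = 1)

# Least squares estimates from the closed-form formulas
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# The same estimates obtained with lm()
fit <- lm(y ~ x)
coef(fit)

# 95% confidence interval for the slope, built by hand:
# qt() returns the critical value that would otherwise be read from a t table
se_slope <- summary(fit)$coefficients["x", "Std. Error"]
t_crit <- qt(0.975, df = df.residual(fit))
manual_ci <- slope + c(-1, 1) * t_crit * se_slope

# The same interval from confint()
confint(fit, "x", level = 0.95)

# 95% prediction interval for a new response at x = 10
predict(fit, newdata = data.frame(x = 10),
        interval = "prediction", level = 0.95)

# In simple linear regression the coefficient of determination
# equals the squared sample correlation coefficient
r_squared <- summary(fit)$r.squared
r <- cor(x, y)
```

The hand-computed quantities and the built-in functions should agree up to floating-point error, which is a useful check when first learning the formulas.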
This module has 35 contact hours that will be structured as follows:
Lectures: 20 hours
Classes: 10 hours
Computer labs: 5 hours
- Upton, Graham J. G.; Cook, Ian. (2001) Introducing statistics, Oxford: Oxford University Press.
- Rowntree, Derek. (2018) Statistics without tears: a primer for non-mathematicians, London: Penguin Books.
- Wheelan, Charles J. (2013) Naked statistics: stripping the dread from the data, New York: W.W. Norton.
- Spiegelhalter, D. J. (2019) The art of statistics: learning from data, London: Pelican, an imprint of Penguin Books.
- Wickham, Hadley; Grolemund, Garrett. (2016) R for Data Science, Sebastopol: O'Reilly Media.
The above list is indicative of the essential reading for the course. The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students. Further reading can be obtained from this module's reading list.
Assessment items, weightings and deadlines
Coursework / exam
Final Project & Presentation
Exam format definitions
- Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
- In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
- In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
- In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.
Your department will provide further guidance before your exams.
Module supervisor and teaching staff
Dr Andrew Harrison, email: firstname.lastname@example.org.
Dr Andrew Harrison (email@example.com), Dr Osama Mahmoud (firstname.lastname@example.org), Dr Mario Gutierrez-Roig (email@example.com)
Prof Fionn Murtagh
University of Huddersfield
Professor of Data Science
Available via Moodle