Exploratory Data Analysis and Data Visualisation
Mathematics, Statistics and Actuarial Science (School of)
Undergraduate: Level 6
Sunday 17 January 2021
Friday 26 March 2021
15 July 2020
Requisites for this module
BSC 5B43 Statistics (Including Year Abroad),
BSC 9K12 Statistics,
BSC 9K13 Statistics (Including Placement Year),
BSC 9K18 Statistics (Including Foundation Year)
In a world increasingly driven by data, the need for analysis and visualisation is greater than ever. In this course we will look at data through the eyes of a numerical detective. We will work on the lost art of exploratory data analysis, reviewing appropriate methods for data summaries with the aim of summarising and understanding data, extracting hidden patterns and identifying relationships. We will then work on graphical data analysis, using simple graphs to understand the data, more advanced methods to scrutinise it, and interactive plots to communicate its information to a wider audience.
For data analysis and visualisation we will use RStudio, together with a combination of R Shiny applications and Google visualisations for interactive plotting.
The aim of the course is to train data analysts who can identify patterns and display information from data drawn from several sources. The course will encourage statistical thinking through a series of examples of good and not-so-good visualisations and will guide students to develop their creativity within a scientific framework.
At the end of the course students will be able to:
- Summarise and understand information on categorical and continuous variables
- Explore relationships between different variables
- Display graphical information and complex relationships in datasets using R
- Use advanced packages such as ggplot2 and produce statistical reports with R Markdown
- Create interactive graphs with R Shiny and googleVis
- Historical examples of visualization, with a particular focus on the role of visualization in the development of the scientific worldview
- Data Visualization for Human Perception
- What makes a good graph – What makes a bad graph
- Examining variables and basic R charts
- Exploring relationships, looking for structure
- Advanced plots with ggplot2
- Creating statistical reports with R Markdown
- Interactive graphs
- Telling a story
- High dimensional data visualization
Teaching will be delivered in a way that blends face-to-face classes, for those students that can be present on campus, with a range of online lectures, teaching, learning and collaborative support.
- Cairo, Alberto. (2020-10-13) How Charts Lie, New York: WW Norton & Co.
The above list is indicative of the essential reading for the course. The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students. Further reading can be obtained from this module's reading list.
Assessment items, weightings and deadlines
Coursework / exam
Exam format definitions
- Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
- In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
- In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
- In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.
Your department will provide further guidance before your exams.
Module supervisor and teaching staff
Dr Andrew Harrison, email: firstname.lastname@example.org.
Dr Andrew Harrison, Dr Osama Mahmoud & Dr Xinan Yang
Dr Andrew Harrison (email@example.com), Dr Osama Mahmoud (firstname.lastname@example.org), Dr Xinan Yang (email@example.com)
Prof Fionn Murtagh
University of Huddersfield
Professor of Data Science
Available via Moodle
Of 1691 hours, 0 (0%) hours available to students:
1691 hours not recorded due to service coverage or fault;
0 hours not recorded due to opt-out by lecturer(s).
Disclaimer: The University makes every effort to ensure that this information on its Module Directory is accurate and up-to-date. Exceptionally it can be necessary to make changes, for example to programmes, modules, facilities or fees. Examples of such reasons might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, change in government policy, or withdrawal/reduction of funding. Changes to modules may, for example, consist of variations to the content and method of delivery or assessment of modules and other services, the discontinuation of modules and other services, and the merging or combining of modules.
The University will endeavour to keep such changes to a minimum, and will also keep students informed appropriately by updating our programme specifications and module directory.
The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.