MA331-7-AU-CO:
Programming and Text Analytics with R

The details
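Academic year: 2023/24
Department: School of Mathematics, Statistics and Actuarial Science
Location: Colchester Campus
Term: Autumn
Level: Postgraduate: Level 7
Status: Current
Start date: Thursday 05 October 2023
End date: Friday 15 December 2023
Credits: 15
Last updated: 15 February 2024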

 

Requisites for this module

(none)

Key module for

MSC G30512 Applied Data Science
MSC G30524 Applied Data Science
MSC G30612 Data Science and its Applications

Module description

This module will introduce the underlying principles and basic concepts of programming with the R language. It will cover a wide range of analytics, provide practical experience of powerful R tools, and present real-world examples of how data and analytics are used to gain insights and to improve a business or industry. These examples include text analytics applied to Twitter data and to public-domain texts from Project Gutenberg.


Throughout these examples, and many more, we will teach programming techniques that will enable students to apply advanced data science approaches to real-world applications. This module assumes no prior programming skills.

Module aims

The aims of this module are:



  • To introduce the fundamental concepts of programming.

  • To introduce the key aspects of programming using the R language.

  • To introduce powerful R tools for text analytics.

Module learning outcomes

By the end of this module, students will be expected to have:



  1. A comprehensive understanding of techniques for recognising different objects and data types in R, including character, numeric, factor and logical data.

  2. A comprehensive understanding of techniques for using functions in R and creating their own functions.

  3. A comprehensive understanding of techniques for implementing R control structures, conditional expressions, and looping techniques.

  4. A comprehensive understanding of techniques for analysing sentiment in free-form text, extracting insights, and applying string processing methods.

  5. Conceptual understanding that enables the student to summarise sentiment analyses and natural language processing.

  6. Conceptual understanding that enables the student to solve and document a complex coding project.

Module information

Syllabus



  • Introduction to R.

  • What is R? A brief overview of the concepts and features of the R statistical programming environment.

  • Help systems in R: A description of how to use different sources of R help.

  • Data types: A brief introduction to different data types in R including numeric, complex, character, factor, and logical data.

  • Data structure: A summary of data structure in R including vectors, matrices, arrays, data frames and lists.

  • Importing data: Describing how to import, edit, save, and export data in different formats using R, including Excel, SPSS, Stata, and SAS data files.

  • Data manipulation: A description of how to use logical operators to manipulate data.

  • Missing values: Describing how R handles missing values.

  • Visualisation: Creating, editing, and saving graphics in various formats using R.

  • Programming using R.

  • Functions: What is an R function? How are functions structured and used? How can one understand a function's parameters, and how can we create our own functions?

  • Control Structures: Describing how to include control structures in R code.

  • Conditional expressions: Using `if` and `ifelse` structures in R.

  • Loops: Introducing looping techniques in R, with particular focus on `for`, `repeat` and `while` statements.

  • `apply` family: Using `apply`, `lapply`, `tapply`, `mapply` and `sapply` in R.

  • Text analytics using R.

  • Text as data: Understanding opinions and intelligence.

  • Case study: Analysis of tweets on Twitter to understand sentiment and public perception.

  • Sentiment analysis.

Learning and teaching methods

Teaching in the School will be delivered using a range of face-to-face lectures, classes and lab sessions as appropriate for each module. Modules may also include online-only sessions where it is advantageous to do so, for example for pedagogical reasons.

Bibliography

The reading list for this module is indicative of the essential reading for the course.
The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students.
Further reading can be obtained from this module's reading list.

Assessment items, weightings and deadlines

Coursework / exam   Description   Deadline   Coursework weighting
Coursework          Lab Test                 40%
Coursework          Lab Test                 60%

Exam format definitions

  • Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
  • In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
  • In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
  • In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.

Your department will provide further guidance before your exams.

Overall assessment

Coursework: 100%
Exam: 0%

Reassessment

Coursework: 100%
Exam: 0%
Module supervisor and teaching staff

Dr Osama Mahmoud, email: o.mahmoud@essex.ac.uk

Availability
No
No
No

External examiner

Dr Yinghui Wei
University of Plymouth
Resources
Available via Moodle
Of 74 hours, 61 hours (82.4%) are available to students:
8 hours not recorded due to service coverage or fault;
5 hours not recorded due to opt-out by lecturer(s), module, or event type.

 
