Programming and Text Analytics with R
Mathematics, Statistics and Actuarial Science (School of)
Postgraduate: Level 7
Monday 18 January 2021
Friday 26 March 2021
10 September 2020
Requisites for this module
MSC G305JS Applied Data Science,
MSC G306JS Data Science and its Applications
The module will introduce the underlying principles and basic concepts of programming with the R language. It will cover a wide range of analytics, provide practical experience of powerful R tools, and present real-world examples of how data and analytics are used to gain insights and to improve a business or industry. These examples include text analytics, Twitter, and IBM Watson.
Throughout these examples, and many more, we will teach programming techniques that will enable students to apply advanced data science approaches to real-world applications.
This module assumes no prior programming skills.
The purpose of this module is to introduce:
Fundamental concepts of programming.
The key aspects of programming using the R language.
Powerful R tools for text analytics.
At the end of this module a student will be able to demonstrate:
A. A systematic, extensive and comparative knowledge and understanding of different objects and data types in R, including character, numeric, factor and logical data.
B. A systematic, extensive and comparative knowledge and understanding of functions in R, and the ability to create their own functions.
C. A comprehensive knowledge of, and familiarity with, R control structures, conditional expressions, and looping techniques.
D. A comprehensive knowledge of, and familiarity with, methods to analyse sentiment in free-form text, extract insights, and perform string processing.
Introduction to R
What is R? A brief overview of the concepts and features of the R statistical programming environment.
Help systems in R: A description of how to use different sources of R help.
Data types: A brief introduction to different data types in R including numeric, complex, character, factor, and logical data.
Data structures: A summary of data structures in R including vectors, matrices, arrays, data frames and lists.
Importing data: Describing how to import, edit, save, and export data in different formats using R, including Excel, SPSS, Stata, and SAS data files.
Data manipulation: A description of how to use logical operators to manipulate data.
Missing values: Describing how R handles missing values.
Visualisation: Creating, editing, and saving graphics in various formats using R.
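As an informal illustration of the data types, data structures, data manipulation and missing-value topics listed above, the sketch below uses base R only. It is not taken from the module materials: the variable names and values (`students`, `height`, and so on) are hypothetical and invented for this example.

```r
# Hypothetical values, invented for illustration only.
height   <- c(1.62, NA, 1.80)            # numeric, with one missing value
name     <- c("Ana", "Ben", "Cy")        # character
group    <- factor(c("a", "b", "a"))     # factor with levels "a" and "b"
enrolled <- c(TRUE, FALSE, TRUE)         # logical

# A data frame holds columns of different types side by side.
students <- data.frame(name, height, group, enrolled)

# Logical operators select rows: enrolled students in group "a".
selected <- students[students$enrolled & students$group == "a", ]

# Missing values propagate through calculations unless explicitly removed.
mean_with_na <- mean(students$height)                # NA
mean_no_na   <- mean(students$height, na.rm = TRUE)  # mean of the observed values
n_missing    <- sum(is.na(students$height))          # count of missing heights
```

Here `is.na()` and the `na.rm` argument are the standard base R ways to detect and skip missing values.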
Programming using R
Functions: What is an R function? How are functions structured and used? How can one understand a function's parameters, and how can we create our own functions?
Control Structures: Describing how we include control structures in R code.
Conditional expressions: Using "if" and "ifelse" structures in R.
Loops: Introducing looping techniques in R, with particular focus on "for", "repeat" and "while" statements.
"apply" family: Using "apply", "lapply", "tapply", "mapply" and "sapply" in R.
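The topics in this part of the syllabus can be sketched together in a few lines of base R. This is a minimal illustration, not the module's own teaching code: the `marks` vector, the `classify` function and the pass mark of 50 are all assumptions made up for the example.

```r
# Toy data: hypothetical marks, invented for illustration only.
marks <- c(72, 48, 65, 39, 55)

# A user-defined function using "if" / "else" on a single value.
classify <- function(mark, pass_mark = 50) {
  if (mark >= pass_mark) {
    "pass"
  } else {
    "fail"
  }
}

# "ifelse" is vectorised: it tests every element at once.
outcome_vec <- ifelse(marks >= 50, "pass", "fail")

# A "for" loop building the same result one element at a time.
outcome_loop <- character(length(marks))
for (i in seq_along(marks)) {
  outcome_loop[i] <- classify(marks[i])
}

# A "while" loop: count the marks read before the first fail.
n_before_fail <- 0
while (n_before_fail < length(marks) && marks[n_before_fail + 1] >= 50) {
  n_before_fail <- n_before_fail + 1
}

# "sapply" applies a function to each element and simplifies the result.
outcome_sapply <- sapply(marks, classify)

# "tapply" summarises one vector within groups defined by another.
mean_by_outcome <- tapply(marks, outcome_vec, mean)
```

The loop, `ifelse` and `sapply` versions produce the same vector, which shows why vectorised functions and the "apply" family are often preferred to explicit loops in R.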
Text analytics using R
Text as data: understand opinions and intelligence.
Case study: Analysis of tweets on Twitter to understand sentiment and public perception.
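To give a flavour of what sentiment analysis on short texts can look like, the sketch below scores a few sentences against a tiny word list using base R string functions. It is only an assumption about one simple approach (lexicon-based counting), not the method used in the module's case study: the example texts are invented rather than real tweets, and the word lists and the `score_sentiment` function are hypothetical.

```r
# Hypothetical example texts and a tiny hand-made lexicon
# (not real tweets, not a published sentiment dictionary).
texts <- c("I love this great module",
           "Terrible lecture, really bad slides",
           "The lab was fine")
positive_words <- c("love", "great", "good")
negative_words <- c("terrible", "bad", "awful")

score_sentiment <- function(text) {
  # Lower-case, strip punctuation, then split on whitespace.
  clean  <- gsub("[[:punct:]]", "", tolower(text))
  tokens <- strsplit(clean, "\\s+")[[1]]
  # Score = number of positive words minus number of negative words.
  sum(tokens %in% positive_words) - sum(tokens %in% negative_words)
}

scores <- sapply(texts, score_sentiment, USE.NAMES = FALSE)
labels <- ifelse(scores > 0, "positive",
                 ifelse(scores < 0, "negative", "neutral"))
```

Real analyses would typically rely on a much larger lexicon or a trained model, and on more careful tokenisation than a whitespace split.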
This module has 35 contact hours that will be structured as follows:
Lectures: 15 hours
Computer labs: 20 hours
This module does not appear to have any essential texts. To see non-essential items, please refer to the module's reading list.
Assessment items, weightings and deadlines
Coursework / exam
Final Project and presentation
Exam format definitions
- Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
- In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
- In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
- In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.
Your department will provide further guidance before your exams.
Module supervisor and teaching staff
Dr Osama Mahmoud, email: firstname.lastname@example.org.
Dr Osama Mahmoud & Dr Joe Bailey
Dr Osama Mahmoud (email@example.com), Dr Joe Bailey (firstname.lastname@example.org)
Prof Fionn Murtagh
University of Huddersfield
Professor of Data Science
Available via Moodle
Disclaimer: The University makes every effort to ensure that this information on its Module Directory is accurate and up-to-date. Exceptionally it can be necessary to make changes, for example to programmes, modules, facilities or fees. Examples of such reasons might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, change in government policy, or withdrawal/reduction of funding. Changes to modules may for example consist of variations to the content and method of delivery or assessment of modules and other services, to discontinue modules and other services and to merge or combine modules. The University will endeavour to keep such changes to a minimum, and will also keep students informed appropriately by updating our programme specifications and module directory.
The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.