BS231-5-AU-CO:
Computational Data Analysis: R for Life Sciences

The details
2023/24
Life Sciences (School of)
Colchester Campus
Autumn
Undergraduate: Level 5
Current
Thursday 05 October 2023
Friday 15 December 2023
15
15 February 2024

 

Requisites for this module
(none)
(none)
(none)
(none)

 

(none)

Key module for

BSC C700 Biochemistry,
BSC C701 Biochemistry (Including Placement Year),
BSC C703 Biochemistry (Including Year Abroad),
BSC CR00 Biochemistry (Including Foundation Year),
BSC C400 Genetics,
BSC C402 Genetics (Including Year Abroad),
BSC C403 Genetics (Including Placement Year),
BSC CK00 Genetics (Including Foundation Year),
MSCIC098 Biochemistry and Biotechnology (Including Year Abroad),
MSCIC099 Biochemistry and Biotechnology (Including Placement Year),
MSCICZ99 Biochemistry and Biotechnology

Module description

The amount of data generated by biological experiments is increasing exponentially, mainly due to the development of powerful new technologies for acquiring large-scale genetic and genomic data sets. If the DNA sequence of the human genome were compiled into a book, it would run to 200,000 pages and take 10 years to read.

Bioinformatics has become an essential skill for the next generation of biologists. In recent years, R has become the programming language of choice for bioinformatics, and biologists in academia and industry now use many tools that were developed in R. Computational Data Analysis: R for Life Sciences provides a basic introduction to programming in R for biologists and aims to give students the necessary programming skills and hands-on experience in performing data analysis with R. This module is essential for the further bioinformatics courses that students take in their third year.

Module aims

1. Use the command line for basic operations.
2. Use R on the command line and in RStudio, and obtain help for functions.
3. Understand the role of variables and how to use them, and choose the appropriate data structure for the data (vectors, matrices, strings, lists and factors).
4. Understand the role of objects and the environment.
5. Write functions and understand when a function is needed.
6. Understand the role of scripts and write scripts for any analysis.
7. Read data from and write data to files stored on the computer.
8. Use conditionals and Boolean logic in R.
9. Write loops and understand when to use loops in R.
10. Represent data in plots and save the plots in different file formats.
11. Write documentation with integrated R code.
12. Comment code and apply strategies to structure code clearly.
13. Perform correlation analysis and compute descriptive statistics, and interpret the results.
14. Perform statistical tests and interpret the results.
15. Understand which statistical test is best suited to different questions.

Module learning outcomes

In order to pass this module the student will need to be able to:

1. Effectively communicate analyses by writing scripts and functions in R and commenting the code;
2. Demonstrate knowledge of the key methods for reading and writing data files in different formats in R;
3. Use and critically evaluate the key plotting functionalities of R;
4. Apply principles from software development to document and demonstrate how their functions and scripts should be used;
5. Understand and apply functions to perform basic statistical analyses in R (correlation analysis and statistical tests);
6. Demonstrate the essential transferable skills and qualities needed to work successfully as part of a team.

Learning and teaching methods

Lectures: 12 h
Workshops: 12 x 2 h = 24 h

Bibliography

The above list is indicative of the essential reading for the course.
The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students.
Further reading can be obtained from this module's reading list.

Assessment items, weightings and deadlines

Coursework / exam   Description                    Deadline   Coursework weighting
Coursework          Assessment 1 - Worksheet                  70%
Coursework          Assessment 2 - Group Project              25%
Practical           Attendance                                5%

Exam format definitions

  • Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
  • In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
  • In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
  • In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.

Your department will provide further guidance before your exams.

Overall assessment

Coursework   Exam
100%         0%

Reassessment

Coursework   Exam
100%         0%

Module supervisor and teaching staff
Dr David Clark, email: david.clark@essex.ac.uk.
Dr Dave Clark, Dr Ben Skinner, Dr Martin Wilkes
School Undergraduate Office, email: bsugoffice (non-Essex users should add @essex.ac.uk to create the full email address)

 

Availability
Yes
No
No

External examiner

Dr Thomas Clarke
University of East Anglia
Senior lecturer/associate professor

Resources
Available via Moodle
Of 93 hours, 84 (90.3%) hours available to students:
3 hours not recorded due to service coverage or fault;
0 hours not recorded due to opt-out by lecturer(s), module, or event type.

 

Further information
Life Sciences (School of)

Disclaimer: The University makes every effort to ensure that the information in its Module Directory is accurate and up to date. Exceptionally, it can be necessary to make changes, for example to programmes, modules, facilities or fees. Reasons for such changes might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, a change in government policy, or withdrawal or reduction of funding. Changes to modules may, for example, consist of variations to the content and method of delivery or assessment of modules and other services, the discontinuation of modules and other services, and the merging or combining of modules. The University will endeavour to keep such changes to a minimum, and will keep students appropriately informed by updating its programme specifications and module directory.

The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.