BS231-5-SP-CO:
Computational Data Analysis: R for Life Sciences

The details
2020/21
Life Sciences (School of)
Colchester Campus
Spring
Undergraduate: Level 5
Current
Sunday 17 January 2021
Friday 26 March 2021
15
05 December 2019

 

Requisites for this module
(none)

Key module for

BSC C700 Biochemistry,
BSC C701 Biochemistry (Including Placement Year),
BSC C703 Biochemistry (Including Year Abroad),
BSC CR00 Biochemistry (Including Foundation Year),
BSC C400 Genetics,
BSC C402 Genetics (Including Year Abroad),
BSC C403 Genetics (Including Placement Year),
BSC CK00 Genetics (Including Foundation Year),
BSC C410 Genetics and Genomics,
BSC C411 Genetics and Genomics (Including Placement Year),
BSC C412 Genetics and Genomics (Including Year Abroad),
MSCIC098 Biochemistry and Biotechnology (Including Year Abroad),
MSCIC099 Biochemistry and Biotechnology (Including Placement Year),
MSCICZ99 Biochemistry and Biotechnology

Module description

The amount of data generated by biological experiments is increasing exponentially, mainly due to the development of powerful new technologies for acquiring large-scale genetic and genomic data sets. If the DNA sequence of the human genome were compiled into a book, it would run to 200,000 pages and take 10 years to read.

Bioinformatics has become an essential skill for the next generation of biologists. In recent years, R has become the programming language of choice for bioinformatics, and biologists in academia and industry now use many tools that were developed in R. Computational Data Analysis: R for Life Sciences provides a basic introduction to programming in R for biologists and aims to give students the necessary programming skills and hands-on experience in performing data analysis with R. This module is essential for the further bioinformatics courses that students take in their third year.

Module aims

1. Use the command line for basic operations and to connect to remote servers.
2. Use R on the command line and in RStudio, and obtain help for functions.
3. Understand the role of variables and how to use them, and choose the appropriate data structure for the data (vectors, matrices, strings, lists and factors).
4. Understand the role of objects and the environment.
5. Write functions and understand when a function is needed.
6. Understand the role of scripts and write scripts for any analysis.
7. Read data from and write data to files stored on the computer.
8. Use conditionals and Boolean logic in R.
9. Write loops and understand when to use loops in R.
10. Represent data in plots and save the plots in different file formats.
11. Write documentation with integrated R code.
12. Comment code, apply strategies for structuring code, and debug.
13. Perform correlation analysis and descriptive statistics and interpret the results.
14. Perform statistical tests and interpret the results.
15. Understand which statistical test is best suited to different questions.

Module learning outcomes

In order to pass this module the student will need to be able to:

1. write scripts and functions in R and comment the code;
2. read and write data files in different formats;
3. use the basic plot functionalities of R;
4. write documentation and examples of how their functions and scripts should be used;
5. perform basic statistical analysis in R (correlation analysis and statistical tests);
6. demonstrate the ability to work as part of a team.

Learning and teaching methods

Lectures: 12 h
Workshops: 12 x 2 h = 24 h

Bibliography*

  • Richard Cotton. (2013) Learning R, Sebastopol, CA: O'Reilly.
  • Andy Hector. (2015) The new statistics with R: an introduction for biologists, Oxford: Oxford University Press.

The above list is indicative of the essential reading for the course. The library makes provision for all reading list items, with digital provision where possible, and these resources are shared between students. Further reading can be obtained from this module's reading list.

Assessment items, weightings and deadlines

Coursework / exam   Description         Deadline   Weighting
Coursework          Code Assignment 1              30%
Coursework          Code Assignment 2              35%

Overall assessment

Coursework Exam
100% 0%

Reassessment

Coursework Exam
100% 0%

Module supervisor and teaching staff

Dr Nicolae Zabet, email: nzabet@essex.ac.uk.
Dr Nicolae Radu Zabet, Prof Leo Schalkwyk, Dr Toni Marco
School Undergraduate Office, email: bsugoffice@essex.ac.uk

Availability
No
No
No

External examiner

No external examiner information available for this module.

Resources
Available via Moodle
Of 45 hours, 45 (100%) hours available to students:
0 hours not recorded due to service coverage or fault;
0 hours not recorded due to opt-out by lecturer(s).

Further information
Life Sciences (School of)

* Please note: due to differing publication schedules, items marked with an asterisk (*) base their information upon the previous academic year.