ST4SML - Statistical Data Science and Machine Learning

Module Provider: Mathematics and Statistics
Number of credits: 10 [5 ECTS credits]
Level: 7
Terms in which taught: Spring term module
Pre-requisites: MA1MSP Mathematical and Statistical Programming; MA2MPR Mathematical Programming; ST1PS Probability and Statistics
Non-modular pre-requisites:
Co-requisites:
Modules excluded: ST3SML Statistical Data Science and Machine Learning
Current from: 2021/2

Module Convenor: Dr Fazil Baksh
Email: m.f.baksh@reading.ac.uk

Type of module:

Summary module description:

The topics of Data Science, Machine Learning and Artificial Intelligence have recently entered the public consciousness, partly because of their successful application in industry (most notably at large technology companies). Many of the most successful techniques used in these fields are underpinned by statistical methods. This module begins by covering some of these underpinning methods, and shows how they may be applied to problems in Data Science and Machine Learning.


Aims:

This module aims to give students a solid understanding of the types of methods that are used in Statistical Machine Learning, and the ability to implement and use some of them. It also aims to connect students with research being conducted in this area.


Assessable learning outcomes:

By the end of the module it is expected that the student will be able to:




  • use and explain underpinning statistical methods for Data Science and Machine Learning;

  • produce software implementations of the methods taught in the module;

  • use statistical learning tools to build and evaluate algorithms for supervised learning.



This module will be assessed to a greater depth than the excluded module ST3SML.


Additional outcomes:

The student will also gain experience of reading the scientific literature and learning about current research.


Outline content:

The module will begin with an introduction to Data Science, Machine Learning and Artificial Intelligence, then describe the ideas that underpin the statistical approach to these topics. The module focuses on Machine Learning, covering the topics of regression and classification, including: linear and logistic regression; linear and quadratic discriminant analysis; resampling methods; model selection and regularisation; ridge regression; lasso; dimension reduction methods; principal components regression; partial least squares; high dimensional problems; regression splines; generalised additive models; tree-based methods; bagging; stacking; random forests; boosting; neural networks and deep learning; support vector machines.


Brief description of teaching and learning methods:

The core material will be delivered in 16 lectures. These will be supported by material from the book "An Introduction to Statistical Learning with Applications in R", which is freely available online, along with research articles and blog posts.



This range of sources will expose students to the way a Data Scientist working in industry or academia would learn their subject. It will give students who are interested in the area a route to explore the subject more widely, while still providing an easy-to-follow path through the material.



There will be 4 practical PC lab sessions interspersed with the lectures. Each will give students the chance to code up concepts covered in the lectures.



There will be one assignment, handed out at the beginning of the module and due at the end. The assignment will consist of problems whose solution requires software implementations of the algorithms covered in the module. PC labs will cover problems that are very close to those given in the assignment, in order to motivate students to attend the PC labs and engage with the module as it progresses.



Additional support with programming will be offered where required.


Contact hours:
                                    Autumn  Spring  Summer
Lectures                                        16
Practical classes and workshops                  4
Guided independent study                        80

Total hours by term                      0     100       0

Total hours for module: 100

Summative Assessment Methods:
Method Percentage
Written exam 100

Summative assessment - Examinations:

One exam, 2 hours


Summative assessment - Coursework and in-class tests:

Formative assessment methods:

Feedback given during practicals.


Penalties for late submission:

The Support Centres will apply the following penalties for work submitted late:

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
  • where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
The University policy statement on penalties for late submission can be found at: http://www.reading.ac.uk/web/FILES/qualitysupport/penaltiesforlatesubmission.pdf
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.

Assessment requirements for a pass:

A mark of 50% overall.


Reassessment arrangements:

One examination paper of 2 hours' duration in August/September.


Additional Costs (specified where applicable):

1) Required text books: None

2) Specialist equipment or materials: None

3) Specialist clothing, footwear or headgear: None

4) Printing and binding: None

5) Computers and devices with a particular specification: None

6) Travel, accommodation and subsistence: None


Last updated: 28 June 2021

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.
