CSMDM21-Data Analytics and Mining
Module Provider: Computer Science
Number of credits: 20 [10 ECTS credits]
Level: 7
Terms in which taught: Autumn term module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2023/4
Module Convenor: Dr Carmen Lam
Email: carmen.lam@reading.ac.uk
Type of module:
Summary module description:
This module covers data analytics and data mining.
Aims:
Automated data collection tools and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. Automated data analytics and mining techniques are becoming essential components of any information system. In the Knowledge Discovery process, large data sets have to be cleaned, pre-processed, selected, merged, etc., and finally processed for the automatic extraction of interesting knowledge, such as descriptive and predictive models. The techniques range from statistics to machine learning and information science.
This module focuses on concepts, methodologies, algorithms and tools for the design, management and deployment of the Knowledge Discovery process. In particular, tools for data analytics (R) and workflow management (KNIME) will be used for hands-on activities on several test cases. Students will learn general Data Mining principles and techniques and will apply them in different application domains.
This module also encourages students to develop a set of professional skills such as problem solving, critical analysis of published literature, creativity, technical report writing for technical and non-technical audiences, professional communication (emails, letters, minutes, etc.), self-reflection and effective use of commercial software.
Assessable learning outcomes:
Students are expected to be able to:
- understand general data mining concepts, principles and algorithms, and the general Knowledge Discovery process;
- use data mining methodologies to apply various algorithms and tools for descriptive and predictive analytics;
- design and execute a Knowledge Discovery process for specific data mining problems using state-of-the-art open source software;
- apply data mining tools in different application domains.
Additional outcomes:
Students will become familiar with the potential applications of data mining techniques in different domains. They will also learn how to carry out experimental tests for algorithm performance evaluations.
Outline content:
- Introduction to the Knowledge Discovery process;
- Data selection, pre-processing and cleaning;
- Data mining algorithms (classification, clustering, etc.);
- Workflow management systems (KNIME);
- The R Project for Statistical Computing.
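As a rough illustration of how the first three outline topics fit together, the sketch below chains cleaning, pre-processing and a clustering step on a toy data set. It is not course material: the module itself uses R and KNIME, and the Python function names (`clean`, `min_max_scale`, `k_means`), the toy records and the fixed starting centroids are all assumptions made for this example.

```python
from statistics import mean

def clean(records):
    """Cleaning: drop records that contain a missing value (None)."""
    return [r for r in records if None not in r]

def min_max_scale(records):
    """Pre-processing: rescale every column to the range [0, 1]."""
    columns = list(zip(*records))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(r, lows, highs))
        for r in records
    ]

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))

def k_means(points, centroids, iterations=10):
    """Mining: basic k-means with caller-supplied starting centroids."""
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(mean(d) for d in zip(*members)) if members else old)
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids

# Toy records (hypothetical): two numeric attributes, some values missing.
raw = [(1.0, 2.0), (1.5, 1.8), (None, 3.0), (8.0, 8.0), (9.0, 9.5), (1.2, None)]
points = min_max_scale(clean(raw))
labels, centroids = k_means(points, [(0.0, 0.0), (1.0, 1.0)])
print(labels)
```

The two low-valued records and the two high-valued records end up in separate clusters; in practice the same steps would be expressed as KNIME workflow nodes or R function calls.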
Brief description of teaching and learning methods:
The module comprises lectures (20 hours), practical sessions (10 hours) and a major piece of coursework. The lectures introduce the basic concepts, algorithms and tools for Data Analytics and Mining. During the practical sessions, tools for data analytics and workflow management are used for hands-on activities on several test cases. A final project allows students to apply the concepts to a practical case.
Recommended Text:
Introduction to Data Mining. Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Addison-Wesley. ISBN-10: 0321420527; ISBN-13: 9780321420527.
Data Mining: Practical Machine Learning Tools and Techniques (Second Edition). Ian H. Witten, Eibe Frank. Morgan Kaufmann. ISBN 0-12-088407-0.
Data Mining: Concepts and Techniques (Second Edition). Jiawei Han, Micheline Kamber. Morgan Kaufmann Publishers, March 2006. ISBN-13: 978-1-55860-901-3; ISBN-10: 1-55860-901-6.
Activity (hours) | Autumn | Spring | Summer |
Lectures | 20 | | |
Practical classes and workshops | 10 | | |
Guided independent study: | | | |
Wider reading (independent) | 20 | | |
Wider reading (directed) | 20 | | |
Advance preparation for classes | 30 | | |
Preparation for tutorials | 20 | | |
Preparation of practical report | 30 | | |
Carrying out research project | 20 | | |
Essay preparation | 20 | | |
Reflection | 10 | | |
Total hours by term | 200 | 0 | 0 |
Total hours for module | 200 |
Method | Percentage |
Set exercise | 100 |
Summative assessment - Examinations:
Summative assessment - Coursework and in-class tests:
One project-based assignment.
Formative assessment methods:
Penalties for late submission:
The information below applies to students on taught programmes except those on Postgraduate Flexible programmes. The penalties for late submission, and the associated procedures, that apply to Postgraduate Flexible programmes are specified in the policy 'Penalties for late submission for Postgraduate Flexible programmes', which can be found here: https://www.reading.ac.uk/cqsd/-/media/project/functions/cqsd/documents/cqsd-old-site-documents/penaltiesforlatesubmissionpgflexible.pdf
The Support Centres will apply the following penalties for work submitted late:
- where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
- where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
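The two rules above can be read as a simple calculation. The sketch below is an illustrative reading only, not an official calculator: the function name `late_penalty_mark` and its parameters are hypothetical, and the floor at zero for a penalised mark is an assumption that the policy text does not state.

```python
import math

def late_penalty_mark(raw_mark, working_days_late, total_marks=100):
    """Apply the late-submission rules to raw_mark.

    working_days_late is measured from the original deadline or from any
    formally agreed extension; a part day counts as a full working day.
    """
    if working_days_late <= 0:
        return raw_mark                      # submitted on time
    days = math.ceil(working_days_late)      # "or part thereof"
    if days > 5:
        return 0                             # more than five working days late
    deduction = total_marks * days / 10      # 10% of available marks per day
    # Assumption: a mark cannot drop below zero.
    return max(raw_mark - deduction, 0)

print(late_penalty_mark(65, 1.5))
```

For example, work marked at 65 out of 100 and submitted a day and a half late counts as two working days late, so 20 marks are deducted.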
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.
Assessment requirements for a pass:
A mark of 50% overall.
Reassessment arrangements:
One 3-hour examination paper in August/September.
Additional Costs (specified where applicable):
1) Required text books:
2) Specialist equipment or materials:
3) Specialist clothing, footwear or headgear:
4) Printing and binding:
5) Computers and devices with a particular specification:
6) Travel, accommodation and subsistence:
Last updated: 30 March 2023
THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.