CSMBD16-Big Data Analytics
Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Level: 7
Terms in which taught: Spring term module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites: CSMCC16 Cloud Computing and CSMDM16 Data Analytics and Mining
Modules excluded:
Current from: 2019/0
Type of module:
Summary module description:
This module covers Big Data Analytics: the challenges posed by Big Data and the algorithms, tools and techniques used to address them.
Aims:
The analysis of Big Data is not just the analysis of very large data sources, although that is part of it. In the commonly accepted view, Big Data is characterised by four aspects: Volume, Velocity, Variety and Veracity. Volume refers to the actual size of the data, which calls for computationally well-scaling methods. Velocity refers to the very fast generation of data, which calls for data stream processing methods in time-critical applications. Variety refers to the different types of data, including unstructured data such as video streams, click streams or audio files. Veracity refers to the challenge of establishing decision makers' trust in the knowledge extracted by Big Data Analytics techniques.
This unit aims to address these aspects and challenges of Big Data Analytics by introducing: scalable parallel data mining algorithms that can be executed on computer clusters running frameworks such as Hadoop; data stream mining techniques and algorithms for the analysis of high-velocity data; sentiment analysis techniques for unstructured data such as micro-blogging and social network data; and scalable recommender systems. A further aim of the unit is to introduce software systems used for Big Data Analytics, such as KNIME, MOA, MapReduce and Spark.
Assessable learning outcomes:
1. The students will be able to discuss, identify and describe challenges of Big Data Analytics. Furthermore, the students will be able to appraise relevant algorithms, tools and techniques to tackle these challenges.
2. The students will learn how to apply Big Data Analytics techniques and algorithms to solve challenges in Big Data Analytics.
3. The students will be able to analyse complex Big Data Analytics problems, develop and appraise analytics techniques to tackle the problems, and evaluate solutions.
4. The students will learn how to redefine and modify solutions from analytics problems, so they can be applied to new but similar problems.
Additional outcomes:
The students will recognise real-world applications of Big Data Analytics and also demonstrate how to deploy and evaluate data mining applications for Big Data on computer clusters.
Outline content:
• Introduction to Big Data Analytics principles and challenges;
• Data mining techniques and tools for Large Data Set Analysis, in particular parallel data mining techniques;
• Data mining algorithms and tools for the analysis of fast streaming real time data;
• Data mining techniques for building recommender systems;
• Data mining techniques and algorithms for unstructured data analysis.
Reading List:
Essential Text:
Data Mining: Concepts and Techniques (Second Edition). Jiawei Han, Micheline Kamber. Morgan Kaufmann Publishers, March 2006. ISBN: 978-1-55860-901-3
Mahout in Action. Sean Owen, Robin Anil, Ted Dunning and Ellen Friedman. ISBN: 9781935182689
Further reading:
Data Mining: Practical Machine Learning Tools and Techniques (Second Edition). Ian H. Witten, Eibe Frank.
Brief description of teaching and learning methods:
The module comprises lectures (20 hours), practical sessions (10 hours) and a major coursework. The lectures introduce the basic concepts and methodologies of advanced Data Analytics. Students will gain further insight and skills in the taught subjects through reading assignments and hands-on Big Data Analytics activities in the practical sessions. A final coursework will allow students to apply some of the concepts learned to a practical case.
Activity | Autumn | Spring | Summer |
Lectures | | 20 | |
Practical classes and workshops | | 10 | |
Guided independent study | | 70 | |
Total hours by term | | 100 | |
Total hours for module | 100 |
Method | Percentage |
Written exam | 50 |
Project output other than dissertation | 50 |
Summative assessment - Examinations:
1.5 hours.
Summative assessment - Coursework and in-class tests:
• Final project (50%);
• Final exam: a one-and-a-half-hour paper comprising module-related questions (50%).
Formative assessment methods:
Penalties for late submission:
Penalties for late submission on this module are in accordance with the University policy. Please refer to page 5 of the Postgraduate Guide to Assessment for further information: http://www.reading.ac.uk/internal/exams/student/exa-guidePG.aspx
Assessment requirements for a pass:
A mark of 50% overall.
Reassessment arrangements:
Resit by examination.
Additional Costs (specified where applicable):
1) Required text books:
2) Specialist equipment or materials:
3) Specialist clothing, footwear or headgear:
4) Printing and binding:
5) Computers and devices with a particular specification:
6) Travel, accommodation and subsistence:
Last updated: 7 February 2020
THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.