CS3TM: Text Mining and Natural Language Processing
Module code: CS3TM
Module provider: Computer Science; School of Mathematical, Physical and Computational Sciences
Credits: 20
Level: Level 3 (Honours)
When you'll be taught: Semester 2
Module convenor: Professor Xia Hong, email: x.hong@reading.ac.uk
Pre-requisite module(s):
Co-requisite module(s):
Pre-requisite or Co-requisite module(s):
Module(s) excluded:
Placement information: NA
Academic year: 2024/5
Available to visiting students: Yes
Talis reading list: Yes
Last updated: 21 May 2024
Overview
Module aims and purpose
The aim of this module is to introduce the field of text mining and natural language processing. The module focuses on the theory and practice of processing text data at the lexical, syntactic, and semantic levels.
This module also encourages students to develop a set of professional skills, such as problem solving, critical thinking, scientific evaluation, creativity, technical report writing, organization and time management, and self-reflection.
Module learning outcomes
By the end of the module, it is expected that students will be able to:
- Understand and apply the fundamental principles of text mining and natural language processing;
- Apply methods and algorithms to process different types of textual data;
- Empirically evaluate the performance of methods and algorithms using accuracy and efficiency metrics; and
- Apply analytical and programming skills using existing NLP methods and tools such as NLTK and scikit-learn (Python).
Module content
The module covers the following topics:
- Regular expressions, text normalization
- N-grams and language models, part-of-speech tagging
- Lexical semantics, word senses and WordNet
- Syntactic and semantic parsing
- Text classification, sentiment analysis
- Information extraction, including named entity recognition and relation extraction
- Advanced topics: machine learning for NLP, word embeddings, hidden Markov models and the Viterbi algorithm
Structure
Teaching and learning methods
The lectures will introduce students to the theories, concepts and underpinning principles specified in the indicative content. In the practical sessions, students will be supervised as they apply these concepts and principles to given problems.
The lectures and practical sessions will enable students to gain practice with established NLP software, carry out analysis, and write reports.
Digital learning materials will also be provided where they are needed to support learning.
There are two types of assessment (formative and summative), which will support and reinforce students’ learning. Formative assessment is carried out through weekly learning activities, either exemplar questions or sample programming problems.
Summative assessment consists of one written coursework assignment and one written examination. The written coursework assignment requires students to demonstrate scientific writing in an individual report. Appropriate feedback will be communicated to students in a timely manner to enhance learning.
Study hours
At least 38 hours of scheduled teaching and learning activities will be delivered in person, with the remaining hours for scheduled and self-scheduled teaching and learning activities delivered either in person or online. You will receive further details about how these hours will be delivered before the start of the module.
Scheduled teaching and learning activities | Semester 1 | Semester 2 | Summer |
---|---|---|---|
Lectures | 22 | ||
Seminars | 8 | ||
Tutorials | |||
Project Supervision | |||
Demonstrations | |||
Practical classes and workshops | 8 | ||
Supervised time in studio / workshop | |||
Scheduled revision sessions | |||
Feedback meetings with staff | |||
Fieldwork | |||
External visits | |||
Work-based learning | |||
Self-scheduled teaching and learning activities | Semester 1 | Semester 2 | Summer |
---|---|---|---|
Directed viewing of video materials/screencasts | |||
Participation in discussion boards/other discussions | |||
Feedback meetings with staff | |||
Other | |||
Other (details) | |||
Placement and study abroad | Semester 1 | Semester 2 | Summer |
---|---|---|---|
Placement | |||
Study abroad | |||
Independent study hours | Semester 1 | Semester 2 | Summer |
---|---|---|---|
Independent study hours | 162 |
Please note the independent study hours above are notional numbers of hours; each student will approach studying in different ways. We would advise you to reflect on your learning and the number of hours you are allocating to these tasks.
Semester 1: The hours in this column may include hours during the Christmas holiday period.
Semester 2: The hours in this column may include hours during the Easter holiday period.
Summer: The hours in this column will take place during the summer holidays and may be at the start and/or end of the module.
Assessment
Requirements for a pass
Students need to achieve an overall module mark of 40% to pass this module.
Summative assessment
Type of assessment | Detail of assessment | % contribution towards module mark | Size of assessment | Submission date | Additional information |
---|---|---|---|---|---|
Online written examination | Exam | 50 | 2 hours | Semester 2 Assessment Period | Answer 3 out of 4 questions |
Set exercise | Technical report | 50 | 7 pages (excluding appendices). 20 hours | Semester 2, Teaching Week 11 |
Penalties for late submission of summative assessment
The Support Centres will apply the following penalties for work submitted late:
Assessments with numerical marks
- where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of three working days;
- the mark awarded due to the imposition of the penalty shall not fall below the threshold pass mark, namely 40% in the case of modules at Levels 4-6 (i.e. undergraduate modules for Parts 1-3) and 50% in the case of Level 7 modules offered as part of an Integrated Masters or taught postgraduate degree programme;
- where the piece of work is awarded a mark below the threshold pass mark prior to any penalty being imposed, and is submitted up to three working days after the original deadline (or any formally agreed extension to the deadline), no penalty shall be imposed;
- where the piece of work is submitted more than three working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
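The rules above for numerically marked work can be read as a short calculation. The sketch below is illustrative only: the function name `late_penalty_mark` and its parameters are hypothetical, marks are assumed to be on a 0-100 scale, and the default pass mark of 40 is the threshold stated above for Levels 4-6 (50 would apply to Level 7 modules).

```python
import math


def late_penalty_mark(raw_mark, working_days_late, pass_mark=40.0, total_marks=100.0):
    """Return the recorded mark after the late-submission penalty.

    Hypothetical helper: an illustrative reading of the rules listed above,
    not an official calculator.
    """
    # Submitted on time (or within an agreed extension): no penalty.
    if working_days_late <= 0:
        return raw_mark
    # More than three working days late: a mark of zero is recorded.
    if working_days_late > 3:
        return 0.0
    # Work already below the pass mark is not penalised further.
    if raw_mark < pass_mark:
        return raw_mark
    # 10% of the total marks available per working day "or part thereof".
    days_charged = math.ceil(working_days_late)
    deduction = total_marks * days_charged * 10 / 100
    # The penalty cannot take the mark below the threshold pass mark.
    return max(raw_mark - deduction, pass_mark)


print(late_penalty_mark(72, 1))    # one working day late
print(late_penalty_mark(65, 2.5))  # part of a third day counts as a full day
print(late_penalty_mark(35, 2))    # below the pass mark before any penalty
print(late_penalty_mark(65, 3.5))  # more than three working days late
```

For example, under these assumptions a report marked 65 and submitted two and a half working days late would lose 30 marks, but the recorded mark is held at the pass mark of 40 rather than falling to 35.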
Assessments marked Pass/Fail
- where the piece of work is submitted within three working days of the deadline (or any formally agreed extension of the deadline): no penalty will be applied;
- where the piece of work is submitted more than three working days after the original deadline (or any formally agreed extension of the deadline): a grade of Fail will be awarded.
The University policy statement on penalties for late submission can be found at: https://www.reading.ac.uk/cqsd/-/media/project/functions/cqsd/documents/qap/penaltiesforlatesubmission.pdf
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.
Formative assessment
Formative assessment is any task or activity which creates feedback (or feedforward) for you about your learning, but which does not contribute towards your overall module mark.
Each weekly topic has defined learning tasks that enable students to reflect on their learning.
Outcomes of the formative assessment for each topic may be given in the tutorial guidance notes or as feedback on online tests.
Pseudocode and executable Python code for basic algorithms are provided each week.
Reassessment
Type of reassessment | Detail of reassessment | % contribution towards module mark | Size of reassessment | Submission date | Additional information |
---|---|---|---|---|---|
Online written examination | Exam | 100 | 3 hours | During the University resit period | Answer 4 out of 6 questions |
Additional costs
Item | Additional information | Cost |
---|---|---|
Computers and devices with a particular specification | ||
Required textbooks | They are specified in Talis. | |
Specialist equipment or materials | ||
Specialist clothing, footwear, or headgear | ||
Printing and binding | ||
Travel, accommodation, and subsistence |
THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.