CS2PP22NU - Programming in Python for Data Science

Module Provider: School of Mathematical, Physical and Computational Sciences
Number of credits: 10 [5 ECTS credits]
Level: 5
Semesters in which taught: Semester 2 module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2023/4

Module Convenor: Dr Todd Jones
Email: t.r.jones@reading.ac.uk

NUIST Module Lead: Wenwen Liu
Email: w.liu@nuist.edu.cn

Type of module:

Summary module description:

The module introduces students to the Python programming language and the Python data science library ecosystem through application of programming fundamentals, data processing, and machine learning techniques.  Data manipulation and statistical data science methods are also covered.


Aims:

The aim of the module is to introduce students to the Python programming language and enable them to master the basics of programming while working with current tools used in data science and general program design and development.



This module also encourages students to develop a set of professional skills, such as problem solving, creativity, team working, technical report writing for technical and non-technical audiences, self-reflection, effective use of commercial software, organisation, time management, numeracy, hypothesis generation and hypothesis testing. 


Assessable learning outcomes:

On completion of this module, students will be able to: 




  • Implement common computer science algorithms in the Python programming language; 

  • Demonstrate an understanding of the use of functional and object-oriented programming paradigms in Python; 

  • Read and manipulate data in several formats to extract specific features; 

  • Assemble, implement, and select appropriate data science methodologies in Python;

  • Employ third-party Python libraries appropriately to design and create well-structured programs for practical applications. 


Additional outcomes:

Students will gain generally improved programming skills and a deeper understanding of the wider Python ecosystem and tools.


Outline content:

The course begins with an introduction to the Python programming language and the Python library ecosystem.  Students will perform a series of practical exercises designed to develop skill in Python scripting and wider program development. These will incorporate aspects of data analysis and professional and scientific research techniques.



The Python language will be covered in depth, including:




  • Data types, operators, and flow control

  • Functional and object-oriented programming

  • Using DataFrames to organise and manipulate data with Pandas

  • Working with matrices and arrays using NumPy

  • Data visualisation with Matplotlib

  • Analysing data using scikit-learn

  • Handling data with widely used, open-source Python libraries
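
As an informal illustration of the first two topics (not taken from the official module materials), the sketch below computes the mean of the positive values in a list twice: once in a functional style and once with a small class. The names `mean_of_positives` and `Readings`, and the sample values, are hypothetical and chosen only for this example.

```python
from statistics import mean


def mean_of_positives(values):
    """Functional style: filter, then aggregate, without mutating the input."""
    positives = [v for v in values if v > 0]
    return mean(positives) if positives else None


class Readings:
    """Object-oriented style: the data and its behaviour live together."""

    def __init__(self, values):
        self.values = list(values)  # copy so the caller's list is untouched

    def mean_of_positives(self):
        positives = [v for v in self.values if v > 0]
        return mean(positives) if positives else None


sample = [2.0, -1.0, 4.0, 0.0, 6.0]  # hypothetical data
functional_result = mean_of_positives(sample)
oo_result = Readings(sample).mean_of_positives()
```

Both calls give the same answer; the difference is only in how the code is organised, which is the kind of design choice the module asks students to reason about.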



Example applications to data science:




  • Regression

  • Clustering

  • Classification

  • Network (graph) analysis
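
To give a flavour of the first application, the sketch below fits a straight line to a handful of points by ordinary least squares. It is an assumption-laden illustration rather than module material: it uses only the Python standard library so that it runs anywhere, whereas the module itself lists NumPy and scikit-learn for this kind of work. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

With real, noisy data the fitted line would only approximate the points, and a library routine would normally be preferred over hand-written arithmetic.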


Brief description of teaching and learning methods:

The module consists of weekly lectures and practical sessions, in which students will be encouraged to collaborate with their peers to develop solutions to a series of problems.  Skills gained in the lectures and practical sessions will be applied to two pieces of assessment, in the form of set programming exercises and related technical reporting of analysis results.


Contact hours:
                                                Semester 1   Semester 2
Lectures                                                             20
Practical classes and workshops                                      20
Guided independent study:
    Wider reading (independent)                                      10
    Wider reading (directed)                                         10
    Peer assisted learning                                           10
    Preparation of practical report                                  10
    Completion of formative assessment tasks                         15
    Reflection                                                        5

Total hours by term                                      0          100

Total hours for module                                              100

Summative Assessment Methods:
Method Percentage
Set exercise 100

Summative assessment - Examinations:

N/A


Summative assessment - Coursework and in-class tests:


  • Individual set of programming exercises related to key aspects of Python (40%)

  • Group assignment related to the application of Python to data science (60%)


Formative assessment methods:

Formative assessment takes place in the weekly practical sessions, where feedback is provided to help students develop their understanding and enhance their programming skills throughout the term.


Penalties for late submission:

The Support Centres will apply the following penalties for work submitted late:

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of five working days;
  • where the piece of work is submitted more than five working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
The University policy statement on penalties for late submission can be found at: https://www.reading.ac.uk/cqsd/-/media/project/functions/cqsd/documents/cqsd-old-site-documents/penaltiesforlatesubmission.pdf
You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.

Assessment requirements for a pass:

A mark of 40% overall.


Reassessment arrangements:

One 2-hour examination paper in August/September.


Additional Costs (specified where applicable):

1) Required text books: 

2) Specialist equipment or materials: 

3) Specialist clothing, footwear or headgear: 

4) Printing and binding: 

5) Computers and devices with a particular specification: 

6) Travel, accommodation and subsistence: 


Last updated: 18 April 2023

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.
