MATH3320 - Foundation of Data Analytics - 2022/23
Announcement
- There is no tutorial on Tuesday, September 6, 2022.
- Midterm: 13:00-14:15 (75 minutes), Tuesday, October 25, 2022 (Cheng Yu Tung Building LT1B). The midterm covers notes 1 and chapter 1 of notes 2.
General Information
Lecturer
- Prof. Zeng Tieyong
  - Office: LSB 225
  - Tel: 3943 7966
  - Email:
Teaching Assistant
- Zeyu Li
  - Office: LSB 222A
  - Tel: 3943 3575
  - Email:
- Shen Mao
  - Office: AB1 614
  - Tel: 3943 4109
  - Email:
Time and Venue
- Lecture: Mo 9:30AM - 10:15AM (Mong Man Wai Eng Bldg 404); Tu 12:30PM - 2:15PM (Cheng Yu Tung Building LT1B)
- Tutorial: Tu 11:30AM - 12:15PM (Cheng Yu Tung Building LT1B)
Course Description
This course gives an introduction to computational data analytics, with emphasis on its mathematical foundations. The goal is to carefully develop and explore the mathematical theories and methods that form the backbone of modern mathematical data science, such as knowledge discovery in databases, machine learning, and mathematical artificial intelligence. Topics include the mathematical foundations of probability, linear approximation and its polynomial and high-dimensional extensions, proper orthogonal decomposition methods, optimization, and the theory of nonlinear neural networks and approximation. Students taking this course are expected to have knowledge of basic linear algebra.
Advisory: MATH Majors should select not more than 5 MATH courses in a term.
Textbooks
- Mathematical Foundations for Data Analysis, Jeff M. Phillips, Springer 2021.
- Fundamentals of Data Analytics With a View to Machine Learning, Rudolf Mathar, Gholamreza Alirezaei, Emilio Balda, Arash Behboodi, Springer, 2020.
- Mathematics for Machine Learning, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Cambridge University Press.
- Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, The MIT Press, 2016.
References
- Richard Duda, Peter Hart and David Stork, Pattern Classification, Wiley-Interscience, 2nd Edition, 2015.
- Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
- Kevin P. Murphy, Machine Learning: A Probabilistic Perspective, The MIT Press, 2012.
- Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Pre-class Notes
- Linear approximation
- Estimation
- Estimation_MLE
- Classification
- Gradient Descent
- Gradient Descent
- Cross validation
- Bayes
- Bayes Regression
- k-means clustering
- SVM-read this (Nov 15, 2021)
- K-NN
- PCA
- Probability
- Mixtures of Gaussians
- Mixtures of Gaussians (Video)
- Introduction to Deep Learning (MIT)
- Machine learning and Data Mining (Lecture Notes)
- Machine Learning and Data Mining (Course Page)
Lecture Notes
- Notes1 (Basic)
- Notes2 (Standard)
- Notes3 (Advanced)
- SVD (Sept. 20, 2022)
- kNN (Oct 17, 2022)
- Decision Tree (Oct 24, 2022)
- Multivariate Gaussian (Oct 31, 2022)
- PCA (Nov. 8, 2022)
- Introduction to Deep Learning (Nov. 14, 2022)
- SVM (in Chinese) (Nov. 21, 2022)
- K-means (Nov. 28, 2022)
Class Notes
- Notes on Linear Algebra (Jean Walrand)
- Linear Algebra
- Topics in Matrix Theory(SVD)-Sept9-2021
- More on Multivariate Gaussians (Stanford)
- The Rank-Nullity Theorem
- Spectral Theorem
- Cholesky decomposition-Sept8-2021
- SVD (MIT)-Sept9-2021
- Probability Theory (Introduction)
- Optimization for Machine Learning (ENS)
- General EM algorithm
- SVM
- Machine Learning and Data Mining
Tutorial Notes
- Notes on Linear Algebra
- Notes on SVD
- Notes on Taylor Theorem for High Dimension
- Notes on Optimization and Probability
- Notes on Machine Learning
- Notes on Multivariate Gaussian
- Notes on PCA
- Notes on Backpropagation
- Notes on Linear Classification
Assignments
Quizzes and Exams
Solutions
- Solutions 1
- Solutions 2
- Solutions 3
- Solutions 4
- Solutions 5
- Solutions 6
- Solutions 7
- Solutions 8
- Solutions 9
Assessment Scheme
Tutorial attendance & good efforts | 10%
Mid-Exam | 12.5%
Project | 12.5%
Final Exam | 65%
Back-up Plan: In case face-to-face teaching and assessment is not possible due to the pandemic, the assessment will be changed to: Tutorial and homework 30%; Midterm 35%; Project 35%.
Useful Links
- Fundamentals of Data Analytics With a View to Machine Learning
- Mathematical Foundations for Data Analysis
- Foundation of Data Science
- A Comprehensive Guide to Machine Learning
- PCA
- K-means
- K-Medoids
- Mixtures of Gaussian
- scikit-learn Machine Learning in Python
- Mixtures of Gaussian
- Hidden Markov Models
- Support Vector Machines (Andrew Ng)
- Machine Learning (Andrew Ng)
- Hidden Markov Models
- Neural Networks and Introduction to Deep Learning
- CNN-Li Feifei
- Deep Learning (Andrew Ng)
- LSTM
- Introduction to Machine Learning
- Lasso
- Machine Learning for OR & FE (Columbia University)
- CS229: Machine Learning (Stanford)
- Mathematics for Machine Learning
- Introduction to machine learning
- Introduction to Machine Learning
Honesty in Academic Work
The Chinese University of Hong Kong places very high importance on honesty in academic work submitted by students, and adopts a policy of zero tolerance on cheating and plagiarism. Any related offence will lead to disciplinary action including termination of studies at the University. Although cases of cheating or plagiarism are rare at the University, everyone should make himself / herself familiar with the content of the following website:
http://www.cuhk.edu.hk/policy/academichonesty/ and thereby help avoid any practice that would not be acceptable.
Assessment Policy
Last updated: November 28, 2022 10:29:55