| | Lecture | Tutorial |
|---|---|---|
| Time | M2-4, 9:30 am - 12:30 pm | T3, 10:30 am - 11:15 am |
| Venue | KKB101 | YIA LT7 |
The Golden Rule of CSCI5510: No member of the CSCI5510 community shall take unfair advantage of any other member of the CSCI5510 community.
This course teaches state-of-the-art big data analytics, including techniques, software, applications, and perspectives on massive data. The class will cover, but is not limited to, the following topics: distributed file systems such as Google File System, Hadoop Distributed File System, and CloudStore, together with map-reduce technology; similarity search techniques for big data such as minhash and locality-sensitive hashing; specialized processing and algorithms for data streams; big data search and query technology; big graph analysis; and recommendation systems for Web applications. Applications may include business applications such as online marketing, computational advertising, location-based services, social networks, recommender systems, and healthcare services, as well as scientific and astrophysics applications such as environmental sensor applications and nebula search and query.
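This course aims to teach students state-of-the-art analytics for big data, including techniques, software, applications, and perspectives. Course content includes, but is not limited to: distributed file systems such as Google File System, the Hadoop file system, and CloudStore, together with Map-reduce technology; similarity search techniques for big data such as minhash and locality-sensitive hashing; specialized processing methods and algorithms for data streams; big data search and query technology; and advertising management and recommender systems in Internet applications. Applications covered may include business applications such as online marketing, computational advertising, location-based services, social networks, recommender systems, and healthcare services, as well as applications in science and astrophysics such as environmental sensor applications and nebula search and query.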
At the end of the course of studies, students will have acquired the ability to
| | Lecturer | Lecturer | Tutor | Tutor |
|---|---|---|---|---|
| Name | Irwin King | Michael R. Lyu | Guang Ling | Chen Cheng |
| Email | king AT cse.cuhk.edu.hk | lyu AT cse.cuhk.edu.hk | gling AT cse.cuhk.edu.hk | ccheng AT cse.cuhk.edu.hk |
| Office | Rm 908 | Rm 927 | Rm 1024 | Rm 1024 |
| Telephone | 3943 8398 | 3943 8429 | 3943 4252 | 3943 4252 |
| Office Hour(s) | TBA | 10:00-12:00 Tuesday | TBA | TBA |
Note: This class will be taught in English. Homework assignments and examinations will be conducted in English.
The PDF files were created with Acrobat 6.0. Please obtain the correct version of Acrobat Reader from Adobe.
Week | Date | Topics | Tutorials | Homework & Events | Resources |
---|---|---|---|---|---|
1 | 2/9 | Introduction and Motivation 01.pptx | No Tutorial | | Ch. 1 of MMDS |
2 | 9/9 | MapReduce 02-MapReduce.pdf | | | Ch. 2 of MMDS, Ch. 6 of MMDS |
3 | 16/9 | Locality Sensitive Hashing 03-lsh.pdf | | | Ch. 3 of MMDS |
4 | 23/9 | Mining Data Streams 04-stream.pdf | | | Ch. 4 of MMDS |
5 | 30/9 | Scalable Clustering 05-clustering.pdf | | | Ch. 7 of MMDS |
6 | 7/10 | Dimensionality Reduction 06-DR.pdf | | | Ch. 11 of MMDS |
7 | 14/10 | Public Holiday | | | |
8 | 21/10 | Recommender systems/Matrix Factorization 07-mf.pdf | | | Ch. 9 of MMDS |
9 | 28/10 | Massive Link Analysis 08-link.pdf | | | Ch. 5 of MMDS |
10 | 4/11 | Mid-term | | | |
11 | 11/11 | Analysis of Massive Graph 09-graph.pdf | | | Ch. 10 of MMDS |
12 | 18/11 | Large Scale SVM 10-svm.pdf | | | SVM tutorial |
13 | 25/11 | Online Learning 11-ol.pdf | | | Online learning survey |
| | Time | Venue | Notes |
|---|---|---|---|
| Midterm Examination | Nov. 4, 9:30 am - 12:00 noon | TBA | TBA |
| Final Examination | TBA | TBA | TBA |
Homework Assignments | Mid-term Examination | Project |
---|---|---|
20% | 30% | 50% |