| Course code | CSCI5120 | 
| Course title | Advanced Topics in Database Systems 數據庫系統高級專題 | 
| Course description | This course will introduce to students advanced topics in database systems including advanced data structures, concurrency control, deadlock resolutions, recovery schemes, distributed database systems, multimedia database indexing techniques, and data mining, data on the web and network data analysis. 本科介紹數據庫系統之高級專題。內容包括:高級數據結構、並行控制、鎖死解法、復原方案、分佈式數據庫系統、多媒體數據庫索引方法及數據採集、網絡上的數據及網絡數據分析。 | 
| Unit(s) | 3 | 
| Course level | Postgraduate | 
| Prerequisite | CSCI3170 | 
| Exclusion | CMSC5705 or SEEM5010 | 
| Semester | 1 or 2 | 
| Grading basis | Graded | 
| Grade Descriptors | A/A-:  EXCELLENT – exceptionally good performance and far exceeding expectation in all or most of the course learning outcomes; demonstration of superior understanding of the subject matter, the ability to analyze problems and apply extensive knowledge, and skillful use of concepts and materials to derive proper solutions. B+/B/B-: GOOD – good performance in all course learning outcomes and exceeding expectation in some of them; demonstration of good understanding of the subject matter and the ability to use proper concepts and materials to solve most of the problems encountered. C+/C/C-: FAIR – adequate performance and meeting expectation in all course learning outcomes; demonstration of adequate understanding of the subject matter and the ability to solve simple problems. D+/D: MARGINAL – performance barely meets the expectation in the essential course learning outcomes; demonstration of partial understanding of the subject matter and the ability to solve simple problems. F: FAILURE – performance does not meet the expectation in the essential course learning outcomes; demonstration of serious deficiencies and the need to retake the course. | 
| Learning outcomes | At the end of the course of studies, students will have acquired the ability to 1. learn about some advanced topics in database systems. They include some classical topics such as distributed database systems, concurrency control in transaction management, and data replication in a distributed environment, and some special datatypes such as multimedia databases, where indexing mechanisms are studied. We also cover important topics of data mining and data on the web. We would revise the content and include new important topics from time to time. We have added the study of massive network data, and introduced algorithms for handling and analysis of such data. We also have included the road network data querying which has become very useful in our daily life. 2. study research papers related to the topics covered. 3. summarize the major ideas from a study of a topic and give a presentation on the topic 4. possibly come up with innovative ideas and touch on the research of related topics. | 
| Assessment (for reference only) | Essay test or exam:40% Essays:20% Presentation:20% Others:20% | 
| Recommended Reading List | 1. R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, Proceedings of the 20th VLDB Conference, 1994. (ps) (pdf) 2. J. Han, J. Pei, Y. Yin, Mining Frequent Patterns without Candidate Generation, SIGMOD 2000. 3. R. Agrawal, J. Gehrke, D. Gunopulos, P Raghavan, Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications , Proceedings of the ACM SIGMOD, 1998 4. T. Chiueh, Content-Based Image Indexing VLDB 1994. (Skip Section 3.2 on Optimization) 5. Guttman, R-tree: A Dynamic Index Structure for Spatial Searching ACM SIGMOD, 1984. 6. N. Roussopoulos et al, Nearest Neighbor Queries, ACM SIGMOD, 1995. 7. K. Mehlhor et al., External-Memory Breadth-First Search with Sublinear I/O ? ESA 2002. 8. K. Munagala et al., I/O Complexities of Graph Algorithms SODA 1999. 9. S. Chu et al., Triangle Listing in Massive Networks and Its Applications KDD 2011. 10. J. Cheng et al., Efficient Core Decomposition in Massive Networks ICDE 2011. 11. S. J. van Schaik et al., A Memory Efficient Reachability Data Structure Through Bit Vector Compression SIGMOD 2011. 12. Abraham et al., A Hub-Based Labeling Algorithm for Shortest Paths on Road Networks SEA 2011. 13. Abraham et al., VC-Dimension and Shortest Path Algorithms ICALP 2011. 14. A.V. Goldberg et al., Computing the Shortest Path: A∗ Search Meets Graph Theory SODA 2005. 15. R. Geisberger et al., Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks WEA 2008. 16. A.V. Goldberg et al., Reach for A∗: Efficient Point-to-Point Shortest Path Algorithms ALENEX 2006. 17. E. Cohen et al., Reachability and distance queries via 2-hop labels SODA 2002. 18. I. Abraham et al., Highway Dimension, Shortest Paths, and Provably Efficient Algorithms SODA 2010 | 
| CSCIN programme learning outcomes | Course mapping | 
| Upon completion of their studies, students will be able to: | |
| 1. identify, formulate, and solve computer science problems (K/S); | TP | 
| 2. design, implement, test, and evaluate a computer system, component, or algorithm to meet desired needs (K/S); | T | 
| 3. receive the broad education necessary to understand the impact of computer science solutions in a global and societal context (K/V); | TP | 
| 4. communicate effectively (S/V); | TP | 
| 5. succeed in research or industry related to computer science (K/S/V); | T | 
| 6. have solid knowledge in computer science and engineering, including programming and languages, algorithms, theory, databases, etc. (K/S); | TP | 
| 7. integrate well into and contribute to the local society and the global community related to computer science (K/S/V); | |
| 8. practise high standard of professional ethics (V); | |
| 9. draw on and integrate knowledge from many related areas (K/S/V); | TP | 
| Remarks: K = Knowledge outcomes; S = Skills outcomes; V = Values and attitude outcomes; T = Teach; P = Practice; M = Measured | |