Research Interests

Speech and audio signal processing: speech analysis, pitch estimation, speech enhancement/separation
Spoken language technology: automatic speech/speaker/language recognition, text-to-speech
Deep learning for speech: acoustic model adaptation, deep factorization, information disentanglement
Speech and healthcare: hearing assistive devices, assessment of speech/language disorders
Paralinguistics in speech: emotion, expressiveness, speaking style, attitude, empathy
Cantonese: linguistic properties, speech and language corpora, code-mixing, written vs. spoken

Knowledge and Technology Transfer

Personalized speaking aids
Spoken language technologies: CUCorpora, CUCall, CURSBB, CUTalk, CU2C, CUMIX, CUProsody
Personalized hearing devices: ACEHearing, AumeoAudio, Heari
Computerized version of Cantonese basic speech perception test (CBSPT)
CANtonese DIsyllabic Word Identification Test in Noise - Adaptive [CANDIWIT-N-A]

Funded Projects (recent)

Objective assessment of physical competence and wellness based on voice and speech analytics, HK$602,349 by RGC General Research Fund, 2021 - 2023.
Quantifying effectiveness of psychotherapy with deep learning based speech analytics, by Sustainable Research Fund of CUHK Research Committee, 2019 - 2021.
Personalized Storytelling System Based on Expressive Voice Creation by Deep Learning, HK$1,361,600 by Innovation and Technology Fund (Tier 3 scheme), 2019 - 2020.
Unsupervised Speech Modeling for Low-Resource Languages, HK$675,647 by RGC General Research Fund, 2017 - 2019.
Development of Computer-based Tools for Clinical Assessment of Speech, Hearing and Language Disabilities, HK$1,384,000 by Innovation and Technology Fund (Tier 3 scheme), 2015 - 2016.
Objective Assessment of Pathological Voices based on Acoustic Signal Analysis and Classification, HK$500,000 by RGC General Research Fund, 2015 - 2017.
Periodicity Enhancement and Phonemic Restoration for Improving Speech Perception by Hearing Impaired Listeners, HK$922,000 by RGC General Research Fund, 2012 - 2014.

Selected Recent Publications

Matthew King-Hang Ma, Manson Cheuk-Man Fong, Chenwei Xie, Tan Lee, Guanrong Chen and William Shiyuan Wang, "Regularity and randomness in ageing: Differences in resting-state EEG complexity measured by largest Lyapunov exponent," in Neuroimage: Reports, December 2021.
Daxin Tan and Tan Lee, "Fine-grained style modeling, transfer and prediction in text-to-speech synthesis via phone-level content-style disentanglement," Proceedings of INTERSPEECH 2021, pp.4683-4687, Brno, August 2021.
Xurong Xie, Xunying Liu, Tan Lee and Lan Wang, "Bayesian learning for deep neural network adaptation," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2096-2110, 2021.
Guangyan Zhang, Ying Qin and Tan Lee, "Learning syllable-level discrete prosodic representation for expressive speech generation," Proceedings of INTERSPEECH 2020, Shanghai, October 2020.
Jingyu Li and Tan Lee, "Text-independent speaker verification with dual attention network," Proceedings of INTERSPEECH 2020, Shanghai, October 2020.
Si Ioi Ng, Cymie Wing-Yee Ng, Jiarui Wang, Tan Lee, Kathy Yuet-Sheung Lee and Michael Chi-Fai Tong, "CUCHILD: A large-scale Cantonese corpus of child speech for phonology and articulation assessment," Proceedings of INTERSPEECH 2020, Shanghai, October 2020.
Yuzhong Wu and Tan Lee, "Time-frequency feature decomposition based on sound duration for acoustic scene classification," Proceedings of ICASSP 2020, pp.716-720, Virtual Barcelona, Spain, May 2020.
Zhiyuan Peng, Siyuan Feng and Tan Lee, "Mixture factorized auto-encoder for unsupervised hierarchical deep factorization of speech signal," Proceedings of ICASSP 2020, pp.6769-6773, Virtual Barcelona, Spain, May 2020.
Ying Qin, Tan Lee and Anthony P. H. Kong, "Automatic assessment of speech impairment in Cantonese-speaking people with aphasia," IEEE Journal of Selected Topics in Signal Processing, Vol.14, No.2, pp.331-345, February 2020.
Siyuan Feng and Tan Lee, "Exploiting cross-lingual speaker and phonetic diversity for unsupervised subword modeling," IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol.27, No.12, pp.2000-2011, December 2019.
Yuanyuan Liu, Tan Lee, Thomas K.T. Law and Kathy Y.S. Lee, "Acoustical assessment of voice disorder with continuous speech using ASR posterior features," IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol.27, No.6, pp.1047-1059, June 2019.
Hansjoerg Mixdorff, Angelika Hoenemann, Albert Rilliard, Tan Lee and Matthew K.H. Ma, "Audio-visual expressions of attitude: How many different attitudes can perceivers decode?" Speech Communication, Vol.95, pp.114-126, December 2017.