Research Interests
- Speech and audio signal processing:
speech analysis, pitch estimation, speech enhancement/separation
- Spoken language technology: automatic
speech/speaker/language recognition, text-to-speech
- Deep learning for speech: acoustic
model adaptation, deep factorization, information disentanglement
- Speech and healthcare: hearing
assistive devices, assessment of speech/language disorders
- Paralinguistics in speech:
emotion, expressiveness, speaking style, attitude, empathy
- Cantonese: linguistic properties,
speech and language corpora, code-mixing, written vs. spoken
Knowledge and Technology Transfer
Funded Projects (recent)
- Objective assessment of physical competence
and wellness based on voice and speech analytics, HK$602,349 by RGC
General Research Fund, 2021 - 2023.
- Quantifying effectiveness of psychotherapy
with deep learning based speech analytics, by Sustainable Research
Fund of CUHK Research Committee, 2019 - 2021.
- Personalized Storytelling System Based on
Expressive Voice Creation by Deep Learning, HK$1,361,600 by
Innovation and Technology Fund (Tier 3 scheme), 2019 - 2020.
- Unsupervised Speech Modeling for Low-Resource
Languages, HK$675,647 by RGC General Research Fund, 2017 - 2019.
- Development of Computer-based Tools for
Clinical Assessment of Speech, Hearing and Language Disabilities,
HK$1,384,000 by Innovation and Technology Fund (Tier 3 scheme),
2015 - 2016.
- Objective Assessment of Pathological Voices
based on Acoustic Signal Analysis and Classification, HK$500,000 by
RGC General Research Fund, 2015 - 2017.
- Periodicity Enhancement and Phonemic
Restoration for Improving Speech Perception by Hearing Impaired
Listeners, HK$922,000 by RGC General Research Fund, 2012 - 2014.
Selected Recent Publications
- Matthew King-Hang Ma, Manson Cheuk-Man Fong,
Chenwei Xie, Tan Lee, Guanrong Chen and William Shiyuan Wang,
"Regularity and randomness in ageing: Differences in resting-state
EEG complexity measured by largest Lyapunov exponent," in
Neuroimage: Reports, December 2021.
- Daxin Tan and Tan Lee, "Fine-grained style
modeling, transfer and prediction in text-to-speech synthesis via
phone-level content-style disentanglement," Proceedings of
INTERSPEECH 2021, pp.4683-4687, Brno, August 2021.
- Xurong Xie, Xunying Liu, Tan Lee and Lan Wang,
"Bayesian learning for deep neural network adaptation," in IEEE/ACM
Transactions on Audio, Speech, and Language Processing, vol. 29, pp.
2096-2110, 2021.
- Guangyan Zhang, Ying Qin and Tan Lee,
"Learning syllable-level discrete prosodic representation for
expressive speech generation," Proceedings of INTERSPEECH 2020,
Shanghai, October 2020.
- Jingyu Li and Tan Lee, "Text-independent
speaker verification with dual attention network," Proceedings of
INTERSPEECH 2020, Shanghai, October 2020.
- Si Ioi Ng, Cymie Wing-Yee Ng, Jiarui Wang, Tan
Lee, Kathy Yuet-Sheung Lee and Michael Chi-Fai Tong, "CUCHILD: A
large-scale Cantonese corpus of child speech for phonology and
articulation assessment," Proceedings of INTERSPEECH 2020, Shanghai,
October 2020.
- Yuzhong Wu and Tan Lee, "Time-frequency
feature decomposition based on sound duration for acoustic scene
classification," Proceedings of ICASSP 2020, pp.716-720, Virtual
Barcelona, Spain, May 2020.
- Zhiyuan Peng, Siyuan Feng and Tan Lee,
"Mixture factorized auto-encoder for unsupervised hierarchical deep
factorization of speech signal," Proceedings of ICASSP 2020,
pp.6769-6773, Virtual Barcelona, Spain, May 2020.
- Ying Qin, Tan Lee and Anthony P. H. Kong,
"Automatic assessment of speech impairment in Cantonese-speaking
people with aphasia," IEEE Journal of Selected Topics in Signal
Processing, Vol.14, No.2, pp.331-345, February 2020.
- Siyuan Feng and Tan Lee, "Exploiting
cross-lingual speaker and phonetic diversity for unsupervised
subword modeling," IEEE/ACM Transactions on Audio, Speech, and
Language Processing, Vol.27, No.12, pp.2000-2011, December 2019.
- Yuanyuan Liu, Tan Lee, Thomas K.T. Law and
Kathy Y.S. Lee, "Acoustical assessment of voice disorder with
continuous speech using ASR posterior features," IEEE/ACM
Transactions on Audio, Speech, and Language Processing, Vol.27,
No.6, pp.1047-1059, June 2019.
- Hansjoerg Mixdorff, Angelika Hoenemann, Albert
Rilliard, Tan Lee and Matthew K.H. Ma, "Audio-visual expressions of
attitude: How many different attitudes can perceivers decode?"
Speech Communication, Vol.95, pp.114-126, December 2017.