Prof. WU, Xixin 吳錫欣教授
Assistant Professor
BS (Beihang University)
MS (Tsinghua University)
PhD (The Chinese University of Hong Kong)
Research Interests:
* Generative Artificial Intelligence
* Speech and Language Processing for Health
* Affective Computing
* Human-machine Interaction
Office: Room 709B, William M.W. Mong Engineering Building
Tel: (852) 3943-8243
Email: wuxx@se.cuhk.edu.hk
Biography
Xixin Wu received his BS, MS, and PhD degrees from Beihang University, Tsinghua University, and The Chinese University of Hong Kong, respectively. He is currently an Assistant Professor in the Department of Systems Engineering and Engineering Management, CUHK. Prior to that, he worked as a Research Associate with the Machine Intelligence Laboratory, Department of Engineering, University of Cambridge, and as a Research Assistant Professor at the Stanley Ho Big Data Decision Analytics Research Centre, CUHK. His research interests include generative artificial intelligence, speech and language technologies, affective computing, and human-machine interaction.
Selected Publications
Estimating the Uncertainty in Emotion Class Labels With Utterance-Specific Dirichlet Priors, Wen Wu, Chao Zhang, Xixin Wu, Philip C. Woodland, IEEE Transactions on Affective Computing, 2022
Exemplar-based Emotive Speech Synthesis, Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Helen Meng, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 29, 2021
Speech Emotion Recognition Using Sequential Capsule Networks, Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 29, 2021
Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling, Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu*, Xunying Liu, Helen Meng, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 29, 2021 (*corresponding author)
Intonation classification for L2 English speech using multi-distribution deep neural networks, Kun Li, Xixin Wu and Helen Meng, Computer Speech & Language, 2016
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition, Jinchao Li, Xixin Wu*, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng, in Proc. ICASSP’23 (1st place in the ACII 2022 Affective Vocal Bursts (AV-B) Recognition Competition; *corresponding author)
Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks, Jingbei Li, Yi Meng, Xixin Wu*, Zhiyong Wu*, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang, in Proc. ACM MM’22 (*corresponding author)
Exploring linguistic feature and model combination for speech recognition based automatic AD detection, Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng, in Proc. Interspeech’22
Neural Architecture Search for Speech Emotion Recognition, Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng, in Proc. ICASSP’22
Ensemble Approaches for Uncertainty in Spoken Language Assessment, Xixin Wu, Kate M. Knill, Mark J.F. Gales, Andrey Malinin, in Proc. Interspeech’20
Speech Emotion Recognition Using Capsule Networks, Xixin Wu, Songxiang Liu, Yuewen Cao, Xu Li, Jianwei Yu, Dongyang Dai, Xi Ma, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng, in Proc. ICASSP’19
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis, Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu and Helen Meng, in Proc. Interspeech’18
Feature based Adaptation for Speaking Style Synthesis, Xixin Wu, Lifa Sun, Shiyin Kang, Songxiang Liu, Zhiyong Wu, Xunying Liu, Helen Meng, in Proc. ICASSP’18
Awards and Grants
First place in two tasks of the ACII 2022 Affective Vocal Bursts (AV-B) Recognition Competition
First place in the ACL 2022 Doc2Dial Shared Task
Best Paper Award, IEEE ROBIO 2022
Champion, HKSTP SciTech Challenge 2021
RGC General Research Fund: “Older Adult-Facing, Personalized Text-to-Speech Synthesis with Automatic Textual and Prosodic Enhancements for Perceptual Clarity”, 2024-2027, PI.