I am a master's student at the College of Computer Science and Technology, Zhejiang University, majoring in Computer Science.
Currently I work on the Audio Research Team at Zhejiang University, under the supervision of Prof. Zhou Zhao. I previously graduated from the Turing Class, a program established by Chu Kochen Honors College, with a bachelor’s degree in Artificial Intelligence.
My research interests primarily focus on Multi-Modal Generative AI, specifically in Spatial Audio, Music, Singing, and Speech. I have published papers at top international AI conferences, including NeurIPS, ACL, AAAI, and EMNLP. Currently, I am working on Spatial Audio Generation and Immersive Audio Synthesis.
I am actively looking for academic collaborations and research internships. Feel free to contact me via email at panch@zju.edu.cn.
🔥 News
- 2025.05 🎉 2 papers (MESA & ISDrama) are accepted by ACM-MM!
- 2025.05 🎉 2 papers are accepted by ACL (Findings)!
- 2024.12 1 paper is accepted by AAAI!
- 2024.10 🎉 I was awarded the Chu Kochen Scholarship!
- 2024.09 🎉 GTSinger is accepted by NeurIPS 2024 (spotlight) and 1 paper is accepted by EMNLP 2024!
📝 Publications
# denotes co-first authors
🎙 Singing Voice Synthesis

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Yu Zhang, Changhao Pan#, Wenxiang Guo#, et al.

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation
Wenxiang Guo#, Yu Zhang#, Changhao Pan#, et al.
- STARS is a unified framework for singing transcription, alignment, and refined style annotation based on hierarchical representation learning.

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.
- TCSinger 2 is a multi-task multilingual zero-shot SVS model with style transfer and style control based on various prompts.
- EMNLP-2024. TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control, Yu Zhang, Ziyue Jiang, Ruiqi Li, Changhao Pan, Jinzheng He, Rongjie Huang, Chuxin Wang, Zhou Zhao.
- AAAI-2025. TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching, Wenxiang Guo, Yu Zhang, Changhao Pan, et al.
👂 Spatial Audio

A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference
Changhao Pan#, Wenxiang Guo, Yu Zhang, et al.
- PSA-MOS provides 50 hours of high-quality spatial audio recordings, with detailed localization annotations and fine-grained MOS ratings.
- MESA is a multimodal evaluation framework for spatial audio playback systems which exhibits strong correlation with human perceptual assessments.

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.
- MRSDrama is the first multimodal recorded spatial drama dataset, containing binaural drama audios, scripts, videos, geometric poses, and textual prompts.
- ISDrama is the first immersive spatial drama generation model through multimodal prompting.

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
Wenxiang Guo#, Changhao Pan#, Zhiyuan Zhu#, Xintong Hu#, et al.
- The largest recorded spatial audio dataset contains four scenarios: daily life, singing, music, and speech, with a total duration of 500 hours.
- Supports multiple spatial audio tasks: audio spatialization, spatial TTA, sound event localization and detection (SELD), etc.
🎼 Music Generation

Versatile Framework for Song Generation with Prompt-based Control
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.
- VersBand is a multi-task song generation framework for synthesizing high-quality, aligned songs with prompt-based control.
Others
- IEEE-TVCG. Interactive Table Synthesis with Natural Language, Yanwei Huang, Yunfan Zhou, Ran Chen, Changhao Pan, Xinhuan Shu, Di Weng, Yingcai Wu.
🎖 Honors and Awards
- Chu Kochen Scholarship (as undergraduate), 2024
  - Highest scholarship at Zhejiang University
- Chinese National Scholarship, 2022, 2023, 2024
  - Awarded by Ministry of Education of China; Top 1%; 3 consecutive times
- CCF Outstanding Undergraduate Students, 2024
  - Awarded to 100 undergraduates in the field of Computer Science
- Top-10 Outstanding Undergraduate Students of the College of Computer Science and Technology, 2024
- Top-10 Outstanding Students of the Chu Kochen Honors College, 2023
- BaoGang Elite Scholarship, 2023
📖 Education
- 2025.09 - 2028.06 (Expected), Master, College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang
  - Major: Computer Science
- 2021.09 - 2025.06, Undergraduate, Chu Kochen Honors College & College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang
  - Major: Artificial Intelligence (Turing Honor Program)
  - GPA: 4.85/5.0, 94.04/100, Rank: 1/79
- 2018.09 - 2021.06, Yuying Experimental School, Wenzhou, Zhejiang
💻 Research & Internships
- 2023.01-2023.09 Research Assistant in State Key Lab of CAD&CG at Zhejiang University
  - Advisor: Prof. Yingcai Wu (巫英才).
- 2023.09-2024.06 Research Assistant in Audio Research Team at Zhejiang University
  - Advisor: Prof. Zhou Zhao (赵洲).
📚 Academic Service
- Conference Reviewer: NeurIPS 2025; ACL 2025
- Assisted in reviewing: NeurIPS 2024; CVPR 2025; ACM-MM 2025; EMNLP 2025; TMM