I am a master's student at the College of Computer Science and Technology, Zhejiang University, majoring in Computer Science.

Currently, I work on the Audio Research Team at Zhejiang University under the supervision of Prof. Zhou Zhao. I previously graduated from the Turing Class, a program established by Chu Kochen Honors College, with a bachelor’s degree in Artificial Intelligence.

My research focuses primarily on Multi-Modal Generative AI, specifically Spatial Audio, Music, Singing, and Speech. I have published papers at top international AI conferences, including NeurIPS, ACL, AAAI, and EMNLP. Currently, I am working on Spatial Audio Generation and Immersive Audio Synthesis.

I am actively looking for academic collaborations and research internships. Feel free to contact me by email at panch@zju.edu.cn.

🔥 News

  • 2025.05 🎉 2 papers (MESA & ISDrama) were accepted by ACM-MM!
  • 2025.05 🎉 2 papers were accepted by ACL (Findings)!
  • 2024.12 1 paper was accepted by AAAI!
  • 2024.10 🎉 I was awarded the Chu Kochen Scholarship!
  • 2024.09 🎉 GTSinger was accepted by NeurIPS 2024 (Spotlight), and 1 paper was accepted by EMNLP 2024!

📝 Publications

# denotes co-first authors

🎙 Singing Voice Synthesis

NeurIPS 2024 (Spotlight)

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Yu Zhang, Changhao Pan#, Wenxiang Guo#, et al.

Hugging Face Demo

  • GTSinger is a large Global, multi-Technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks.
  • Our work has been covered by multiple media outlets and forums, including Weixin and Zhihu.

ACL 2025 (Findings)

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation
Wenxiang Guo#, Yu Zhang#, Changhao Pan#, et al.

Project

  • STARS is a unified framework for singing transcription, alignment, and refined style annotation based on hierarchical representation learning.

ACL 2025 (Findings)

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.

Project

  • TCSinger 2 is a multi-task multilingual zero-shot SVS model with style transfer and style control based on various prompts.

👂 Spatial Audio

ACM-MM 2025

A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference
Changhao Pan#, Wenxiang Guo, Yu Zhang, et al.

Hugging Face Project

  • PSA-MOS provides 50 hours of high-quality spatial audio recordings, with detailed localization annotations and fine-grained MOS ratings.
  • MESA is a multimodal evaluation framework for spatial audio playback systems that correlates strongly with human perceptual assessments.

ACM-MM 2025

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.

Hugging Face Project

  • MRSDrama is the first multimodal recorded spatial drama dataset, containing binaural drama audio, scripts, videos, geometric poses, and textual prompts.
  • ISDrama is the first immersive spatial drama generation model through multimodal prompting.

Submitted to NeurIPS 2025

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
Wenxiang Guo#, Changhao Pan#, Zhiyuan Zhu#, Xintong Hu#, et al.

Hugging Face Demo

  • MRSAudio, the largest recorded spatial audio dataset, covers four scenarios (daily life, singing, music, and speech) with a total duration of 500 hours.
  • It supports multiple spatial audio tasks, including audio spatialization, spatial TTA, and sound event localization and detection (SELD).

🎼 Music Generation

Preprint

Versatile Framework for Song Generation with Prompt-based Control
Yu Zhang#, Wenxiang Guo#, Changhao Pan#, et al.

Project

  • VersBand is a multi-task song generation framework for synthesizing high-quality, aligned songs with prompt-based control.

Others

🎖 Honors and Awards

  • Chu Kochen Scholarship (as an undergraduate), 2024
    • Highest scholarship at Zhejiang University
  • Chinese National Scholarship, 2022, 2023, 2024
    • Awarded by Ministry of Education of China; Top 1%; 3 consecutive times
  • CCF Outstanding Undergraduate Students, 2024
    • Awarded to 100 undergraduates in the field of Computer Science
  • Top-10 Outstanding Undergraduate Students of the College of Computer Science and Technology, 2024
  • Top-10 Outstanding Students of the Chu Kochen Honors College, 2023
  • BaoGang Elite Scholarship, 2023

📖 Education

  • 2025.09 - 2028.06 (Expected), Master, College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang
    • Major: Computer Science
  • 2021.09 - 2025.06, Undergraduate, Chu Kochen Honors College & College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang
    • Major: Artificial Intelligence (Turing Honor Program)
    • GPA: 4.85/5.0, 94.04/100, Rank: 1/79
  • 2018.09 - 2021.06, Yuying Experimental School, Wenzhou, Zhejiang

💻 Research & Internships

📚 Academic Service

  • Conference Reviewer: NeurIPS 2025; ACL 2025
  • Assistant Reviewer: NeurIPS 2024; CVPR 2025; ACM-MM 2025; EMNLP 2025; TMM