I'm Chunhui Zhang, a Ph.D. candidate in Computer Science at Dartmouth 🌲, working with 🌟Professor Soroush Vosoughi. I also hold an MSCS degree (research-based) from Brandeis University, where I received the GSAS Fellowship, and a Bachelor's degree in CS from Northeastern University, where I received the Outstanding Honor Thesis Award.
My research focuses on advancing the intrinsic properties of deep learning across diverse modalities, with an emphasis on trustworthiness, scalability, and applicability to real-world challenges. Highlights of my work include:
- Overcoming Multi-step Complexity in Theory-of-Mind Reasoning: A Scalable Bayesian Planner
  Conference: ICML 2025, Spotlight (Top 2.59%).
  Authors: Chunhui Zhang, Zhongyu Ouyang, Kwonjoon Lee, Nakul Agarwal, Sean Dae Houlihan, Soroush Vosoughi, Shao-Yuan Lo
- Growing Through Experience: Scaling Episodic Grounding in Language Models
  Conference: ACL 2025, Oral Presentation (Top 3.24%).
  Authors: Chunhui Zhang, Sirui Wang, Zhongyu Ouyang, Xiangchi Yuan, Soroush Vosoughi
- Pretrained Image-Text Models are Secretly Video Captioners
  Conference: NAACL 2025, Oral Presentation (Top 2.88%).
  Authors: Chunhui Zhang*, Yiren Jian*, Zhongyu Ouyang, Soroush Vosoughi
- Knowing More, Acting Better: Hierarchical Representation for Embodied Decision-Making for PPO Training
  Conference: Findings of EMNLP 2025.
  Authors: Chunhui Zhang, Zhongyu Ouyang, Xingjian Diao, Zheyuan Liu, Soroush Vosoughi
- Superficial Self-Improved Reasoners Benefit from Model Merging
  Conference: EMNLP 2025.
  Authors: Xiangchi Yuan, Chunhui Zhang, Zheyuan Liu, Dachuan Shi, Soroush Vosoughi, Wenke Lee
- Temporal Working Memory: Query-Guided Temporal Segment Refinement for Enhanced Multimodal Understanding
  Conference: Findings of NAACL 2025.
  Authors: Chunhui Zhang*, Xingjian Diao*, Weiyi Wu, Zhongyu Ouyang, Peijun Qing, Ming Cheng, Soroush Vosoughi, Jiang Gui
- Working Memory Identifies Reasoning Limits in Language Models
  Conference: EMNLP 2024.
  Authors: Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi
- Amazon Science (Sept 2025 – Present)
  Applied Scientist Intern, Seattle, WA
  Research on reinforcement learning post-training for GUI-based agents, focusing on adaptive reasoning and long-horizon memory in multimodal environments.
- Google DeepMind (Jun 2025 – Sept 2025)
  Research Intern, Mountain View, CA
  Developed a high-throughput RL training pipeline for the Gemma-3n family, achieving a 5× speedup via KV-cache reuse in audio–text long-context scenarios. Contributed to multimodal Gemma-3n open-source tools.
- Honda Research Institute USA (May 2024 – Sept 2024)
  Research Intern, San Jose, CA
  Worked on multimodal long-context modeling with context-parallel and ring-attention architectures, improving alignment across audio, video, and text modalities for real-world perception tasks.
I am currently exploring multimodal LLMs (language–vision–audio), memory mechanisms, and reinforcement learning to uncover genuinely new patterns in real-world data. My recent work includes post-training recipes for large-scale models that ranked Top-2 on the Papers with Code Video Captioning Leaderboard, demonstrating effective strategies for allocating resources during post-training.
- Email: [email protected]
- LinkedIn: Chunhui Zhang
- GitHub: chunhuizng
- Google Scholar: My Publications
Feel free to reach out if you're interested in collaboration, career advice, or just a friendly chat about research and life!