Mengqi Huang
Ph.D.
University of Science and Technology of China
443 Huangshan Road, Hefei, China 230027
Email: huangmq@mail.ustc.edu.cn
I received my Ph.D. degree from the University of Science and Technology of China (USTC) in 2025. My research interests include deep generative models, image/video generation, and unified multimodal generation. During my Ph.D., I received the Best Student Paper Award at ACM Multimedia 2022 (a CCF-A international conference) as first author. I was funded in the first round of the National Natural Science Foundation of China Youth Student Basic Research Project (Ph.D. student track), and was selected for the first Doctoral Student Special Program of the China Association for Science and Technology Young Elite Scientists Sponsorship Program (sponsored through the Chinese Institute of Electronics). I received the 2025 President's Special Award of the Chinese Academy of Sciences. On graduation, I was selected for top industry talent programs, including Alibaba Star (阿里星).
Education
University of Science and Technology of China
Ph.D. in Cyber Security, September 2020 - June 2025, Hefei & Beijing. Advisor: Prof. Zhendong Mao
University of Science and Technology of China
B.Eng. in Automation, September 2016 - June 2020, Hefei. Advisor: Prof. Zhendong Mao
Funding
Selected Paper Publications [Google Scholar]
In the Year of 2025:
A4A: Adapter for Adapter Transfer via All-for-All Mapping for Cross-Architecture Models
Keyu Tu, Mengqi Huang, Zhuowei Chen, Zhendong Mao. CVPR 2025, CCF-A
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen, Lei Zhang, Zhendong Mao. CVPR 2025, CCF-A
Dragin3D: Image Editing by Dragging in 3D Space
Weiran Guang, Xiaoguang Gu, Mengqi Huang, Zhendong Mao. CVPR 2025, CCF-A
FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation
Fengyi Fu, Lei Zhang, Mengqi Huang, Zhendong Mao. CVPR 2025, CCF-A
CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
Nan Chen, Mengqi Huang, Zhuowei Chen, Yang Zheng, Lei Zhang, Zhendong Mao. AAAI 2025, CCF-A
In the Year of 2024:
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
Mengqi Huang, Zhendong Mao, Mingcong Liu, Qian He, Yongdong Zhang. CVPR 2024, CCF-A [Github]
Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
Hao Li, Mengqi Huang, Lei Zhang, Bo Hu, Yi Liu, Zhendong Mao. AAAI 2024, CCF-A
DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation
Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao. AAAI 2024, CCF-A [Project Page]
In the Year of 2023 & 2022:
Towards Accurate Image Coding: Improved Autoregressive Image Generation With Dynamic Vector Quantization
(Highlight, 2.6% of submitted papers)
Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang. CVPR 2023, CCF-A [Github] [Video]
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang. CVPR 2023, CCF-A [Github] [Video]
DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation
(Best Student Paper Award, 1/3009 of submitted papers)
Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang. ACM Multimedia 2022, CCF-A
Awards
- President's Special Award of the Chinese Academy of Sciences, 2025
- Best Student Paper Award, ACM Multimedia 2022 (First Author)
Internships
ByteDance Inc.
Research Intern, Intelligent Creation Department, Beijing. July 2023 - Present
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
We present RealCustom, which disentangles subject similarity from text controllability so that both can be optimized simultaneously without conflict. The core idea is to represent given subjects as real words that can be seamlessly integrated with given texts, and to further leverage the relevance between real words and image regions to disentangle the visual condition from the text condition.
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
We propose a highly consistent data synthesis pipeline for single-subject and multi-subject driven generation. The pipeline harnesses the intrinsic in-context generation capabilities of diffusion transformers to generate high-consistency multi-subject paired data. We also introduce UNO, which consists of progressive cross-modal alignment and universal rotary position embedding.
Kuaishou Technology
Research Intern, Search Technology Department, Beijing. March 2022 - November 2022
McMaster University
Research Intern, Computing & Software School (Department), Hamilton, Canada. June 2019 - September 2019
Competitions
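2nd Guangdong-Hong Kong-Macao Greater Bay Area International Algorithm Case Competition, Efficient and Controllable Text-to-Image Generation Methods track. Team Leader, Second Prize. August 2023 - November 2023
1st Xingzhi Cup National Artificial Intelligence Innovation Application Competition, Multimodal Technology Innovation track, Text-Based Image Generation. Team Leader, Second Prize. August 2022 - November 2022
ACM Multimedia 2020 Social Media Prediction Challenge. Team Leader, Top Performance Award. [Github] March 2020 - June 2020
ICIP 2020 Image Popularity Prediction Challenge. Team Leader, Reach Fourth Grade. March 2020 - June 2020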

Last updated: June 2025. Webpage template borrowed from Weinan Zhang.
News
February 2025
4 papers are accepted by CVPR 2025!
December 2024
2 papers are accepted by AAAI 2025!
February 2024
1 paper is accepted by CVPR 2024!
December 2023
2 papers are accepted by AAAI 2024!
March 2023
Our paper "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization" is selected as a highlight at CVPR 2023!
February 2023
2 papers are accepted by CVPR 2023!
October 2022
Our paper "DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation" receives the Best Student Paper Award at ACM Multimedia 2022!
June 2022
1 paper is accepted by ACM Multimedia 2022!