
Chaohui Yu (于超辉)

DAMO Academy, Alibaba Group

About Me

I'm an algorithm engineer at DAMO Academy, Alibaba Group. Before this, I received my master's degree from the Institute of Computing Technology (ICT) in 2020 and my bachelor's degree from Shandong University in 2017. My research interests include transfer learning, object detection and segmentation, semi- and self-supervised learning, multimodal learning, image/video/3D/4D generation, and related applications.

News
  • [Feb. 2025] One paper accepted to CVPR 2025!
  • [Oct. 2024] Two papers accepted to NeurIPS 2024!
  • [July 2024] Three papers accepted to ECCV 2024!

Academic Service
  • Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, ACM MM, AAAI
  • Journal Reviewer: TCSVT, JCST

Education & Experiences
    DAMO Academy, Alibaba Group
    Beijing, China
    July 2020 – Present
    Algorithm Expert at DAMO Academy
    Institute of Computing Technology (ICT), Chinese Academy of Sciences
    Beijing, China
    Sep. 2017 – Jun. 2020
    M.S. in Computer Science.
    DAMO Academy, Alibaba Group
    Beijing, China
    Jun. 2019 – Sep. 2019
    Algorithm intern at Mind.
    Face++
    Beijing, China
    Mar. 2018 – Dec. 2018
    Algorithm intern at Detection Group.
    DeepGlint
    Beijing, China
    Oct. 2016 – Feb. 2017
    Algorithm intern at Algorithm Group.
    Intel
    Beijing, China
    July 2016 – Oct. 2016
    Development intern at Linux Kernel Dev and Test Group.
    Shandong University
    Shandong, China
    Sep. 2013 – Jun. 2017
    B.E. in Communication Engineering

Awards & Scholarships
  • First place in the LUAI Challenge on Learning to Understand Aerial Images, ICCV Workshop 2021. [LINK]
  • National Scholarship for Master's Students, Ministry of Education, 2019.
  • Best Application Paper Award, IJCAI-19 Federated Machine Learning Workshop, 2019. [LINK]

Talks
  • China3DV 2025: Explorations and Applications of 3D/4D Generation. [LINK]

Recent Publications

    MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

    Chenjie Cao, Chaohui Yu, Shang Liu, Fan Wang, Xiangyang Xue, Yanwei Fu.

    The Conference on Computer Vision and Pattern Recognition (CVPR-25)

    [PDF] [Project Page] [CODE]

    LPM: Efficient 3D Content Creation from Single Image by Large-Scale Partial 3D Modeling

    Yisu Zhang, Chaohui Yu, Fan Wang, Jianke Zhu.

    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT-25)

    [PDF]

    Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

    Yanqin Jiang*, Chaohui Yu*, Chenjie Cao, Fan Wang, Weiming Hu, Jin Gao.

    Neural Information Processing Systems. (NeurIPS-24)

    [PDF] [Project Page]

    MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing

    Chenjie Cao, Chaohui Yu, Yanwei Fu, Fan Wang, Xiangyang Xue.

    Neural Information Processing Systems. (NeurIPS-24)

    [PDF] [Project Page]

    SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

    Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai.

    European Conference on Computer Vision. (ECCV-24)

    [PDF] [Project Page]

    VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing

    Shang Liu, Chaohui Yu, Chenjie Cao, Wen Qian, Fan Wang

    European Conference on Computer Vision. (ECCV-24)

    [PDF]

    MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis

    Ziming Zhong, Yanxu Xu, Jing Li, Jiale Xu, Zhengxin Li, Chaohui Yu, Shenghua Gao

    European Conference on Computer Vision. (ECCV-24)

    [PDF] [CODE]

    Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation

    Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang

    The 31st ACM International Conference on Multimedia. (ACMMM-23)

    [PDF]

    RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension

    Qiang Zhou, Chaohui Yu, Shaofeng Zhang, Sitong Wu, Zhibin Wang, Fan Wang

    Preprint

    [PDF]

    Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

    Chaohui Yu, Qiang Zhou, Jingliang Li, Jianlong Yuan, Zhibin Wang, Fan Wang

    The Conference on Computer Vision and Pattern Recognition (CVPR-23)

    [PDF]

    LMSeg: Language-guided Multi-dataset Segmentation

    Qiang Zhou, Yuang Liu, Chaohui Yu, Jingliang Li, Zhibin Wang, Fan Wang

    The International Conference on Learning Representations 2023. (ICLR-23)

    [PDF]

    MimCo: Masked Image Modeling Pre-training with Contrastive Teacher

    Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li

    The 30th ACM International Conference on Multimedia. (ACMMM-22)

    [PDF]

    Object Detection Made Simpler by Eliminating Heuristic NMS

    Qiang Zhou*, Chaohui Yu*, Chunhua Shen*, Zhibin Wang, Hao Li

    IEEE Transactions on Multimedia (TMM-23)

    [PDF]

    Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

    Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

    The Conference on Computer Vision and Pattern Recognition (CVPR-21)

    [PDF] [CODE]

    Transfer Learning with Dynamic Adversarial Adaptation Network

    Chaohui Yu, Jindong Wang, Yiqiang Chen, Meiyu Huang

    Proceedings of IEEE International Conference on Data Mining (ICDM-19)

    [PDF] [CODE]