I am currently an ELLIS PhD student jointly supervised by Prof. Renaud Detry at KU Leuven and Prof. Luc Van Gool at INSAIT. My research focuses on Cross-modality Representation for Robotic Policy Learning.
I graduated from Beihang University (Beijing University of Aeronautics and Astronautics, BUAA), earning a Bachelor’s degree from ShenYuan Honors College and a Master’s degree from the University’s Robotics Institute, supervised by Prof. Wang Wei. Prior to beginning my Ph.D. program, I also gained valuable research experience at Samsung Research and ETH Zurich.
While my current research focuses on data-driven perception and control (people call it Embodied AI nowadays…), I also have a strong passion for hardware design and real-world experiments, stemming from my background as an engineering student who started out in the field of Mechatronics.
For more details, please check my published papers on Robot Perception and Robot Learning. You can also have a look at my earlier tiny projects on robotics and automation.
📝 Publications
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches
Yutong Hu*, Pinhao Song, Kehan Wen, Renaud Detry
M3PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model
Kehan Wen†, Yutong Hu†, Yao Mu*, Lei Ke*
- Enhance a Masked Transformer for offline RL by using the versatile capabilities of the model itself for predictive control at runtime.
- Achieve better performance in offline RL and offline-to-online RL for both simulated and real-world robotic tasks, with additional goal-reaching capabilities.
DexDribbler: Learning Dexterous Soccer Manipulation via Dynamic Supervision
Yutong Hu*, Kehan Wen, Fisher Yu, Yifan Liu
- Enable a quadrupedal robot to dribble and kick a soccer ball using only an ego-vision camera and onboard sensors.
- Transfer the skill from Sim to Real by using a virtual feedback controller to guide the deep reinforcement learning of an implicit policy network in massively parallel simulation.
Making Parameterization and Constrains of Object Landmark Globally Consistent via SPD(3) Manifold
Yutong Hu, Wei Wang*
- Propose a monocular SLAM system that provides a map with semantically meaningful ellipsoid landmarks representing objects in indoor scenes.
- Further improve the representation of the 9-DOF object landmarks (rotation, translation, scale), resulting in an Object SLAM system with a faster and more accurate manifold optimization process for back-end mapping.
RA-L & ICRA 2022
SO-SLAM: Semantic Object SLAM with Scale Proportional and Symmetrical Texture Constraints
Ziwei Liao, Yutong Hu, Jiadong Zhang, Xianyu Qi, Xiaoyu Zhang, Wei Wang*
⚙️ Tiny Projects

Dragon-like Worm
📖 Education
- 2024.11 - 2025 (Now): Ph.D. candidate, Research Unit of Robotics, Automation and Mechatronics (RAM), KU Leuven
- 2020.09 - 2023.03: M.Phil., the Robotics Institute, School of Mechanical Engineering and Automation, Beihang University
- 2016.09 - 2020.06: B.Eng., ShenYuan Honors College, Beihang University
💻 Internships
- 2023.07 - 2024.10: Research Assistant @ Visual Intelligence and Systems Group, Computer Vision Lab, ETH Zurich.
- 2022.05 - 2022.11: Research Intern @ Samsung Research Center, Beijing.
🎖 Honors and Awards
- 2023.03: Graduate Excellence in Beijing Province
- 2023.01: Samsung 2022 Best Intern
- 2023.01: Outstanding Master’s Thesis Award at Beihang University
- 2022.12: Robotics Institute Founder’s Scholarship (1/year)
- 2016, 2018-2019, 2020, 2022: Annual Full Scholarship