Abstract

We introduce a novel system for human-to-robot trajectory transfer that enables robots to manipulate objects by learning from human demonstration videos. The system consists of four modules. The first is a data collection module that captures human demonstration videos from the robot's point of view using an AR headset. The second is a video understanding module that detects objects and extracts 3D human-hand trajectories from the demonstration videos. The third module transfers a human-hand trajectory into a reference trajectory for the robot end-effector in 3D space. The last module uses trajectory optimization to solve for a trajectory in the robot configuration space that follows the end-effector trajectory transferred from the human demonstration. Together, these modules enable a robot to watch a human demonstration video once and then repeat the same mobile manipulation task in different environments, even when objects are placed differently than in the demonstration.
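To make the four-module pipeline concrete, here is a minimal Python sketch of how the stages compose. All class and function names (e.g., extract_hand_trajectory, transfer_to_end_effector) are hypothetical placeholders for illustration and stand in for the actual perception, transfer, and optimization components; they are not the released API.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class HandTrajectory:
    """3D human-hand poses extracted from a demonstration video (hypothetical container)."""
    poses: np.ndarray  # (T, 4, 4) homogeneous hand transforms over time


@dataclass
class EndEffectorTrajectory:
    """Reference 3D trajectory for the robot end-effector."""
    poses: np.ndarray  # (T, 4, 4)


def extract_hand_trajectory(video_frames: List[np.ndarray]) -> HandTrajectory:
    """Module 2 (placeholder): detect objects and track the 3D human hand in the video.
    Here we simply return an identity pose per frame as a stand-in."""
    T = len(video_frames)
    return HandTrajectory(poses=np.tile(np.eye(4), (T, 1, 1)))


def transfer_to_end_effector(hand: HandTrajectory) -> EndEffectorTrajectory:
    """Module 3 (placeholder): map the human-hand trajectory to a robot end-effector
    reference, e.g. by applying an assumed rigid hand-to-gripper offset."""
    offset = np.eye(4)  # assumed fixed transform between hand frame and gripper frame
    return EndEffectorTrajectory(poses=hand.poses @ offset)


def optimize_joint_trajectory(ee_traj: EndEffectorTrajectory, n_joints: int = 7) -> np.ndarray:
    """Module 4 (placeholder): solve for a configuration-space trajectory that tracks the
    end-effector references. A real system would run trajectory optimization / IK;
    here we just return a zero trajectory of the right shape."""
    return np.zeros((ee_traj.poses.shape[0], n_joints))


if __name__ == "__main__":
    # Module 1 would record this video through an AR headset from the robot's viewpoint;
    # here we fabricate a short dummy clip of blank frames.
    demo_video = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(10)]

    hand_traj = extract_hand_trajectory(demo_video)   # module 2: video understanding
    ee_traj = transfer_to_end_effector(hand_traj)     # module 3: trajectory transfer
    joint_traj = optimize_joint_trajectory(ee_traj)   # module 4: trajectory optimization
    print(joint_traj.shape)  # (10, 7)
```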

HRT1 Overview

Real World Execution

16 tasks, 3 trials each
Base movement: 5× speed; task execution: 1× speed

BibTeX

Please cite our work if it helps your research:
@misc{2025hrt1,
  title={HRT1: One-Shot Human-to-Robot Trajectory Transfer for Mobile Manipulation}, 
  author={Sai Haneesh Allu* and Jishnu Jaykumar P* and Ninad Khargonkar and Tyler Summers and Jian Yao and Yu Xiang},
  year={2025},
  eprint={2510.21026},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2510.21026}, 
}

Contact

Send any comments or questions to Sai | Jishnu:
saihaneesh.allu@utdallas.edu | jishnu.p@utdallas.edu

Acknowledgements

This work was supported in part by the National Science Foundation (NSF) under Grant Nos. 2346528 and 2520553, the NVIDIA Academic Grant Program Award, and gift funding from XPeng.