By Stanford University
HumanPlus is an open-source humanoid robot developed by researchers at Stanford University. It can learn and accurately reproduce complex sequences of human movements, including in real time.
The robot is built on Unitree’s commercial H1 humanoid, but with a completely different software platform and several additional third-party components. HumanPlus is equipped with dexterous hands made by Inspire-Robots and wrists from another supplier, and stands about 175 cm tall.
One of HumanPlus’s key features is that it needs only 40 hours of video footage of people performing tasks to learn how to reproduce those actions on its own. In addition, thanks to its built-in camera, the robot can track a person’s movements in real time and imitate them immediately.
HumanPlus is open-source, which means that anyone can modify its structure and functionality. The developers have published a GitHub repository with detailed documentation, allowing engineers and enthusiasts to build a similar model themselves.
The total cost of the components used to build HumanPlus is publicly documented, with rough calculations putting it at $107,945. Given the robot’s wide range of motion, this figure looks quite competitive compared with alternatives on the market today.
The introduction of HumanPlus coincided with Elon Musk’s announcement that more than 1,000 Optimus humanoid robots would be working in Tesla factories by the end of the year.
Project Website: https://humanoid-ai.github.io/
The code contains the implementation of the Humanoid Shadowing Transformer (HST) and the Humanoid Imitation Transformer (HIT), along with instructions for whole-body pose estimation and the associated hardware codebase.
Reinforcement learning in simulation is based on legged_gym and rsl_rl.
Install IsaacGym Preview 4 first from the official NVIDIA source, and place the isaacgym folder inside the HST folder.
cd HST/rsl_rl && pip install -e .
cd ../legged_gym && pip install -e .
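Before training, it can help to confirm that the simulator and both packages import cleanly. These quick checks are a minimal sanity test of our own, not part of the official instructions:
python -c "from isaacgym import gymapi; print('isaacgym OK')"
python -c "import rsl_rl, legged_gym; print('rsl_rl and legged_gym OK')"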
To train HST:
python legged_gym/scripts/train.py --run_name 0001_test --headless --sim_device cuda:0 --rl_device cuda:0
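If the fork keeps legged_gym’s default TensorBoard logging and log directory (an assumption; the repo may log to Weights & Biases or a different path instead), training curves can be viewed from the HST/legged_gym directory with:
tensorboard --logdir logs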
To play a trained policy:
python legged_gym/scripts/play.py --run_name 0001_test --checkpoint -1 --headless --sim_device cuda:0 --rl_device cuda:0
Imitation learning in the real world is based on the ACT and Mobile ALOHA repositories.
conda create -n HIT python=3.8.10
conda activate HIT
pip install torchvision
pip install torch
pip install pyquaternion
pip install pyyaml
pip install rospkg
pip install pexpect
pip install mujoco==2.3.7
pip install dm_control==1.0.14
pip install opencv-python
pip install matplotlib
pip install einops
pip install packaging
pip install h5py
pip install ipython
pip install getkey
pip install wandb
pip install chardet
pip install h5py_cache
cd HIT/detr && pip install -e .
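After the installs finish, a quick sanity check of our own (not part of the original instructions) is to confirm that PyTorch sees the GPU and that the simulation dependencies import:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import mujoco, dm_control, h5py, einops; print('HIT dependencies OK')"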
Collect your own data or download our dataset from here and place it in the HIT folder.
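Since the environment installs h5py, the episodes are presumably stored as HDF5 files, as in ACT and Mobile ALOHA. Below is a minimal sketch for inspecting one file with h5py; the file name episode_0.hdf5 and the directory layout are assumptions, so adjust the path to whatever you collected or downloaded:
import h5py

# Walk the HDF5 tree and print every dataset's path, shape, and dtype,
# which is a quick way to see how observations and actions are laid out.
with h5py.File("data_fold_clothes/episode_0.hdf5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)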
To set up a new terminal, run:
conda activate HIT
cd HIT
To train HIT:
# Fold Clothes task
python imitate_episodes_h1_train.py --task_name data_fold_clothes --ckpt_dir fold_clothes/ --policy_class HIT --chunk_size 50 --hidden_dim 512 --batch_size 48 --dim_feedforward 512 --lr 1e-5 --seed 0 --num_steps 100000 --eval_every 100000 --validate_every 1000 --save_every 10000 --no_encoder --backbone resnet18 --same_backbones --use_pos_embd_image 1 --use_pos_embd_action 1 --dec_layers 6 --gpu_id 0 --feature_loss_weight 0.005 --use_mask --data_aug --wandb
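The --chunk_size 50 flag follows the action-chunking idea from ACT: each forward pass predicts a block of future actions rather than a single step. The sketch below is purely conceptual, with made-up names (policy, get_obs, send_action) to illustrate how a chunked prediction would be consumed at control time; it is not the repo's rollout code:
import torch

CHUNK_SIZE = 50  # matches --chunk_size above

def run_chunked_control(policy, get_obs, send_action, steps=1000):
    # Query the policy once per chunk and replay the predicted actions
    # one at a time until the chunk is exhausted.
    action_queue = []
    for _ in range(steps):
        if not action_queue:
            obs = get_obs()              # current observation (images, joint states, ...)
            with torch.no_grad():
                chunk = policy(obs)      # assumed shape: (CHUNK_SIZE, action_dim)
            action_queue = list(chunk)
        send_action(action_queue.pop(0)) # execute one action per control step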
The hardware codebase is based on unitree_ros2.
Install unitree_sdk
Install unitree_ros2
conda create -n lowlevel python=3.8
conda activate lowlevel
Install nvidia-jetpack
Install torch==1.11.0 and torchvision==0.12.0:
Please refer to the following links:
https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048
https://docs.nvidia.com/deeplearning/frameworks/install-pytorch-jetson-platform/index.html
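Once installed, a quick check of our own (not part of the official steps) verifies the Jetson build; on a correctly configured board it should report versions matching 1.11.0 and 0.12.0 and print True for CUDA availability:
python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"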
Put your trained policy in the `hardware-script/ckpt` folder and rename it to `policy.pt`
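If your checkpoint is a raw state dict rather than a deployable file, one way to produce `policy.pt` is to export the actor network to TorchScript. The snippet below is only a sketch under assumptions: the nn.Sequential model is a stand-in for the trained HST actor, and OBS_DIM/ACT_DIM are placeholders for your policy's real observation and action sizes; if the repo ships its own export utility, prefer that instead.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 76, 19   # placeholders: use your policy's actual observation/action sizes

# Stand-in for the trained actor; in practice, rebuild the HST actor network
# and load your checkpoint's weights into it before exporting.
actor = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ELU(), nn.Linear(256, ACT_DIM))

example_obs = torch.zeros(1, OBS_DIM)
scripted = torch.jit.trace(actor, example_obs)  # trace the forward pass into TorchScript
scripted.save("policy.pt")                      # place this file in hardware-script/ckpt/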
conda activate lowlevel
cd hardware-script
python hardware_whole_body.py --task_name stand
For body pose estimation, please refer to WHAM.
For hand pose estimation, please refer to HaMeR.