Robotics Project
A simulation-first quadruped project centered on hierarchical RL navigation, obstacle-aware locomotion, and the robot platform design that made the system legible end to end.
The main throughline is the autonomy stack: hierarchical PPO control, LiDAR-guided observations, and goal-reaching behavior in simulation. From there, the page transitions into the mechanical design work around actuators, power, and staged hardware validation.
Project Preview
Use this preview to move straight into the navigation and locomotion work before transitioning into the underlying robot design track.
Section One
The navigation stack is a hierarchical reinforcement learning system: a high-level controller sends motion commands to a low-level locomotion controller, and RayCaster observations plus path hints guide the robot toward its goals.
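As a rough illustration of how such an observation could be assembled, the sketch below concatenates normalized ray distances, the goal position expressed in the robot frame, and a few lookahead path points. The function name, the argument layout, and the 10 m clipping range are assumptions made for this example, not the project's actual observation spec.

```python
import math


def build_high_level_observation(ray_distances, robot_xy, robot_yaw,
                                 goal_xy, path_hints_xy, max_range=10.0):
    """Return a flat observation list: rays, goal, then path hints.

    All names and the max_range default are hypothetical.
    """

    def to_robot_frame(point_xy):
        # Offset from the robot in the world frame.
        dx = point_xy[0] - robot_xy[0]
        dy = point_xy[1] - robot_xy[1]
        cos_y, sin_y = math.cos(robot_yaw), math.sin(robot_yaw)
        # Rotate the offset by -yaw so +x points along the robot heading.
        return (cos_y * dx + sin_y * dy, -sin_y * dx + cos_y * dy)

    # Clip each ray to the sensor range and scale to [0, 1].
    obs = [min(d, max_range) / max_range for d in ray_distances]
    obs.extend(to_robot_frame(goal_xy))
    for hint in path_hints_xy:
        obs.extend(to_robot_frame(hint))
    return obs


obs = build_high_level_observation(
    ray_distances=[5.0, 20.0],
    robot_xy=(0.0, 0.0),
    robot_yaw=math.pi / 2,       # robot faces +y in the world frame
    goal_xy=(0.0, 2.0),          # goal is straight ahead of the robot
    path_hints_xy=[(0.0, 1.0)],
)
```

Expressing the goal and path hints in the robot frame keeps the observation independent of where the robot happens to be in the world, which is a common choice for goal-reaching policies.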
The high-level policy handles navigation intent using LiDAR-like observations, relative goal position, and lookahead path hints. The low-level policy converts those instructions into stable gait and joint-position behavior.
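The split between the two policies can be sketched as a loop in which the high-level policy runs at a lower rate than the low-level one. Everything below is a stand-in: the policy bodies are simple placeholders rather than trained networks, and the names, the 12-joint layout, and the update period of 10 steps are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class VelocityCommand:
    vx: float        # forward velocity command
    vy: float        # lateral velocity command
    yaw_rate: float  # turning rate command


def clamp(value, low, high):
    return max(low, min(high, value))


def high_level_policy(goal_xy_robot_frame):
    # Placeholder for the trained navigation policy: steer toward the goal.
    gx, gy = goal_xy_robot_frame
    return VelocityCommand(
        vx=clamp(gx, -1.0, 1.0),
        vy=clamp(gy, -0.5, 0.5),
        yaw_rate=clamp(math.atan2(gy, gx), -1.0, 1.0),
    )


def low_level_policy(command, joint_positions):
    # Placeholder for the trained locomotion policy: it only nudges each
    # joint target in proportion to the forward command.
    return [q + 0.01 * command.vx for q in joint_positions]


def run_hierarchy(goal_xy_robot_frame, num_steps, high_level_period=10):
    joint_positions = [0.0] * 12  # assumed 12 actuated joints
    command = None
    high_level_calls = 0
    for step in range(num_steps):
        # The high-level policy refreshes the command every N low-level steps.
        if step % high_level_period == 0:
            command = high_level_policy(goal_xy_robot_frame)
            high_level_calls += 1
        joint_positions = low_level_policy(command, joint_positions)
    return command, joint_positions, high_level_calls


command, joints, calls = run_hierarchy((2.0, 0.0), num_steps=25)
```

Running the navigation policy less often than the locomotion policy lets the gait controller react quickly while the navigation intent changes on a slower timescale.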
Training ran with 4096 parallel simulated Unitree Go2 robots in Isaac Lab, with NVIDIA PhysX 5 handling terrain interaction, collisions, and sensor simulation across flat and rough terrain.
The locomotion controller improved on both flat and rough terrain. Learning was strongest on rough terrain, and the foot-slide penalties incurred early in training diminished as the gait matured.
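For readers unfamiliar with foot-slide penalties, one common formulation sums the squared horizontal velocity of every foot that is currently in contact with the ground. The sketch below follows that formulation; it is an assumption about the general shape of the term, not the project's exact reward code, and the function and argument names are hypothetical.

```python
def foot_slide_penalty(foot_velocities_xy, foot_in_contact):
    """Sum of squared horizontal foot speed over feet touching the ground.

    A foot that is planted should not be moving, so any horizontal
    velocity while in contact counts as sliding.
    """
    penalty = 0.0
    for (vx, vy), in_contact in zip(foot_velocities_xy, foot_in_contact):
        if in_contact:
            penalty += vx * vx + vy * vy
    return penalty


# Four feet: the first slides while planted, the second is in swing
# (so its motion is not penalized), the last two are planted and still.
penalty = foot_slide_penalty(
    foot_velocities_xy=[(0.3, 0.4), (1.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    foot_in_contact=[True, False, True, True],
)
```

In a reward function this value would be multiplied by a negative weight, so the penalty shrinks naturally as the policy learns to plant its feet cleanly.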
Exteroception materially improved navigation on rough ground, and reward shaping was essential to prevent the agent from exploiting the simulator with poor-but-legal movement shortcuts.
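A minimal sketch of reward shaping in this spirit combines a progress-toward-goal term with a penalty on abrupt action changes, so that jittery motion which still moves the robot forward scores lower than smooth motion. The specific terms, weights, and names here are illustrative assumptions; the source does not list the actual reward terms used.

```python
import math


def shaped_reward(prev_xy, curr_xy, goal_xy, prev_action, action,
                  w_progress=1.0, w_action_rate=0.01):
    """Progress toward the goal minus a penalty on action changes.

    Weights are hypothetical defaults chosen for illustration.
    """
    # Positive when the robot moved closer to the goal during this step.
    progress = math.dist(prev_xy, goal_xy) - math.dist(curr_xy, goal_xy)
    # Squared change in the action vector discourages jittery commands.
    action_rate = sum((a - b) ** 2 for a, b in zip(action, prev_action))
    return w_progress * progress - w_action_rate * action_rate


smooth = shaped_reward((0.0, 0.0), (1.0, 0.0), (3.0, 0.0),
                       prev_action=[0.5, 0.5], action=[0.5, 0.5])
jerky = shaped_reward((0.0, 0.0), (1.0, 0.0), (3.0, 0.0),
                      prev_action=[0.5, 0.5], action=[1.5, -0.5])
```

Both calls make the same progress toward the goal, but the jerky one earns less reward, which is the kind of pressure that discourages shortcut behaviors.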
Technical PDF + Showcase
Baseline locomotion clip showing the policy stabilizing movement before denser navigation tasks.
Navigation showcase clip highlighting goal-directed traversal behavior.
Additional behavior sample showing the higher-level navigation policy working with locomotion control.
One failure clip is included to show how the navigation stack behaved when the policy broke down under harder conditions.
Section Two
The autonomy work sat alongside a staged design track: validate a single leg in simulation first, then bench test one leg assembly, then scale into a more complete quadruped platform with the right power, sensing, and control backbone.
Design Track
The design centers on actuator selection, PCB and power planning, and a single-leg validation loop before committing to a full hardware build. That keeps the project grounded in testable milestones instead of a vague end-state robot concept.
The electrical stack includes MCU control, actuator options, a 4S to 6S LiPo battery path, fused power distribution, DC-DC rails, and CAN, UART, and USB routing between Jetson-class compute and motor drivers.
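Supporting both 4S and 6S packs means the DC-DC rails must tolerate a fairly wide input range. The sketch below works that range out from the standard LiPo figures of 3.7 V nominal and 4.2 V fully charged per cell. The 3.3 V per-cell cutoff and the converter input limits in the usage lines are assumptions for illustration, not values from the project.

```python
CELL_NOMINAL_V = 3.7   # standard LiPo nominal voltage per cell
CELL_FULL_V = 4.2      # standard LiPo fully charged voltage per cell
CELL_CUTOFF_V = 3.3    # assumed conservative low-voltage cutoff per cell


def pack_voltage_range(cells):
    """Return cutoff, nominal, and full voltage for a series pack."""
    return {
        "cutoff": cells * CELL_CUTOFF_V,
        "nominal": cells * CELL_NOMINAL_V,
        "full": cells * CELL_FULL_V,
    }


def converter_covers(v_in_min, v_in_max, cell_counts):
    """Check that a converter's input range spans every pack option."""
    lowest = min(pack_voltage_range(c)["cutoff"] for c in cell_counts)
    highest = max(pack_voltage_range(c)["full"] for c in cell_counts)
    return v_in_min <= lowest and highest <= v_in_max


four_s = pack_voltage_range(4)
six_s = pack_voltage_range(6)
# Hypothetical converter input ranges: a wide-input part and a 24 V-max part.
wide_ok = converter_covers(9.0, 36.0, [4, 6])
narrow_ok = converter_covers(9.0, 24.0, [4, 6])
```

A fully charged 6S pack sits above 24 V, so a converter with a 24 V input ceiling would not cover the whole range even though it handles a 4S pack comfortably.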
The initial plan uses Isaac Sim and ROS 2 to test a single leg, validate link lengths and joint limits, check actuator torque margins against estimated mass, and trial gait behavior before risking hardware.
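A first-pass torque margin check can be done with simple statics before any simulation. The sketch below assumes the robot's weight is shared equally by the legs in stance and acts on a joint through a horizontal lever arm. The mass, lever arm, and rated torque in the usage lines are placeholder numbers, not project measurements.

```python
GRAVITY = 9.81  # m/s^2


def required_static_torque(robot_mass_kg, legs_in_stance, lever_arm_m):
    """Static joint torque (N*m) to hold one leg's share of the weight."""
    load_per_leg_n = robot_mass_kg * GRAVITY / legs_in_stance
    return load_per_leg_n * lever_arm_m


def torque_margin(rated_torque_nm, required_torque_nm):
    """Ratio of available to required torque; above 1.0 means headroom."""
    return rated_torque_nm / required_torque_nm


# Placeholder values: 12 kg robot, two legs in stance (as in a trot),
# 0.2 m horizontal distance from the joint to the foot contact point.
required = required_static_torque(robot_mass_kg=12.0, legs_in_stance=2,
                                  lever_arm_m=0.2)
margin = torque_margin(rated_torque_nm=23.5, required_torque_nm=required)
```

This static estimate ignores dynamic loads such as impacts and accelerations, so a margin well above 1.0 would normally be wanted before committing to an actuator.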
Build details include chamfers on bolt holes, actuator mounting constraints, output plate alignment, and practical fastener and standoff clearances that matter once the CAD becomes hardware.