FreeMusco: Motion-Free Learning of Latent Control for Morphology-Adaptive Locomotion in Musculoskeletal Characters

FreeMusco is a motion-free framework that learns a latent representation of morphology-adaptive locomotion from musculoskeletal simulation, without motion data. The learned latent space enables high-level control for downstream tasks such as goal navigation and path following. The figure shows: (1) Humanoid locomotion control, (2) diverse motions sampled from the latent space, (3) goal navigation with Ostrich, and (4) path following with Chimanoid.
System Overview
Abstract
We propose FreeMusco, a motion-free framework that jointly learns latent representations and control policies for musculoskeletal characters. By leveraging the musculoskeletal model as a strong prior, our method enables energy-aware and morphology-adaptive locomotion to emerge without motion data. The framework generalizes across human, non-human, and synthetic morphologies, where distinct energy-efficient strategies naturally appear—for example, quadrupedal gaits in Chimanoid versus bipedal gaits in Humanoid. The latent space and corresponding control policy are constructed from scratch, without demonstration, and enable downstream tasks such as goal navigation and path following—representing, to our knowledge, the first motion-free method to provide such capabilities. FreeMusco learns diverse and physically plausible locomotion behaviors through model-based reinforcement learning, guided by the locomotion objective that combines control, balancing, and biomechanical terms. To better capture the periodic structure of natural gait, we introduce a temporally averaged loss formulation, which compares simulated and target states over a time window rather than on a per-frame basis. We further encourage behavioral diversity by randomizing target poses and energy levels during training, enabling locomotion to be flexibly modulated in both form and intensity at runtime. Together, these results demonstrate that versatile and adaptive locomotion control can emerge without motion capture, offering a new direction for simulating movement in characters where data collection is impractical or impossible.
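The goal randomization mentioned in the abstract can be pictured as drawing a fresh goal for each training episode. The sketch below is only an illustration under stated assumptions: the function name `sample_goal`, the value ranges, and the pose labels are hypothetical and are not taken from the paper.

```python
import random


def sample_goal(rng, pose_library, velocity_range=(0.0, 2.0), energy_range=(0.0, 1.0)):
    """Draw one randomized training goal.

    All ranges and field names are illustrative assumptions,
    not values reported by the authors.
    """
    return {
        "velocity": rng.uniform(*velocity_range),  # target speed
        "energy": rng.uniform(*energy_range),      # normalized energy level
        "pose": rng.choice(pose_library),          # one randomly chosen target pose
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
pose_library = ["symmetric_stand", "asymmetric_stand"]  # hypothetical pose labels
goals = [sample_goal(rng, pose_library) for _ in range(100)]
```

Sampling the goal terms independently is one simple way to expose a policy to many combinations of form and intensity; the actual sampling scheme used in FreeMusco may differ.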
Humanoid Locomotion
Goal Velocity-Only
Goal Velocity + Direction
Goal Velocity + Energy + Pose (symmetric)
Goal Velocity + Energy + Pose (asymmetric)
Morphology-Adaptive Locomotion
Chimanoid: Goal Velocity-Only
Ostrich: Goal Velocity-Only
Ostrich: Goal Velocity + Pose + Energy
Emergent Gait Strategies Across Morphologies
Chimanoid: Goal Velocity + Energy
Humanoid: Goal Velocity + Energy
If initialized in a bipedal pose, Chimanoid walks bipedally at high energy but shifts to a quadrupedal gait as energy decreases. When the energy is raised again, it does not return to bipedal walking but instead exhibits a stronger quadrupedal gait, indicating that quadrupedal motion is more efficient for its body. Humanoid, however, never adopts quadrupedal gaits, showing that energy-efficient strategies depend on morphology and can emerge naturally in our motion-free framework.
Unconditional Random Sampling in Latent Space
Humanoid Latent Space
Trained with Velocity-Only Goals
Humanoid Latent Space
Trained with Velocity + Pose + Energy Goals
Chimanoid Latent Space
Trained with Velocity-Only Goals
Ostrich Latent Space
Trained with Velocity + Pose + Energy Goals
Downstream Tasks: Point-Goal Navigation
Humanoid
Chimanoid
Ostrich
Downstream Tasks: Path Following
Humanoid
Chimanoid
Effect of Temporally Averaged Loss Formulation
Temporally averaged L_pose (Ours)
Per-step L_pose
Temporally averaged L_up (Ours)
Per-step L_up
(1) The per-step formulation enforces rigid frame-level pose matching, leading to a slightly crouched, short-stepped gait with suppressed pelvic rotation.
(2) The per-step formulation strongly constrains pelvic dynamics, so pelvic rotation appears nearly rigid, without natural oscillatory components.
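The difference between the two formulations can be illustrated with a toy example. This is a minimal sketch of one plausible reading of "temporally averaged", in which the simulated state is averaged over a window before being compared with the target. The exact loss used in the paper is not given on this page, and the one-dimensional feature and function names are hypothetical.

```python
from math import pi, sin


def per_step_loss(sim, target):
    """Mean of the squared errors between each simulated frame and the target."""
    return sum((s - target) ** 2 for s in sim) / len(sim)


def temporally_averaged_loss(sim, target):
    """Squared error between the window-averaged simulated state and the target."""
    mean_state = sum(sim) / len(sim)
    return (mean_state - target) ** 2


# Hypothetical 1-D feature (for example, a pelvic rotation angle in radians)
# oscillating around the target over exactly one gait cycle of 20 frames.
target = 0.0
window = [0.2 * sin(2 * pi * t / 20) for t in range(20)]

per_step = per_step_loss(window, target)             # penalizes every deviation
averaged = temporally_averaged_loss(window, target)  # oscillation cancels out
```

Under this reading, a periodic oscillation around the target costs nothing in the averaged loss but is penalized at every frame by the per-step loss, which is consistent with the suppressed pelvic rotation noted above.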
Comparison with Torque-Actuated Humanoid
Muscle-Actuated (Ours)
Torque-Actuated
Torque-Actuated (with manually tuned torque limits)
The torque-actuated humanoid failed to learn even basic balanced walking. Even with manually tuned joint torque limits, it learned only a very short-stepped, high-frequency gait.
Discussion
Target Pose:
Standing pose with straight arms
Target Pose:
Standing pose with slightly bent elbows
Learning with target poses featuring slightly bent elbows mitigates the straight and stiff arm artifact, suggesting that searching for optimal target poses would be a valuable direction for future work.
Video
Paper
Publisher: Coming soon
arXiv: Coming soon