FreeMusco: Motion-Free Learning of Latent Control for Morphology-Adaptive Locomotion in Musculoskeletal Characters

SIGGRAPH Asia 2025

Hanyang University

Publisher arXiv Slides (PDF) Slides (PPTX) Code Poster

FreeMusco is a motion-free framework that learns a latent representation of morphology-adaptive locomotion from musculoskeletal simulation, without motion data. The learned latent space enables high-level control for downstream tasks such as goal navigation and path following. The figure shows: (1) Humanoid locomotion control, (2) diverse motions sampled from the latent space, (3) goal navigation with Ostrich, and (4) path following with Chimanoid.

Humanoid Locomotion

Goal Velocity-Only

Goal Velocity + Direction

Goal Velocity + Energy + Pose (symmetric)

Goal Velocity + Energy + Pose (asymmetric)

Morphology-Adaptive Locomotion

Chimanoid: Goal Velocity-Only

Ostrich: Goal Velocity-Only

Ostrich: Goal Velocity + Pose + Energy

Emergent Gait Strategies Across Morphologies

Chimanoid: Goal Velocity + Energy

Humanoid: Goal Velocity + Energy

If initialized in a bipedal pose, Chimanoid walks bipedally at high energy but shifts to a quadrupedal gait as energy decreases. When the energy is raised again, it does not return to bipedal walking but instead exhibits a stronger quadrupedal gait, indicating that quadrupedal motion is more efficient for its body. Humanoid, however, never adopts quadrupedal gaits, showing that energy-efficient strategies depend on morphology and can emerge naturally in our motion-free framework.

Unconditional Random Sampling in Latent Space

Humanoid Latent Space
Trained with Velocity-Only Goals

Humanoid Latent Space
Trained with Velocity + Pose + Energy Goals

Chimanoid Latent Space
Trained with Velocity-Only Goals

Ostrich Latent Space
Trained with Velocity + Pose + Energy Goals
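
Unconditional sampling can be sketched as drawing a latent vector from the prior and decoding it, together with the current character state, into muscle activations. The code below is a toy illustration only: it assumes a standard normal prior, as is typical for conditional VAEs, and a hypothetical linear-plus-sigmoid decoder `decode` with random weights and activations in (0, 1). None of these names, sizes, or weights come from the paper.

```python
import math
import random


def sample_latent(dim, rng):
    """Draw z ~ N(0, I), the assumed prior of the conditional VAE."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def decode(state, z, weights):
    """Hypothetical decoder: linear map of [state, z] followed by a sigmoid.

    Returns one activation in (0, 1) per muscle.
    """
    inputs = state + z
    activations = []
    for row in weights:
        pre = sum(w * x for w, x in zip(row, inputs))
        activations.append(1.0 / (1.0 + math.exp(-pre)))
    return activations


rng = random.Random(0)
state_dim, latent_dim, num_muscles = 4, 3, 5  # illustrative sizes only
weights = [[rng.uniform(-1.0, 1.0) for _ in range(state_dim + latent_dim)]
           for _ in range(num_muscles)]

state = [0.0] * state_dim           # placeholder character state
z = sample_latent(latent_dim, rng)  # no goal is given: unconditional sample
activations = decode(state, z, weights)
```

Repeating the last two lines with fresh samples of `z` yields different activation patterns for the same state, which is the sense in which sampling the latent space produces diverse motions.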

Downstream Tasks: Point-Goal Navigation

Humanoid

Chimanoid

Ostrich

Downstream Tasks: Path Following

Humanoid

Chimanoid

Method

System Overview

Locomotion Objective Loss

FreeMusco builds on a conditional VAE and model-based RL. Guided by a locomotion objective loss with temporally averaged terms, it learns energy-aware latent locomotion directly from muscle dynamics and goal signals, without motion data.

Effect of Temporally Averaged Loss Formulation

Temporally-averaged L_pose (Ours)

Per-step L_pose

Temporally-averaged L_up (Ours)

Per-step L_up

(1) Per-step L_pose enforces rigid frame-level pose matching, leading to a slightly crouched, short-stepped gait with suppressed pelvic rotation.
(2) Per-step L_up strongly constrains pelvic dynamics, so the pelvis appears nearly rigid, without natural oscillatory components.
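
The contrast between the two formulations can be illustrated with a toy scalar signal. The sketch below is not the paper's loss: it assumes a squared-error penalty, a single scalar feature (say, a pelvic rotation angle), and a window equal to exactly one gait cycle, all of which are illustrative choices.

```python
import math


def per_step_loss(states, target):
    """Mean of squared errors, computed frame by frame."""
    return sum((s - target) ** 2 for s in states) / len(states)


def temporally_averaged_loss(states, target):
    """Squared error between the window mean and the target."""
    window_mean = sum(states) / len(states)
    return (window_mean - target) ** 2


# Hypothetical signal: one full cycle of oscillation around the target.
target = 0.0
amplitude = 0.2  # assumed oscillation amplitude (radians)
frames = 60      # assumed frames per gait cycle
states = [target + amplitude * math.sin(2 * math.pi * t / frames)
          for t in range(frames)]

step_loss = per_step_loss(states, target)
avg_loss = temporally_averaged_loss(states, target)
```

Here the per-step loss penalizes the oscillation itself (about 0.02, half the squared amplitude), while the windowed loss is close to zero because the oscillation averages out to the target. Under these assumptions, a per-step term pushes the optimizer toward a motionless signal, which is consistent with the suppressed pelvic rotation noted above.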

Comparison with Torque-Actuated Humanoid

Muscle-Actuated (Ours)

Torque-Actuated

Torque-Actuated
(with manually tuned torque limits)

The torque-actuated humanoid failed to learn even basic balanced walking. Even with manually tuned joint torque limits, it learned only a very short-stepped, high-frequency gait.

Discussion

Target Pose:
Standing pose with straight arms

Target Pose:
Standing pose with slightly bent elbows

Training with a target pose that has slightly bent elbows mitigates the straight, stiff-arm artifact, suggesting that searching for optimal target poses would be a valuable direction for future work.

Abstract

We propose FreeMusco, a motion-free framework that jointly learns latent representations and control policies for musculoskeletal characters. By leveraging the musculoskeletal model as a strong prior, our method enables energy-aware and morphology-adaptive locomotion to emerge without motion data. The framework generalizes across human, non-human, and synthetic morphologies, where distinct energy-efficient strategies naturally appear—for example, quadrupedal gaits in Chimanoid versus bipedal gaits in Humanoid. The latent space and corresponding control policy are constructed from scratch, without demonstration, and enable downstream tasks such as goal navigation and path following—representing, to our knowledge, the first motion-free method to provide such capabilities. FreeMusco learns diverse and physically plausible locomotion behaviors through model-based reinforcement learning, guided by the locomotion objective that combines control, balancing, and biomechanical terms. To better capture the periodic structure of natural gait, we introduce a temporally averaged loss formulation, which compares simulated and target states over a time window rather than on a per-frame basis. We further encourage behavioral diversity by randomizing target poses and energy levels during training, enabling locomotion to be flexibly modulated in both form and intensity at runtime. Together, these results demonstrate that versatile and adaptive locomotion control can emerge without motion capture, offering a new direction for simulating movement in characters where data collection is impractical or impossible.
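
The goal randomization described in the abstract could look like the following per-episode sampler. This is a sketch under stated assumptions: the value ranges, the `TARGET_POSES` labels, and the `sample_goal` helper are hypothetical, chosen only to show velocity, energy level, and target pose being drawn independently.

```python
import math
import random

# Hypothetical target pose labels, not taken from the paper.
TARGET_POSES = ["symmetric_standing", "asymmetric_standing"]


def sample_goal(rng):
    """Draw one training goal: planar velocity, energy level, and target pose."""
    speed = rng.uniform(0.0, 2.0)             # assumed speed range in m/s
    heading = rng.uniform(-math.pi, math.pi)  # assumed heading range in radians
    return {
        "velocity": (speed * math.cos(heading), speed * math.sin(heading)),
        "energy": rng.uniform(0.0, 1.0),      # assumed normalized energy level
        "pose": rng.choice(TARGET_POSES),
    }


rng = random.Random(42)
goals = [sample_goal(rng) for _ in range(100)]
```

At runtime the same fields would be set by hand rather than sampled, which is how form (pose) and intensity (energy) could be modulated independently of the velocity goal.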

Video

Paper

Publisher: page, paper
arXiv: page, paper