Genesis: Embodiment Co-Design via Efficient Message and Reward Delivery

Anonymous Institution from Earth
NeurIPS 2024 Submission

Embodied agents generated by our embodiment co-design approach, Genesis.

Abstract

Embodiment co-design aims to optimize a robot's morphology and control simultaneously, enhancing its environmental adaptability. However, this task faces significant challenges due to its vast search space and bi-level optimization nature. Existing approaches have used reinforcement learning to unify the optimization of morphology and control but still face the drawback of inefficient message and reward delivery. To address this problem, we propose an end-to-end reinforcement learning framework, Genesis, that optimizes message and reward delivery during co-design. Genesis utilizes (1) a morphology self-attention architecture to achieve zero-delay message delivery during optimization, with our proposed topology-aware positional encoding to achieve message localization and morphological knowledge sharing; (2) a temporal credit assignment mechanism that ensures an agent receives balanced reward signals in both the morphology design and control phases. Experiments across various tasks demonstrate Genesis's superiority over previous methods in terms of both convergence speed and task performance.

Motivations

Neural systems with localized message processing are found in simple, low-level organisms such as planarians, where sensory information is relayed through a network of neurons for GNN-like distributed and localized processing. In contrast, advanced creatures such as human beings use centralized signal processing: signals from various body parts are processed together in the brain, which brings scalability advantages similar to those of the self-attention mechanism in transformers.

Methods

Genesis is an RL-based framework for embodiment co-design that optimizes morphology and control simultaneously. It combines three components: centralized, zero-delay message processing based on limb-level self-attention, message localization via Topology Position Encoding, and enhanced temporal credit assignment for balanced reward signals.
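To make the "zero-delay" contrast concrete, the following is a minimal sketch, not the Genesis implementation. It assumes a hypothetical four-limb chain body (0-1-2-3) with one-hot limb features, and compares a simplified GNN-like round of neighbour averaging with a single-head dot-product self-attention layer that uses identity projections. The function names `message_passing_round` and `self_attention` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """One single-head self-attention layer with identity Q/K/V projections.

    Every limb token attends to every other limb token in the same layer.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(w[j] * tokens[j][c] for j in range(len(tokens)))
                    for c in range(d)])
    return out


def message_passing_round(tokens, edges):
    """One GNN-like round: each limb averages itself and its direct neighbours."""
    n, d = len(tokens), len(tokens[0])
    nbrs = {i: [i] for i in range(n)}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return [[sum(tokens[j][c] for j in nbrs[i]) / len(nbrs[i]) for c in range(d)]
            for i in range(n)]


# Hypothetical body: a chain of four limbs, each with a one-hot feature.
edges = [(0, 1), (1, 2), (2, 3)]
tokens = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Count how many neighbour-averaging rounds limb 0 needs to see limb 3's feature.
state, rounds = tokens, 0
while state[0][3] == 0.0:
    state = message_passing_round(state, edges)
    rounds += 1

attended = self_attention(tokens)
print("message-passing rounds until limb 0 sees limb 3:", rounds)
print("limb 0 sees limb 3 after one attention layer:", attended[0][3] > 0.0)
```

In this toy setting, the number of message-passing rounds equals the hop distance between the two limbs, whereas the attention layer mixes all limbs at once. This is the delay that a centralized, limb-level attention design avoids.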

For embodiment co-design, besides differentiating message sources within the body, we require an evolution-aware morphology representation. We propose Topology Position Encoding (TopoPE) to better adapt to the evolving process. It not only achieves message localization but also facilitates knowledge sharing within the dynamically growing body.
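One way to see why an evolution-aware position index matters is sketched below. This is an assumption-laden illustration, not the definition of TopoPE: it keys each limb by its path of child-slot indices from the root (a hypothetical choice) and compares that with a plain depth-first traversal index when a new limb is grown. The helpers `path_keys` and `dfs_indices` and the toy body are invented for this example.

```python
def path_keys(children, root=0):
    """Map each limb id to the tuple of child-slot indices on its path from the root."""
    keys = {root: ()}
    stack = [root]
    while stack:
        node = stack.pop()
        for slot, child in enumerate(children.get(node, [])):
            keys[child] = keys[node] + (slot,)
            stack.append(child)
    return keys


def dfs_indices(children, root=0):
    """Map each limb id to its position in a depth-first traversal of the body."""
    order = {}
    stack = [root]
    while stack:
        node = stack.pop()
        order[node] = len(order)
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(children.get(node, [])))
    return order


# Hypothetical body before and after growing limb 3 under limb 1.
before = {0: [1, 2]}
after = {0: [1, 2], 1: [3]}

keys_before, keys_after = path_keys(before), path_keys(after)
dfs_before, dfs_after = dfs_indices(before), dfs_indices(after)

# Path-based keys of existing limbs are unchanged by growth.
stable = all(keys_after[limb] == key for limb, key in keys_before.items())
# Traversal indices of existing limbs can shift when a limb is inserted.
shifted = [limb for limb in dfs_before if dfs_before[limb] != dfs_after[limb]]

print("path keys stable under growth:", stable)
print("limbs whose DFS index shifted:", shifted)
```

Under these assumptions, an embedding looked up by path key keeps referring to the same limb as the body grows, so what was learned for that position can be reused, while a traversal-order index reassigns positions to existing limbs.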

Visualization

We randomly pick embodied agents generated by Genesis for visualization.

Crawler

TerrainCrosser

Cheetah

Swimmer

Glider-Regular

Glider-Medium

Glider-Hard

Walker-Regular

Walker-Medium

Walker-Hard

Experiments

Experiments show that Genesis achieves an average performance improvement of 60.52% over the strongest baselines. Notably, each model generated by Genesis is only 10-20 MB in size, yet already exhibits strong capabilities in morphology evolution and motion control, which suggests promising potential for scaling up to more complex embodied agents and embodiment systems in the future.