Constructing robots to accomplish long-horizon tasks is a long-standing challenge within artificial intelligence. Approaches using generative methods, particularly Diffusion Models, have gained attention due to their ability to model continuous robotic trajectories for planning and control. However, we show that these models struggle with long-horizon tasks that involve complex decision-making and, in general, are prone to confusing different modes of behavior, leading to failure. To remedy this, we propose to augment continuous trajectory generation by simultaneously generating a high-level symbolic plan. We show that this requires a novel mix of discrete variable diffusion and continuous diffusion, which dramatically outperforms the baselines. In addition, we illustrate how this hybrid diffusion process enables flexible trajectory synthesis, allowing us to condition synthesized actions on partial and complete discrete conditions.
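To make the idea of mixing discrete and continuous diffusion concrete, here is a minimal sketch of a forward (noising) step over a trajectory and a symbolic plan. It is an illustration under stated assumptions, not the authors' implementation. The source does not specify the noise processes, so the sketch assumes Gaussian corruption with a linear schedule for the trajectory and absorbing-state masking for the plan symbols. The names `hybrid_forward_noise`, `MASK`, and `known` are hypothetical.

```python
import math
import random

MASK = -1  # hypothetical id for the absorbing "masked" symbol (an assumption)


def hybrid_forward_noise(trajectory, plan, t, num_steps, rng, known=None):
    """Noise a (trajectory, plan) pair to diffusion step t, 0 <= t <= num_steps.

    trajectory: list of floats (flattened continuous states/actions)
    plan:       list of int symbol ids (high-level symbolic plan)
    known:      optional set of plan indices supplied as conditions;
                these symbols are never masked
    """
    known = known or set()

    # Continuous part: Gaussian corruption with a linear signal schedule
    # (assumed here; the source does not state the schedule).
    alpha_bar = 1.0 - t / num_steps
    noisy_traj = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
        for x in trajectory
    ]

    # Discrete part: each unconditioned symbol is replaced by MASK
    # with probability t / num_steps (assumed absorbing-state process).
    mask_prob = t / num_steps
    noisy_plan = [
        s if (i in known or rng.random() >= mask_prob) else MASK
        for i, s in enumerate(plan)
    ]
    return noisy_traj, noisy_plan


rng = random.Random(0)
# Fully noised sample in which plan index 1 is given as a partial condition.
traj_T, plan_T = hybrid_forward_noise([0.5, -1.0], [2, 0, 1], 10, 10, rng, known={1})
```

Keeping the conditioned symbols fixed while the rest are masked is one simple way to express partial discrete conditions; with every index in `known`, the plan is fully specified and only the trajectory is noised.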
Trajectory-only Diffusion Models struggle with long-horizon decision-making tasks, such as sorting. These tasks are highly multimodal: although all demonstrations terminate in a sorted state, the trained model mixes behaviors from the dataset, resulting in flawed plans. Hybrid Diffusion Planning (HDP), by combining trajectory planning with the diffusion of a discrete symbolic plan, shows a remarkable ability to solve such complex tasks.
Method | X-Arm Sorting | Arrange Blocks | Hook Task |
---|---|---|---|
Diffuser | 46% | 67% | 38% |
Joint Diffuser | 41% | 61% | 48% |
Separate Diffuser | 38% | 62% | 43% |
Hybrid (Ours) | 83% | 74% | 60% |
In addition to the experimental benchmarks, we measure the robustness of all methods as task complexity increases by varying the number of blocks to sort. Hybrid Diffusion Planning achieves substantially higher success rates than the baselines in this setting.
Method | Sorting Task | Hook Task |
---|---|---|
Diffuser | 20% | 6.7% |
Joint Diffuser | 10% | 10% |
Separate Diffuser | 0% | 6.7% |
Hybrid (Ours) | 70% | 60% |