jaintle/diffusion-policy-manipulation: v0.1.0 — Diffusion Policy Execution Strategy Study (PushT)

## Overview

This release presents a controlled empirical study of diffusion-based action-sequence modeling for manipulation, focusing on execution strategy.

We compare:

- Gaussian MLP Behavior Cloning (BC)
- Diffusion Policy (Open-loop execution)
- Diffusion Policy (Receding-horizon execution)

All experiments are:

- Deterministic
- Multi-seed (3 seeds)
- CPU reproducible
- Fully scripted end-to-end

Settings:

- Environment: gym_pusht/PushT-v0
- Observation: Low-dimensional state
- Horizon: 8
- DDIM steps (K): 10
- Diffusion T: 50 (linear schedule)

## Multi-Seed Results (3 Seeds)

| Method                | Return (mean ± std) |
|-----------------------|---------------------|
| Gaussian BC           | 3.98 ± 1.60         |
| Diffusion (Open-loop) | 7.70 ± 1.99         |
| Diffusion (Receding)  | **7.72 ± 1.98**     |

Per-seed returns:

| Seed | BC   | Diff Open | Diff Receding |
|------|------|-----------|---------------|
| 0    | 1.97 | 5.65      | 5.68          |
| 1    | 4.09 | 7.05      | 7.07          |
| 2    | 5.88 | 10.40     | 10.41         |

## Key Observations

- Diffusion-based sequence modeling consistently outperforms Gaussian BC in this small-data PushT setting.
- Open-loop and receding-horizon execution strategies produce nearly identical performance under fixed horizon (H=8) and DDIM steps (K=10).
- Execution strategy differences do not materially manifest in this short-horizon regime.

Note: success_rate is not used as the primary metric in this setup, as PushT does not expose a binary success signal under the current evaluator configuration. Return is the primary metric.
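The summary statistics in the results table can be recomputed from the per-seed returns. The sketch below is not the repository's `scripts/aggregate_results.py`; it is a minimal check that assumes the reported spread is the population standard deviation (ddof=0) across the three seeds, an assumption inferred only from the fact that it matches the published numbers. The per-seed inputs are the rounded values from the table, so the last digit could differ from a computation on unrounded returns.

```python
import statistics

# Per-seed returns copied from the table above (seeds 0, 1, 2).
per_seed = {
    "Gaussian BC": [1.97, 4.09, 5.88],
    "Diffusion (Open-loop)": [5.65, 7.05, 10.40],
    "Diffusion (Receding)": [5.68, 7.07, 10.41],
}


def summarize(returns):
    """Return (mean, population std) over seeds.

    Assumption: population std (ddof=0); the sample std (ddof=1)
    does not reproduce the reported values.
    """
    return statistics.fmean(returns), statistics.pstdev(returns)


summary = {method: summarize(r) for method, r in per_seed.items()}

for method, (mean, std) in summary.items():
    print(f"{method}: {mean:.2f} ± {std:.2f}")
```

With three seeds the spread estimate is coarse, so the gap between the two diffusion variants (7.70 vs 7.72) is far smaller than the seed-to-seed variation.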
## Reproducibility

To reproduce the multi-seed experiment:

    python scripts/reproduce_multiseed.py \
      --env_id gym_pusht/PushT-v0 \
      --seeds 0 1 2 \
      --episodes_record 20 \
      --max_steps_record 200 \
      --steps_bc 3000 \
      --steps_diff 5000 \
      --episodes_eval 20 \
      --max_steps_eval 200 \
      --results_root results/rq_exec_mode \
      --device cpu

    python scripts/aggregate_results.py \
      --results_root results/rq_exec_mode \
      --seeds 0 1 2

Plots are generated via:

    python scripts/plot_summary.py
    python scripts/plot_per_seed.py

## Scope & Limitations

- Single environment (PushT)
- Low-dimensional state input
- Fixed horizon (H=8)
- Fixed sampler steps (K=10)
- No vision encoder
- No latency benchmarking
- No sim-to-real claims

This release isolates execution strategy under controlled conditions.
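For readers unfamiliar with the two execution modes, the difference can be sketched as how many actions from each sampled sequence are executed before the policy is queried again. The code below is an illustration only, not this repository's implementation: `plan`, `step`, and `rollout` are hypothetical names, the 1-D dynamics are a toy stand-in for PushT, and the choice of executing exactly one action per plan in receding-horizon mode is an assumption (the release does not state how many actions are executed between replans).

```python
from typing import List, Tuple

H = 8  # action-sequence horizon, matching the release setting


def plan(obs: float) -> List[float]:
    """Hypothetical stand-in for a sequence policy: H equal steps toward 0."""
    return [-obs / H] * H


def step(obs: float, action: float) -> Tuple[float, float]:
    """Toy deterministic 1-D dynamics; reward is negative distance to 0."""
    next_obs = obs + action
    return next_obs, -abs(next_obs)


def rollout(obs: float, max_steps: int, actions_per_plan: int) -> Tuple[float, int]:
    """Execute `actions_per_plan` actions from each plan, then replan.

    actions_per_plan == H -> open-loop execution of the whole sequence.
    actions_per_plan == 1 -> receding-horizon (assumed: replan every step).
    Returns (total return, number of planner calls).
    """
    total, steps, n_plans = 0.0, 0, 0
    while steps < max_steps:
        actions = plan(obs)
        n_plans += 1
        for action in actions[:actions_per_plan]:
            if steps >= max_steps:
                break
            obs, reward = step(obs, action)
            total += reward
            steps += 1
    return total, n_plans


open_return, open_plans = rollout(8.0, max_steps=16, actions_per_plan=H)
rec_return, rec_plans = rollout(8.0, max_steps=16, actions_per_plan=1)
print(f"open-loop: return={open_return:.2f}, planner calls={open_plans}")
print(f"receding:  return={rec_return:.2f}, planner calls={rec_plans}")
```

The structural difference is the number of planner calls per episode (here 2 versus 16 for 16 environment steps). The returns of this toy say nothing about PushT.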
