At the planning stage, is the simulation process for the maze2d environment in AdaptDiffuser consistent with that of Diffuser? If so, the number of steps required to reach the goal of the maze should be determined mainly by the planning horizon. However, AdaptDiffuser seems to achieve a significantly higher score than Diffuser. How should I understand the mechanism behind the higher score? Thank you.