Recently I've run some experiments with two different classes of optimization algorithms: those that allow a bit of "backtracking" (i.e. sometimes accepting a transition to a state of lower fitness), and those that do not (i.e. pure hill climbing).
What I've found is that, overwhelmingly, the methods that make use of backtracking discover superior solutions compared to those that do not.
Unfortunately, this observation doesn't tell me WHY this is the case! I have started to sketch some toy models of the fitness landscape that could plausibly explain it.
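To make the contrast concrete, here is a minimal sketch of the two families on a toy multimodal fitness landscape. This is not my actual experimental setup: the landscape, starting point, step size, and cooling schedule are all invented for illustration, and simulated annealing stands in for the "backtracking" family generally.

```python
import math
import random

def fitness(x):
    # Toy multimodal landscape: local optima separated by valleys,
    # with the global optimum at x = 0 (fitness 1.0).
    return math.cos(3 * x) * math.exp(-0.1 * x * x)

def hill_climb(x, steps=5000, step_size=0.1, rng=random):
    """Pure hill climbing: only accept moves that improve fitness."""
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        if fitness(candidate) > fitness(x):
            x = candidate
    return x

def anneal(x, steps=5000, step_size=0.1, t0=1.0, rng=random):
    """Simulated annealing: sometimes accept worse moves, with a
    probability that shrinks as the temperature cools toward zero."""
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-9  # linear cooling schedule
        candidate = x + rng.uniform(-step_size, step_size)
        delta = fitness(candidate) - fitness(x)
        if delta > 0 or rng.random() < math.exp(delta / t):
            x = candidate
    return x

rng = random.Random(0)
start = 3.0  # deliberately placed in the basin of a local optimum
hc = hill_climb(start, rng=rng)
sa = anneal(start, rng=rng)
print(f"hill-climb fitness: {fitness(hc):.3f}")
print(f"annealing  fitness: {fitness(sa):.3f}")
```

Starting from x = 3.0, the hill climber can only ascend to the nearest local peak (fitness roughly 0.65 here) because every intervening valley blocks improvement-only moves, while the annealer's early high-temperature phase lets it cross valleys and potentially settle near the global optimum.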