It seems that, with scaling=True, the BayesianOptimizer optimizes the initial model twice.
In the constructor of BayesianOptimizer, the call to acquisition.enable_scaling optimizes the model(s), and BayesianOptimizer._optimize() then optimizes the model(s) again.
If I read this correctly, I think we should remove unnecessary model optimizations like this one. To avoid similar control flow problems in the future, we might want to reduce the number of places where models are optimized, if possible.
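
To make the control flow concrete, here is a minimal, self-contained sketch. It does not use the real library: CountingModel, mark_stale, optimize_if_stale and the two run_* functions are hypothetical names made up for illustration. The first function mimics the flow described above (one optimization from the constructor's scaling step, one from _optimize()), and the second shows one possible way to route every optimization through a single guarded entry point, assuming the model can track whether it has changed since its last fit.

```python
class CountingModel:
    """Hypothetical stand-in for a model; it only counts optimize() calls."""

    def __init__(self):
        self.optimize_calls = 0
        self._stale = True  # True while the model has not been fitted to its current state

    def mark_stale(self):
        # Would be called whenever data or transforms change (e.g. scaling is enabled).
        self._stale = True

    def optimize(self):
        # Unconditional optimization, as in the flow described in this issue.
        self.optimize_calls += 1
        self._stale = False

    def optimize_if_stale(self):
        # Single guarded entry point: skips the work if nothing changed since the last fit.
        if self._stale:
            self.optimize()


def run_current_flow(model):
    """Mimics the reported behaviour with scaling=True (assumed, simplified)."""
    model.mark_stale()  # enabling scaling changes the model's inputs
    model.optimize()    # triggered from the constructor via enable_scaling
    model.optimize()    # triggered again at the start of _optimize()
    return model.optimize_calls


def run_guarded_flow(model):
    """Same call sites, but both go through the guarded entry point."""
    model.mark_stale()
    model.optimize_if_stale()  # constructor: performs the fit
    model.optimize_if_stale()  # _optimize(): no-op, model is already up to date
    return model.optimize_calls


current_calls = run_current_flow(CountingModel())
guarded_calls = run_guarded_flow(CountingModel())
print("current flow:", current_calls, "optimizations")
print("guarded flow:", guarded_calls, "optimization(s)")
```

A staleness flag is only one option. Dropping the optimization from enable_scaling and leaving it entirely to _optimize() would reach the same goal with fewer moving parts, if nothing else depends on the model being fitted right after construction.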