As a continuation of #1943, I managed to automate the selection of the best models via @juanrojochacon's hyperopt algorithm, in which the 1/$\varphi^{2}$ data is used to decide among the best $\chi^{2}$ hyperpoints. Here I simply refer to it as the best_chi2_worst_phi2 algorithm.
To this end, I wrote a post-fit script based primarily on the validphys vp_hyperoptplot.py module, structured so that our own implementation will be easier later. Just in case, I attach it here: analysis_hyperopt.zip.
The core of the idea is presented in the code snippet below:
import numpy as np

args = {
    'loss_target': 'best_chi2_worst_phi2',  # select Juan & Roy's algorithm
    'max_phi2_points': 10,  # select the n lowest values of 1/phi2
    'threshold': 3.0,  # trials with a higher loss are excluded beforehand
}

# args is a dict, so its entries are read with [] rather than as attributes
if args['loss_target'] == "best_chi2_worst_phi2":
    # best_idx is taken to be the index of the trial with the lowest chi2 loss
    minimum = dataframe.loss[best_idx]
    std = np.std(dataframe.loss)
    lim_max = minimum + std
    # select rows with chi2 losses between the best point and lim_max
    selected_chi2 = dataframe[(dataframe.loss >= minimum) & (dataframe.loss <= lim_max)]
    # among the selected points, take the n lowest in 1/phi2
    selected_phi2 = selected_chi2.loss_reciprocal_phi2.nsmallest(args['max_phi2_points'])
    # look the trials up by index; matching float values with isin could
    # also pick up rows that lie outside the chi2 window
    indices = selected_phi2.index
    best_trial = dataframe.loc[indices]
Here, I define an interval between the chi2 minimum and minimum plus 1 standard deviation std of the loss, within which I then look at the corresponding 1/phi2 values. Among these, I take the n lowest 1/phi2 hyperpoints and save the selected models into best_trial. In the attached zip file I use as an example the runs I discussed on Monday with 10 replicas (because they give me many more points to test the algorithm). The final plot is shown below:

The yellow region marks the interval between the chi2 minimum (grey circle) and 1 standard deviation std of the loss data above it. I also asked the script for the 10 models within this region that have the lowest 1/phi2 values (cyan circles).
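For anyone who wants to try the selection without the zip file, here is a minimal self-contained sketch of the same steps. The column names loss and loss_reciprocal_phi2 are the ones used in the snippet above. The function name select_best_chi2_worst_phi2 and the random trials are hypothetical and only there to make the example run; they are not the actual hyperopt output.

```python
import numpy as np
import pandas as pd


def select_best_chi2_worst_phi2(dataframe, max_phi2_points=10, threshold=3.0):
    """Pick the trials with the lowest 1/phi2 among those whose chi2 loss
    lies within one standard deviation of the best chi2 loss."""
    # exclude trials whose loss is above the threshold
    kept = dataframe[dataframe["loss"] <= threshold]
    # chi2 window: [minimum, minimum + std], std taken over the kept losses
    minimum = kept["loss"].min()
    std = np.std(kept["loss"].to_numpy())
    window = kept[kept["loss"] <= minimum + std]
    # within the window, keep the n trials with the lowest 1/phi2
    return window.nsmallest(max_phi2_points, "loss_reciprocal_phi2")


# synthetic trials, for illustration only
rng = np.random.default_rng(0)
trials = pd.DataFrame(
    {
        "loss": rng.uniform(1.0, 4.0, size=200),
        "loss_reciprocal_phi2": rng.uniform(0.5, 5.0, size=200),
    }
)

best_trial = select_best_chi2_worst_phi2(trials, max_phi2_points=10, threshold=3.0)
print(best_trial)
```

Changing the width of the window (for instance 2 std instead of 1) only requires changing the minimum + std line, which makes it easy to check how sensitive the selected models are to that choice.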
Questions
- Is 1 std sufficient for our purposes? Note that for the analysis I selected a loss threshold of 3, so all models with higher losses were excluded from the DataFrame and from the analysis.
- When looking at the 1/phi2 values, which option is more physically sound: (i) 1/<phi2> or (ii) <1/phi2>? Note that in the analysis I use <1/phi2>.
- Is the idea to implement this later in validphys? I tried to run vp-hyperoptplot, but it always complains that it needs pandoc (even though I have pandoc installed).
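To make the second question more concrete, the sketch below compares the two options numerically. The phi2 values are made up purely for illustration. It does not say which option is more physically sound; it only shows that the two differ and that <1/phi2> is pulled up strongly by small phi2 values, whereas 1/<phi2> is not. For positive phi2, <1/phi2> is never smaller than 1/<phi2> (Jensen's inequality).

```python
import numpy as np

# hypothetical phi2 values entering the average (illustration only)
phi2 = np.array([0.05, 0.20, 0.25, 0.30])

reciprocal_of_mean = 1.0 / phi2.mean()    # option (i):  1/<phi2>
mean_of_reciprocal = (1.0 / phi2).mean()  # option (ii): <1/phi2>

print(f"1/<phi2> = {reciprocal_of_mean:.3f}")
print(f"<1/phi2> = {mean_of_reciprocal:.3f}")
```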
I would appreciate any comments, and ideas for improvement are always welcome.