Since the removal of the high tolerance in #3964, we have been noticing intermittent failures of the regression tests in CI on main. This is surprising because, on main, the regression tests for a commit `abcd` use the assets just generated for that same commit `abcd`: there should be no difference.
I explored whether this was non-determinism in the solver by switching to the Clarabel quadratic programming problem (QPP) solver in #3971; however, the problems persisted. That is not to say the switch was a bad decision that should be reversed: Clarabel is what the OSQP developers recommended we use some years ago.
I then checked whether the data going into and out of the QPP solve is exactly the same. On my Mac, the arrays going into and out of the QPP in the `LT-NOF` case are identical at all iterations, down to the last decimal place. However, when I compare my Mac and a Linux VM, there are very small differences (attached below). The difference is already present in the data coming from PROCESS on the first iteration, and therefore does not originate in VMCON or the QPP solver. As the solver runs, this error accumulates by a couple of orders of magnitude.
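For reference, the comparison described here reduces to a mean absolute error (MAE) between the arrays dumped on each machine. A minimal sketch (the dump/load harness is omitted, and `mae` is an illustrative helper, not PROCESS code):

```python
import numpy as np


def mae(a, b):
    """Mean absolute error between two same-shaped arrays."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))


# Arrays differing by roughly the magnitude seen between the two machines
mac_arr = np.linspace(0.0, 1.0, 5)
linux_arr = mac_arr + 1e-13
print(mae(mac_arr, linux_arr))  # ~1e-13
```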
To verify that this is not new, I went back to a version of PROCESS before this change (58b2c96), downversioned to PyVMCON 2.3.1 and CVXPY 1.6. This is what the CI would have been running when it reliably passed. I then ran the same comparison between my Mac and Linux VM. The results show that there is indeed a difference between the two machines from the first iteration in the data coming out of PROCESS. The MAEs for the PROCESS data (`deq`, `ie`, `die`) on the first iteration are the same as in the previous experiment. Unlike in the previous experiments, the differences remain around the same order of magnitude and stay below the reporting threshold of the PROCESS test suite, even when running with a 0% tolerance. Indeed, in our test suite we do not use an exact 0 tolerance for this reason:
From `PROCESS/tests/regression/test_process_input_files.py` (lines 192 to 202 in 8dd9e07):

```python
# Define relative tolerance
# Use pytest's default relative tolerance (1e-6)
# 0 tolerance causes floating-point discrepancies
# between local and CI runs or tolerance
# is a percentage, rel arg takes a fraction
rel_tolerance = None if (tolerance == 0) else (tolerance / 100)

try:
    # Use pytest.approx for relative and absolute comparisons:
    # handles values close to 0
    assert new_value == pytest.approx(ref_value, rel=rel_tolerance)
```
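For concreteness, `rel=None` makes `pytest.approx` fall back to its default relative tolerance (1e-6), which absorbs noise at the MAE level reported below, whereas an exact equality check does not:

```python
import pytest

ref_value = 1.0
new_value = 1.0 + 1e-13  # noise at the level of the MAEs reported here

# rel=None -> pytest.approx uses its default relative tolerance (1e-6)
assert new_value == pytest.approx(ref_value, rel=None)

# An exact (0 tolerance) comparison fails on the same pair
assert new_value != ref_value
```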
It is my opinion that reducing the QPP solver tolerance has simply exposed an issue that has existed in PROCESS all along: there are slight discrepancies between our results on different systems, and a lower QPP tolerance allows these to accumulate over a PROCESS run, producing very slightly different solver paths and hence solutions.
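To make the accumulation claim concrete, here is a deliberately simplified toy model; the constant per-iteration growth factor is a hypothetical illustration, not a measured property of VMCON:

```python
def accumulated_error(seed, growth_per_iter, iterations):
    """Toy model: a discrepancy amplified by a constant factor each iteration."""
    err = seed
    for _ in range(iterations):
        err *= growth_per_iter
    return err


# A 1e-13 platform difference growing 3x per iteration (hypothetical rate)
# crosses two orders of magnitude within five iterations
print(accumulated_error(1e-13, 3.0, 5))  # ~2.4e-11
```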
Differences in `solve_qsp` data between Mac and Linux VM for an LT-NOF PROCESS run using the Clarabel QPP solver:
deq@0 is different: MAE 1.6653345369377348e-13
ie@0 is different: MAE 9.472422846101836e-13
die@0 is different: MAE 4.274358644806853e-12
delta@0 is different: MAE 1.6758816556716738e-13
lamda_equality@0 is different: MAE 2.0961010704922955e-13
lamda_inequality@0 is different: MAE 6.326015812316642e-14
x@1 is different: MAE 1.674216321134736e-13
B@1 is different: MAE 2.8350655156827997e-12
f@1 is different: MAE 2.1316282072803006e-14
df@1 is different: MAE 4.196643033083092e-14
eq@1 is different: MAE 2.7755575615628914e-14
deq@1 is different: MAE 4.313216450668733e-13
ie@1 is different: MAE 6.543654507140673e-13
die@1 is different: MAE 2.3058943643405883e-11
delta@1 is different: MAE 2.0623919239071142e-12
lamda_equality@1 is different: MAE 7.405187574249794e-13
lamda_inequality@1 is different: MAE 8.139044993527023e-13
x@2 is different: MAE 2.0881074647149944e-12
B@2 is different: MAE 3.877786980410747e-11
f@2 is different: MAE 4.707345624410664e-14
df@2 is different: MAE 4.196643033083092e-14
eq@2 is different: MAE 6.061817714453355e-14
deq@2 is different: MAE 5.402345237826012e-13
ie@2 is different: MAE 6.633915639042698e-11
die@2 is different: MAE 1.3681287214240001e-09
delta@2 is different: MAE 1.4854523860963198e-12
lamda_equality@2 is different: MAE 1.201594379551807e-12
lamda_inequality@2 is different: MAE 6.062511603843745e-13
x@3 is different: MAE 8.271161533457416e-13
B@3 is different: MAE 5.5859317171780276e-11
f@3 is different: MAE 2.107203300738547e-13
df@3 is different: MAE 2.1538326677728037e-14
eq@3 is different: MAE 2.90878432451791e-14
deq@3 is different: MAE 3.643751966819764e-13
ie@3 is different: MAE 1.4147460980495907e-11
die@3 is different: MAE 2.496385320682748e-10
delta@3 is different: MAE 1.1022294188478554e-12
lamda_equality@3 is different: MAE 1.049249576112743e-11
lamda_inequality@3 is different: MAE 2.063987869505013e-12
x@4 is different: MAE 5.626610288800293e-13
B@4 is different: MAE 5.4939164328970946e-11
eq@4 is different: MAE 5.240252676230739e-14
deq@4 is different: MAE 6.319389456166391e-13
ie@4 is different: MAE 9.53481738008577e-12
die@4 is different: MAE 1.2168399621259596e-10
delta@4 is different: MAE 4.3482535280747747e-13
lamda_equality@4 is different: MAE 3.824274230623814e-12
lamda_inequality@4 is different: MAE 2.698188894534326e-13
x@5 is different: MAE 4.0101255649460654e-13
B@5 is different: MAE 6.160050247672189e-11
eq@5 is different: MAE 3.774758283725532e-15
deq@5 is different: MAE 2.7400304247748863e-13
ie@5 is different: MAE 3.019806626980426e-13
die@5 is different: MAE 3.26405569239796e-11
delta@5 is different: MAE 3.207095266032122e-10
lamda_equality@5 is different: MAE 9.877559855112072e-10
lamda_inequality@5 is different: MAE 3.334670087805358e-10
Differences in `solve_qsp` data between Mac and Linux VM for an LT-NOF PROCESS run using the OSQP QPP solver (very high relative tolerance), PyVMCON 2.3.1, CVXPY 1.6, and an older version of PROCESS:
deq@0 is different: MAE 1.6653345369377348e-13
ie@0 is different: MAE 3.1086244689504383e-15
die@0 is different: MAE 4.274358644806853e-12
delta@0 is different: MAE 1.8360313269738526e-13
lamda_equality@0 is different: MAE 1.9129142714291447e-13
lamda_inequality@0 is different: MAE 8.171241461241152e-14
x@1 is different: MAE 1.836308882730009e-13
B@1 is different: MAE 4.397149311330395e-12
f@1 is different: MAE 1.7985612998927536e-14
df@1 is different: MAE 1.0502709812953981e-13
eq@1 is different: MAE 2.4646951146678475e-14
deq@1 is different: MAE 5.344613640545504e-13
ie@1 is different: MAE 3.232969447708456e-13
die@1 is different: MAE 2.0210277895671425e-11
delta@1 is different: MAE 3.3911068397785016e-13
lamda_equality@1 is different: MAE 2.0672352718520415e-13
lamda_inequality@1 is different: MAE 8.836265052991621e-13
x@2 is different: MAE 3.6648462042876417e-13
B@2 is different: MAE 7.680611702198803e-11
f@2 is different: MAE 1.48991929904696e-13
df@2 is different: MAE 4.196643033083092e-14
eq@2 is different: MAE 6.261657858885883e-14
deq@2 is different: MAE 3.02174951727352e-13
ie@2 is different: MAE 2.128741627416275e-12
die@2 is different: MAE 3.342037757647631e-11
delta@2 is different: MAE 3.720634911275056e-13
lamda_equality@2 is different: MAE 1.6369128275073308e-12
lamda_inequality@2 is different: MAE 5.029865413064272e-13
x@3 is different: MAE 4.984901380566953e-13
B@3 is different: MAE 7.907985377642035e-11
f@3 is different: MAE 5.10702591327572e-14
eq@3 is different: MAE 7.815970093361102e-14
deq@3 is different: MAE 4.289901767151605e-13
ie@3 is different: MAE 7.254197242900773e-13
die@3 is different: MAE 2.4063417924935493e-11
delta@3 is different: MAE 4.373723605510804e-13
lamda_equality@3 is different: MAE 4.987676938128516e-12
lamda_inequality@3 is different: MAE 8.842232501748981e-13
x@4 is different: MAE 6.967759702547482e-13
B@4 is different: MAE 7.737455121059611e-11
eq@4 is different: MAE 2.4646951146678475e-14
deq@4 is different: MAE 8.699707620962727e-13
ie@4 is different: MAE 1.535327420754129e-12
die@4 is different: MAE 3.527134140313137e-11
delta@4 is different: MAE 1.2453059416994705e-13
lamda_equality@4 is different: MAE 1.3394840792102514e-13
lamda_inequality@4 is different: MAE 3.3113052230748785e-14
x@5 is different: MAE 6.7390537594747e-13
B@5 is different: MAE 6.901146321069973e-11
eq@5 is different: MAE 6.661338147750939e-16
deq@5 is different: MAE 5.551115123125783e-13
ie@5 is different: MAE 6.225020499073253e-13
die@5 is different: MAE 1.7564616427989677e-11
delta@5 is different: MAE 2.93050306243714e-13
lamda_equality@5 is different: MAE 5.5297454983060934e-14
lamda_inequality@5 is different: MAE 3.992444395917172e-14
x@6 is different: MAE 7.194245199571014e-13
B@6 is different: MAE 5.74296166178101e-11
f@6 is different: MAE 4.440892098500626e-16
eq@6 is different: MAE 6.661338147750939e-16
deq@6 is different: MAE 4.969358258222201e-13
ie@6 is different: MAE 8.643086246706844e-13
die@6 is different: MAE 2.0371260234242072e-11
delta@6 is different: MAE 1.6653692314072543e-13
lamda_equality@6 is different: MAE 4.535130951333066e-14
lamda_inequality@6 is different: MAE 3.3518327002823867e-14