Fix FAST.Farm issues with OMP (segfaults mostly) #2711
Merged
andrew-platt merged 14 commits into OpenFAST:rc-4.0.3 on Apr 8, 2025
`close(Un)` is not atomic, so the file may not have been released before the unit is declared available. This can cause problems when many files are opened simultaneously. There are other places that need this fix as well.
deslaughter approved these changes on Apr 1, 2025
Co-authored-by: Derek Slaughter <deslaughter@gmail.com>
Renamed `fileopenNWTCio_critical` to `fileopen_critical` so that all file opens share the same OMP critical section
Also remove from Read84AryWDefault. Somehow this was triggering a segfault with IFX. No idea how.
Segmentation faults can occur if the `OMP PARALLEL DO` needs more private memory per thread than the default `OMP_STACKSIZE="4 M"` allows. If this happens, set `export OMP_STACKSIZE="32 M"` or another suitably large value. The calculated stack requirements do not work out exactly as I would expect, but they are in the ballpark (see code note)
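As a minimal sketch of the workaround in the commit message above, the snippet below inspects and raises the per-thread stack size in the shell that will launch the run. The `32 M` value is the one suggested above; the default is implementation-defined in OpenMP, so the `4 M` figure may differ between compilers and platforms.

```shell
# Show the current setting; if unset, the OpenMP runtime uses its own default.
echo "OMP_STACKSIZE=${OMP_STACKSIZE:-<unset>}"

# Raise the stack size for threads created by the OpenMP runtime.
# Must be exported before the program starts; it is read at runtime start-up.
export OMP_STACKSIZE="32 M"
echo "OMP_STACKSIZE=${OMP_STACKSIZE}"

# OMP_STACKSIZE does not apply to the initial (main) thread;
# that stack is governed by the shell's own limit, shown here.
ulimit -s
```

If segfaults persist after raising `OMP_STACKSIZE`, the main-thread limit printed by `ulimit -s` is worth checking too, since the two are set independently.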
The routine isn't actually used... yet. For completeness, add the critical section around the close so it isn't an issue later when the routine is actually used
2468228 to 5361499
Ready to merge
Feature or improvement description
There have been unit number collisions with `OMP` parallelization around `ReadHighResWindFile` causing FAST.Farm to frequently fail for unknown reasons. This problem was tracked down using the IFX compiler and may partially exist with GCC with OpenMP.
A few changes:
- `GetNewUnit` with revised logic
- `2^16-1 = 65535` - this should work for most clusters, but may cause issues on smaller machines (increase it with `ulimit -n #` on *nix machines, no clue how to change on Windows).
- `close(Un)` in `!$OMP critical(fileopen_critical)` - this was the main problem
- `debug` flag `nouninit` (forgot to add in Add "nouninit" to debug flags for IntelLLVM #2709)
- `OMP stacksize`
- `(in x.y days)` to the simulation status message (useful for really long simulations where end day is ambiguous)
Related issue, if one exists
Many issues in the past, some reported on GH (sorry, you'll have to search as I'm feeling lazy right now)
Impacted areas of the software
FAST.Farm should see a small increase in speed, but more importantly should no longer have unit number collisions between threads
Additional supporting information
NOTE: there are likely other `close(Unit)` locations that need this `!$OMP critical(fileopen_critical)` wrapping
Test results, if applicable
Tests are run single threaded to reduce load on GH actions, so no changes there.
Thanks to @deslaughter for help in debugging.
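The description mentions raising the open-file limit with `ulimit -n` on smaller machines. A minimal sketch of checking and raising that limit for the current shell follows; the `target` value is purely illustrative and is not taken from this PR, since the description does not say how many descriptors a given run needs.

```shell
# Illustrative target only (an assumption, not a value from the PR).
target=4096

soft=$(ulimit -Sn)   # current soft limit on open file descriptors
hard=$(ulimit -Hn)   # ceiling an unprivileged user may raise the soft limit to
echo "open files: soft=$soft hard=$hard"

# Raise the soft limit for this shell and its children, capped at the hard limit.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$target" ]; then
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$target" ]; then
        ulimit -Sn "$target"
    else
        ulimit -Sn "$hard"
    fi
fi
echo "open files now: $(ulimit -Sn)"
```

The change only affects processes started from this shell; raising the hard limit itself generally requires administrator configuration.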