FAST.Farm speedup with OMP #2730
Closed
andrew-platt wants to merge 17 commits into OpenFAST:rc-4.0.3 from
The `close(Un)` is not atomic, so it may not have released the file before the unit is declared available. This can cause issues when opening a large number of files simultaneously. There are other places that need this fix as well.
Co-authored-by: Derek Slaughter <deslaughter@gmail.com>
Changed `fileopenNWTCio_critical` to `fileopen_critical` so that all file opens share the same OMP critical section
Also remove from Read84AryWDefault. Somehow this was triggering a segfault with IFX. No idea how.
Segmentation faults can occur if an `OMP PARALLEL DO` has enough private memory per thread that it exceeds the default `OMP_STACKSIZE="4 M"`. If this happens, set `export OMP_STACKSIZE="32 M"` or another suitably large value. The calculated values don't work out exactly as I would expect, but they are in the ballpark (see code note).
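To make the stack-size reasoning above concrete, here is a minimal Python sketch of the back-of-the-envelope estimate. It is not code from this PR: the function names, the array shapes, and the 2x headroom factor are assumptions for illustration, and as the commit message notes, such an estimate is only a ballpark.

```python
from math import prod

def private_bytes(shapes, bytes_per_element=8):
    """Sum the memory of all thread-private arrays, given their shapes."""
    return sum(prod(shape) * bytes_per_element for shape in shapes)

def suggest_stacksize_mib(shapes, bytes_per_element=8, default_mib=4, headroom=2.0):
    """Double the assumed default stack size until the estimate (with headroom) fits."""
    need_mib = private_bytes(shapes, bytes_per_element) * headroom / 2**20
    size = default_mib
    while size < need_mib:
        size *= 2
    return size

if __name__ == "__main__":
    # Hypothetical private arrays: one 3 x 64 x 64 x 64 single-precision wind block
    shapes = [(3, 64, 64, 64)]
    mib = suggest_stacksize_mib(shapes, bytes_per_element=4)
    print(f'export OMP_STACKSIZE="{mib} M"')
```

The headroom factor is there because compiler temporaries and call frames also live on the thread stack, so the raw array total tends to underestimate what is needed.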
The routine isn't actually used... yet. But for completeness adding the critical around the close so it isn't an issue later when I actually use the routine
Merge after #2711
a174b2f to 05ec200
05ec200 to a6afa3a
Going to cherry-pick pieces out of this for 4.0.3, then revamp the parallelization changes (too much OMP overhead in how I set it up).
Closing this. Testing showed very little additional speed increase.
Ready to merge
Feature or improvement description
Added some `!$OMP parallel` directives around obvious pieces that could be parallelized.
A few changes:
- `COLLAPSE(2)` around `ReadHighResWindFile` in AWAE.f90 to include inner loop
- `ComputeLocals`
- `LowResGridCalcOutput` (second part)
Related issue, if one exists
#2711
Impacted areas of the software
FAST.Farm may see a speed increase, and should see more effective multi-threading.
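As a rough illustration of why the `COLLAPSE(2)` change can make multi-threading more effective, the Python sketch below (not the FAST.Farm Fortran) counts how many iterations each thread would receive when only the outer loop is shared versus when both loops are collapsed into one iteration space. The function names, the loop bounds, and the even contiguous split are assumptions; the real distribution depends on the OpenMP schedule and runtime.

```python
def static_split(n_iters, n_threads):
    """Iterations per thread under an even, contiguous split (assumed schedule)."""
    base, extra = divmod(n_iters, n_threads)
    return [base + (1 if t < extra else 0) for t in range(n_threads)]

def unflatten(k, n_inner):
    """Map a collapsed iteration index k back to (outer, inner) loop indices."""
    return divmod(k, n_inner)

if __name__ == "__main__":
    n_outer, n_inner, n_threads = 3, 2, 4  # hypothetical loop bounds and thread count
    print(static_split(n_outer, n_threads))            # outer loop only
    print(static_split(n_outer * n_inner, n_threads))  # both loops collapsed
    print([unflatten(k, n_inner) for k in range(n_outer * n_inner)])
```

With these hypothetical numbers, sharing only the outer loop leaves one of the four threads idle, while the collapsed loop gives every thread some work. The gain shrinks when the outer loop already has many more iterations than there are threads.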
Additional supporting information
In the process of profiling, we found a few places that were obviously missing parallelization. One place that remains is the low resolution wind grid reading -- that is all contained in a single VTK, so there is no way to speed it up in the present form.
Test results, if applicable
Test results will not change.