Skip to content

Conversation

@MaxSagebaum
Copy link
Contributor

@MaxSagebaum MaxSagebaum commented Jul 3, 2025

Proposed Changes

The current solution to fix the problem in SU2_CFD/include/solvers/CFVMFlowSolverBase.inl for preaccumulation is not optimal.

The AD::GetValue calls will disable the dependency on omegaMax but the computation of the norm will be taped. Making the tape passive will remove this overhead.

Analysis of the previous error (before the tape recording debug mode)

The code:

628    for (unsigned long iPoint = 0; iPoint < nPoint; ++iPoint) {
   1 
   2     const auto VelocityGradient = nodes->GetVelocityGradient(iPoint);
   3     auto Vorticity = nodes->GetVorticity(iPoint);
   4 
   5     /*--- Vorticity ---*/
   6   
   7     Vorticity[0] = 0.0;
   8     Vorticity[1] = 0.0;
   9     Vorticity[2] = VelocityGradient(1,0)-VelocityGradient(0,1);
  10   
  11     if (nDim == 3) {
  12       Vorticity[0] = VelocityGradient(2,1)-VelocityGradient(1,2);
  13       Vorticity[1] = -(VelocityGradient(2,0)-VelocityGradient(0,2));
  14     }
  15 
  16     /*--- Strain Magnitude ---*/
  17                                
  18     const su2double vy = nodes->GetVelocity(iPoint, 1);
  19     const su2double y = geometry->nodes->GetCoord(iPoint, 1);
  20     AD::StartPreacc();
  21     AD::SetPreaccIn(VelocityGradient, nDim, nDim);
  22     AD::SetPreaccIn(vy, y);
  23   
  24     StrainMag(iPoint) = 0.0;
  25   
  26     /*--- Add diagonal part ---*/
  27 
  28     for (unsigned long iDim = 0; iDim < nDim; iDim++) {
  29       StrainMag(iPoint) += pow(VelocityGradient(iDim, iDim), 2);
  30     }
  31     if (config.GetAxisymmetric() && y > EPS) {
  32       StrainMag(iPoint) += pow(vy / y, 2);
  33     }
  34     
  35     /*--- Add off diagonals ---*/
  36   
  37     StrainMag(iPoint) += 2.0*pow(0.5*(VelocityGradient(0,1) + VelocityGradient(1,0)), 2);
  38     
  39     if (nDim == 3) {
  40       StrainMag(iPoint) += 2.0*pow(0.5*(VelocityGradient(0,2) + VelocityGradient(2,0)), 2);
  41       StrainMag(iPoint) += 2.0*pow(0.5*(VelocityGradient(1,2) + VelocityGradient(2,1)), 2);
  42     }
  43   
  44     StrainMag(iPoint) = sqrt(2.0*StrainMag(iPoint));
  45     AD::SetPreaccOut(StrainMag(iPoint));
  46   
  47     /*--- Max is not differentiable, so we not register them for preacc. ---*/
  48     strainMax = max(strainMax, StrainMag(iPoint));
  49     omegaMax = max(omegaMax, GeometryToolbox::Norm(3, Vorticity));
  50 
  51     AD::EndPreacc();
  52   }
  53   END_SU2_OMP_FOR
  54   
  55   if ((iMesh == MESH_0) && (config.GetComm_Level() == COMM_FULL)) {
  56     SU2_OMP_CRITICAL {
  57       StrainMag_Max = max(StrainMag_Max, strainMax);
  58       Omega_Max = max(Omega_Max, omegaMax);
  59     }
  60     END_SU2_OMP_CRITICAL
  61                   
  62     BEGIN_SU2_OMP_SAFE_GLOBAL_ACCESS
  63     {
  64       su2double MyOmega_Max = Omega_Max;
  65       su2double MyStrainMag_Max = StrainMag_Max;
  66                                                      
  67       SU2_MPI::Allreduce(&MyStrainMag_Max, &StrainMag_Max, 1, MPI_DOUBLE, MPI_MAX, SU2_MPI::GetComm());
  68       SU2_MPI::Allreduce(&MyOmega_Max, &Omega_Max, 1, MPI_DOUBLE, MPI_MAX, SU2_MPI::GetComm());
  69     }
  70     END_SU2_OMP_SAFE_GLOBAL_ACCESS
  71   }

In line 48 and 49 strainMax and omegaMax are assigned. They get a CoDiPack id and for omegaMax some statements are recorded. In line 51 the preaccumulation is finished. This removes the statements for the computation of omegaMax. Nevertheless, the ids for these variables are still there. They have now an invalid id and this id will be given to a completely random other variable. If these variables are not used, then this would not be a problem. After the loop in line 57 and 58 they are used, which creates a random link in the tape, which can have arbitrary consequences. So we have a real bug here in SU2_CFD_AD.

In the new Version this bug does no longer exist but the overhead of the computation of the norm is now recorded.

Analysis of the fluctuation in test disc_adj_fsi

I wanted to analyze what is going wrong. On my machine, I get a good result with optimized binaries. I get a wrong result with debug binaries. So there seems something else going on in this test case.

PR Checklist

Put an X by all that apply. You can fill this out after submitting the PR. If you have any questions, don't hesitate to ask! We want to help. These are a guide for you to know what the reviewers will be looking for in your contribution.

  • [ x] I am submitting my contribution to the develop branch.
  • [ x] My contribution generates no new compiler warnings (try with --warnlevel=3 when using meson).
  • [ x] My contribution is commented and consistent with SU2 style (https://su2code.github.io/docs_v7/Style-Guide/).
  • [ x] I used the pre-commit hook to prevent dirty commits and used pre-commit run --all to format old commits.
  • [ -] I have added a test case that demonstrates my contribution, if necessary.
  • [ -] I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.

Fix a bug in AD::PauseTaping and AD::ResumeTaping.
@MaxSagebaum MaxSagebaum changed the title WIP: Reduce the tapeing overhead in CFVMFlowSolverBase.inl. Reduce the tapeing overhead in CFVMFlowSolverBase.inl. Jul 4, 2025
@MaxSagebaum
Copy link
Contributor Author

Pipeline seems fine now. I removed the WIP.

@pcarruscag pcarruscag merged commit 5627e49 into su2code:develop Jul 4, 2025
37 of 38 checks passed
@MaxSagebaum
Copy link
Contributor Author

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants