Perform Detailed Performance Profiling #99
Closed
Issue: Perform Detailed Performance Profiling
Description:
Use specialized profiling tools to identify the primary performance bottlenecks across the entire workflow, so that optimization effort is guided by measurements rather than guesswork.
Tasks:
- Choose appropriate profiling tools for the target hardware (e.g., NVIDIA Nsight Systems/Compute or AMD rocprof for GPUs; perf, Intel VTune, TAU, or Score-P for CPU/MPI/OpenMP).
- Run the simulation under the profiler for a representative case.
- Analyze the profile data to determine time spent in:
- Solver iterations vs. Solver/Preconditioner setup.
- MPI communication vs. Computation.
- CPU vs. GPU kernels (if applicable).
- CPU-GPU data transfers.
- Mask generation vs. Solve phase.
- I/O (e.g., plotfile writing).
- Prioritize optimization efforts based on the identified bottlenecks.
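
One possible way to turn the profile into a priority list is sketched below. It assumes the per-phase times have already been extracted from the profiler as exclusive (non-overlapping) wall-clock seconds. The function name `rank_phases`, the `PhaseReport` type, and the phase labels and numbers in the usage example are all hypothetical placeholders, not measurements from this project. For each phase it reports the share of total time and the Amdahl's-law upper bound on overall speedup if that phase were made free.

```python
from dataclasses import dataclass


@dataclass
class PhaseReport:
    name: str
    seconds: float
    fraction: float      # share of the total measured time
    max_speedup: float   # Amdahl bound on overall speedup if this phase cost nothing


def rank_phases(timings: dict[str, float]) -> list[PhaseReport]:
    """Rank phases by exclusive time, largest first.

    `timings` maps a phase label to its exclusive wall-clock seconds.
    The phases are assumed not to overlap, so their sum is the total.
    """
    total = sum(timings.values())
    if total <= 0:
        raise ValueError("total measured time must be positive")

    reports = []
    for name, seconds in timings.items():
        fraction = seconds / total
        # Amdahl's law with this phase sped up infinitely: 1 / (1 - f).
        max_speedup = float("inf") if fraction >= 1.0 else 1.0 / (1.0 - fraction)
        reports.append(PhaseReport(name, seconds, fraction, max_speedup))

    return sorted(reports, key=lambda r: r.seconds, reverse=True)


if __name__ == "__main__":
    # Placeholder numbers for illustration only; replace with profiler output.
    example = {
        "solver_iterations": 40.0,
        "solver_setup": 10.0,
        "mpi_communication": 20.0,
        "mask_generation": 5.0,
        "plotfile_io": 5.0,
    }
    for r in rank_phases(example):
        print(f"{r.name:20s} {r.seconds:8.2f} s  {r.fraction:6.1%}  "
              f"max overall speedup {r.max_speedup:.2f}x")
```

The speedup bound is the useful part for prioritization: a phase that takes 5% of the run can improve the total by at most about 1.05x no matter how well it is optimized, so it should rank below anything with a larger share.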