
Extract benchmarking timers into a separate class #4291

Merged

Priya2698 merged 6 commits into main from pm/timer on Apr 24, 2025
Conversation

@Priya2698 Collaborator

The motivation is to use them in Thunder.

@github-actions
github-actions bot commented Apr 23, 2025

Review updated until commit 34189d3

Description

  • Extracted benchmarking timers into separate classes.

  • Introduced TorchProfileTimer and FusionProfileTimer.

  • Refactored NVFBenchmark to use these new timer classes.


Changes walkthrough 📝

Relevant files
Enhancement

core.py — Refactor benchmark timers (benchmarks/python/core.py)

  • Removed inline timer functions.
  • Added import for TorchProfileTimer and FusionProfileTimer.
  • Modified __init__ to use new timer classes.
  • Updated set_fd method to set fd in FusionProfileTimer.
  • Simplified cleanup method to call self._timer.cleanup().
  • +12/-85

benchmark_utils.py — Add new timer classes (nvfuser/benchmark_utils.py)

  • Created base Timer class.
  • Implemented TorchProfileTimer class.
  • Implemented FusionProfileTimer class.
  • +105/-0
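The refactor described in the walkthrough can be sketched as follows. The timer classes are reduced to stubs so the wiring is runnable; the real ones live in nvfuser/benchmark_utils.py, and NVFBenchmark's actual constructor arguments may differ (this is an illustration of the described changes, not the exact diff):

```python
class Timer:
    def __init__(self):
        self.current_time = 0.0

    def cleanup(self):
        pass


class TorchProfileTimer(Timer):  # stub for the CUDA-time timer
    pass


class FusionProfileTimer(Timer):  # stub for the host-time timer
    def __init__(self):
        super().__init__()
        self.fd = None

    def set_fd(self, fd):
        self.fd = fd


class NVFBenchmark:
    def __init__(self, benchmark_fixture, device="cuda"):
        # __init__ now picks a timer class instead of defining inline timers.
        self._timer = FusionProfileTimer() if device == "host" else TorchProfileTimer()
        self.benchmark = benchmark_fixture

    def set_fd(self, fd):
        # Forwards the FusionDefinition to the host-time timer.
        assert isinstance(self._timer, FusionProfileTimer)
        self._timer.set_fd(fd)

    def cleanup(self):
        # Delegates teardown to whichever timer is in use.
        self._timer.cleanup()
```

This keeps core.py free of measurement logic: the benchmark class only selects, forwards to, and tears down a timer object.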

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Possible Issue

The cleanup method in the NVFBenchmark class calls self._timer.cleanup(). The base Timer class defines cleanup as a no-op, so the call itself is safe for every subclass; verify that any subclass holding resources (e.g. TorchProfileTimer and its profiler instance) overrides it with real teardown.

self._timer.cleanup()
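A no-op cleanup() on the base class is what makes this call safe: subclasses without teardown logic inherit it, and only subclasses that hold resources override it. A minimal sketch of the pattern (stub classes for illustration, not the PR's code):

```python
class Timer:
    def cleanup(self):
        # Default: nothing to tear down.
        pass


class TorchProfileTimer(Timer):
    def cleanup(self):
        # The real class stops a running torch profiler here.
        return "profiler stopped"


class FusionProfileTimer(Timer):
    pass  # inherits the no-op cleanup


# NVFBenchmark.cleanup() can call timer.cleanup() unconditionally:
for timer in (TorchProfileTimer(), FusionProfileTimer()):
    timer.cleanup()  # never raises
```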
Extensibility Concern

The set_fd method in the NVFBenchmark class asserts that self._timer is an instance of FusionProfileTimer. This assumption might not hold if other timer classes are introduced in the future, leading to potential runtime errors.

def set_fd(self, fd):
    assert isinstance(self._timer, FusionProfileTimer)
    self._timer.set_fd(fd)
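One forward-compatible alternative (an editorial sketch, not code from the PR) is to dispatch on capability rather than concrete type, so any future timer class exposing set_fd keeps working:

```python
class Bench:
    """Illustrative stand-in for NVFBenchmark's set_fd dispatch."""

    def __init__(self, timer):
        self._timer = timer

    def set_fd(self, fd):
        # Duck typing: accept any timer that exposes set_fd, and fail
        # with a clear error for timers that do not.
        setter = getattr(self._timer, "set_fd", None)
        if setter is None:
            raise TypeError(
                f"{type(self._timer).__name__} does not accept a FusionDefinition"
            )
        setter(fd)


class FdTimer:
    """Hypothetical timer exposing set_fd."""

    def __init__(self):
        self.fd = None

    def set_fd(self, fd):
        self.fd = fd
```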
Missing Documentation

The new Timer, TorchProfileTimer, and FusionProfileTimer classes lack detailed docstrings explaining their purpose, methods, and usage. This could make it difficult for other developers to understand and use these classes effectively.

# Imports the snippet below relies on (from the top of nvfuser/benchmark_utils.py):
import torch
from torch.autograd import DeviceType
from torch.profiler import profile, ProfilerActivity


# Base class for all timers used by pytest-benchmark.
class Timer:
    def __init__(self):
        self.current_time = 0.0

    def _increment_global_time(self, elapsed_time: float) -> None:
        self.current_time += elapsed_time

    def __call__(self):
        raise NotImplementedError("Subclass must implement this method")

    def cleanup(self):
        pass


class TorchProfileTimer(Timer):
    def __init__(self):
        super().__init__()
        self.prof = profile(activities=[ProfilerActivity.CUDA])

    def _get_kernel_time(
        self, prof_averages: torch.autograd.profiler_util.EventList
    ) -> float:
        """
        Arguments:
            prof_averages: Output of self.prof.key_averages()
        Returns:
            time_value: Elapsed CUDA time in seconds.
        """
        elapsed_cuda_time = 0
        has_cuda_event = False
        for event in prof_averages:
            if event.device_type != DeviceType.CUDA:
                continue
            has_cuda_event = True
            # Re: torch profiler API changes in https://github.com/pytorch/pytorch/pull/123247
            elapsed_cuda_time = (
                elapsed_cuda_time + event.self_device_time_total
                if hasattr(event, "self_device_time_total")
                else event.self_cuda_time_total
            )
        assert has_cuda_event, "No CUDA events found"
        return elapsed_cuda_time / 1e6

    def __call__(self):
        """
        Custom torchprofiler-based timer used by pytest-benchmark.
        At every timer call, the profiler is stopped to compute the elapsed CUDA time
        and the global clock is incremented. The profiler is restarted before returning to continue tracing.

        Returns:
            self.current_time: Global monotonic clock variable
        """
        try:
            self.prof.stop()
        except AssertionError:
            self.prof.start()
            return self.current_time

        prof_averages = self.prof.key_averages()
        elapsed_cuda_time = self._get_kernel_time(prof_averages)
        self._increment_global_time(elapsed_cuda_time)
        # Clear the internal profiler object to avoid accumulating function events and then restart the profiler
        # See PR: https://github.com/pytorch/pytorch/pull/125510
        self.prof.profiler = None

        return self.current_time

    def cleanup(self):
        """
        Stops a running torchprofiler instance if found.
        """
        try:
            self.prof.stop()
        except AssertionError:
            pass


class FusionProfileTimer(Timer):
    def __init__(self):
        super().__init__()
        self.fd = None
        # Specifies if the timer in host measurement is called at the start/finish of execution.
        # Timings are measured at the end of execution.
        self.execution_start = True

    def set_fd(self, fd):
        self.fd = fd

    def __call__(self):
        if not self.execution_start:
            profile = self.fd.profile()
            elapsed_host_time = profile.host_time_ms / 1e3
            self._increment_global_time(elapsed_host_time)
        self.execution_start = not self.execution_start
        return self.current_time

@Priya2698 Collaborator Author

!test --pybench

@jjsjann123 Collaborator

stamping for mechanical code movement.

@Priya2698 Collaborator Author

    !build

@Priya2698 Collaborator Author

    !build

Priya2698 merged commit 7477e4b into main Apr 24, 2025
16 checks passed
Priya2698 deleted the pm/timer branch April 24, 2025 03:43