While we have the timing decorator in utils, it only registers CPU time, which can be a poor indication of the actual processing time when work runs asynchronously on the GPU. It would be nice to have some default profiling tools that take GPU time into account.
PyTorch does supply a profiler (torch.autograd.profiler); however, the interface is more complex than most use cases require.
I suggest we provide a couple of common use cases of torch's profiler, implemented as decorators for easy usage.
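
To make the idea concrete, here is a rough sketch of what one such decorator could look like. The name `profile_call`, the `last_profile` attribute, and the choice of sort key are all hypothetical, not existing code in utils. It assumes the `use_cuda` flag of `torch.autograd.profiler.profile`, which newer PyTorch releases may flag as deprecated, and it falls back to CPU-only profiling when no GPU is present.

```python
import functools

import torch
from torch.autograd import profiler


def profile_call(fn):
    """Hypothetical decorator: run `fn` under torch.autograd.profiler.

    Prints a per-operator summary after each call and keeps the raw
    profiler object on `wrapper.last_profile` for further inspection.
    """

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Assumption: only request CUDA timings when a GPU is available,
        # so the same decorator also works on CPU-only machines.
        if torch.cuda.is_available():
            ctx = profiler.profile(use_cuda=True)
        else:
            ctx = profiler.profile()

        with ctx as prof:
            result = fn(*args, **kwargs)

        wrapper.last_profile = prof
        # Sorted by total CPU time here; a CUDA sort key could be used
        # instead, but its name differs between PyTorch versions.
        print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
        return result

    wrapper.last_profile = None
    return wrapper
```

Usage would then just be `@profile_call` on top of any function that runs torch ops. Other variants (for example one that exports a Chrome trace) could follow the same pattern.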