Skip to content

Conversation

@kunalspathak
Copy link
Contributor

RetainBigDiffFilesOnly will make jit-analyze to delete the .dasm files and retain just the top count files that contains largest diffs. This will help us in reducing the .dasm files size when we upload them as part of CI. By default, this option is off and hence we will never delete those files.

Thoughts?

@kunalspathak
Copy link
Contributor Author

kunalspathak commented Nov 16, 2021

@dotnet/jit-contrib , @BruceForstall - any thoughts?

Copy link
Contributor

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea seems good. Some comments/suggestions:

  1. We already have a list of the base/diff files (baseList / diffList in Main) in a FileInfo list, that includes the full path. Those paths are copied into the FileDelta structure. So it seems like we shouldn't do a directory search for files to delete, we should use the file paths we already have.
  2. What if you initialize a member of FileInfo (or FileDelta), retain = false, only used if the retain option is used, set it to true in the DisplayFileMetric code, then simply iterate over the lists towards the end of Main deleting files where this is false?
  3. Nit: name arg -retainOnlyTopFiles instead? Why? "Diff" implies it wouldn't delete "Base" files (which it would/should). "Top" implies the highest metric value, which might not be the "biggest". Maybe even more general: "-retainReportedFiles", meaning to retain any files on which we report a regression/improvement. We might choose to report both the largest, weighted, code size diffs. But we might also want to report the smallest, which are often the easiest to analyze. In fact, for the purpose of saving things in the CI, it would be quite useful to include both those sets.

@kunalspathak
Copy link
Contributor Author

Done.

@kunalspathak kunalspathak changed the title Add option of RetainBigDiffFilesOnly to jit-analyze Add option of retainOnlyTopFiles to jit-analyze Nov 17, 2021
@kunalspathak
Copy link
Contributor Author

Related PR where this option will be consumed - dotnet/runtime#61700

@kunalspathak
Copy link
Contributor Author

But we might also want to report the smallest, which are often the easiest to analyze. In fact, for the purpose of saving things in the CI, it would be quite useful to include both those sets.

I will keep that as a future improvement. Let us monitor and see how big the size of top (largest) subset is and if there is a need to see at smaller subset. I usually just pick the smallest of the top files that are reported.

Co-authored-by: Bruce Forstall <brucefo@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants