Project Cache initial implementation#5936
Conversation
| CacheHit, | ||
| CacheMiss, | ||
| CacheNotApplicable, | ||
| CacheError |
There was a problem hiding this comment.
Is it worth keeping CacheError? I'm thinking of dropping it in favor of just checking PluginLoggerBase.HasLoggedErrors. If we keep both, what should the user or msbuild do if the plugin returns CacheError but no errors are logged? Seems simpler to just keep one mechanism for signaling errors.
There was a problem hiding this comment.
Yeah I agree, the way its mostly done is its success unless an error was logged.
There was a problem hiding this comment.
+1 for using PluginLoggerBase.HasLoggedErrors
There was a problem hiding this comment.
K, I'll remove CacheError
There was a problem hiding this comment.
So it turns out CacheError may have a good use: when the plugin is queried for a project but it encounters and logs one or more errors, what should the plugin return? It could either return a null CacheResult or a CacheResult of result type CacheError (returning a CacheMiss or CacheNotApplicable doesn't seem a good fit for the error case). But then there's the above ambiguity when a plugin could return CacheError without logging any errors, which I guess could be fine.
An alternative is to remove the enum entirely. On cache hits the plugin returns a CacheResult with the build results (or proxy targets and whatnot). On anything else (cache miss, cache not applicable, cache error) the plugin returns a null CacheResult and logs why it couldn't satisfy the request. Worst case we get a plugin that returns null and does not log anything.
I am inclined now to keep CacheError in order to make the return modes explicit.
Let me know what you think.
There was a problem hiding this comment.
I'm slightly more inclined to return null on any failure case. The main question in my mind is whether you'd react differently to CacheMiss, CacheNotApplicable, and CacheError, but the right response, from what I can tell, is to pretend the cache doesn't exist and build normally for all three cases. May as well just have a single "build everything" return value.
There was a problem hiding this comment.
Decided to just replace CacheError with a None value. Hit, Miss, and NA are domain level concepts that sort of define the meaning of a plugin so I like to have them explicit. Without them, we'd rely on arbitrary plugin implementations on how they express these 3 possibilities, versus MSBuild giving a standardized message for each type of response. Another concrete use case for having them is to enable MSBuild to report the cache hit ratio in the build summary (hits / (hits + misses), need NA to not count them as misses). That's a very useful metric for cacheability health (which, just like perf, is something that tends to degrade over time).
| ErrorUtilities.VerifyThrow(!submission.IsCompleted, "Submission already complete."); | ||
|
|
||
| lock (_syncLock) | ||
| if (ProjectCacheIsPresent()) |
There was a problem hiding this comment.
Start review here. This method is the entrypoint from where the logic starts diverting when a cache is available.
| CacheHit, | ||
| CacheMiss, | ||
| CacheNotApplicable, | ||
| CacheError |
There was a problem hiding this comment.
Yeah I agree, the way its mostly done is its success unless an error was logged.
| CacheHit, | ||
| CacheMiss, | ||
| CacheNotApplicable, | ||
| CacheError |
There was a problem hiding this comment.
+1 for using PluginLoggerBase.HasLoggedErrors
| using Microsoft.Build.BackEnd; | ||
| using Microsoft.Build.Collections; | ||
| using Microsoft.Build.Evaluation; | ||
| using Microsoft.Build.Experimental.ProjectCache; |
There was a problem hiding this comment.
how long do we plan on keeping this in the Experimental namespace?
There was a problem hiding this comment.
I'd like to keep it at least until we validate it working for two of our internal build accelerators.
| }) | ||
| .ToArray()); | ||
|
|
||
| var cacheItems = nodeToCacheItems.Values.SelectMany(i => i).ToHashSet(); |
There was a problem hiding this comment.
What's the purpose of SelectMany here?
There was a problem hiding this comment.
I'm not sure I follow why the conversion to an array then a hashset, can't the list of items be stored as a hashset to begin with?
There was a problem hiding this comment.
It's inconsequential, I think. Each node can declare zero or more plugins, so first I construct a dictionary from a node to the collection of plugins it declares. But the contract is that there can be a single plugin (path + plugin settings), and all nodes must declare that plugin. So in order to find that single plugin I flatten (SelectMany) the collections of plugins from each node into a single set (ProjectCacheItem implements Equals and GetHashCode to make this correct). I could have skipped the dictionary by flattening everything from the start, but I also want to give a nice error message with all the nodes that may be missing the plugin. And given that this method is not on a hot path, I went with a lot of LINQ :)
There was a problem hiding this comment.
I'm not sure I follow everything here. Wouldn't this be union rather than intersection, so wouldn't this find all the plugins declared by any node? How does this help you find the one from all of them? And I'm not sure how I follow how this turns into an error.
There was a problem hiding this comment.
This part constructs a dictionary from every node to all the declared plugins in that node:
https://github.com/cdmihai/msbuild/blob/acb4b7e2e0ffbd4a9df13c6fba11dd6aa5f37944/src/Build/BackEnd/BuildManager/BuildManager.cs#L1866-L1882
This part flattens (via SelectMany) all the declared plugins into a single set which removes the duplicates according to the overriden equals and hashcode in ProjectCacheItem.
https://github.com/cdmihai/msbuild/blob/acb4b7e2e0ffbd4a9df13c6fba11dd6aa5f37944/src/Build/BackEnd/BuildManager/BuildManager.cs#L1884
The set should contain a single item if all nodes declare a single plugin (plugin path + plugin settings).
Error cases:
- set contains more than 1 plugins (error prints all declared plugins)
- set contains a single plugin but there are nodes which do not declare it (error prints all nodes that do not declare the plugin)
There was a problem hiding this comment.
Neat! Didn't realize SelectMany also flattened. If I were writing it, I might have said nodeToCacheItems.Values.Aggregate((x, y) => x.Union(y));, but I don't think that's better (or worse) than what you currently have.
I was originally confused in thinking that a given node could declare multiple plugins, and it was ok as long as only one of those was declared by all nodes, but I see that was wrong. Makes more sense now.
d6193b9 to
d688951
Compare
d260ada to
a196184
Compare
|
/azp run #Resolved |
|
Azure Pipelines successfully started running 1 pipeline(s). #Resolved |
a196184 to
acb4b7e
Compare
Forgind
left a comment
There was a problem hiding this comment.
I've essentially looked at documentation + BuildManager so far. I know we don't agree about all the omnisharp styling things, but it also shows up in my diff without whitespace changes, which makes it more confusing.
| var solutionPath = config.Project.GetPropertyValue(SolutionProjectGenerator.SolutionPathPropertyName); | ||
|
|
||
| ErrorUtilities.VerifyThrow( | ||
| solutionPath != null && !string.IsNullOrWhiteSpace(solutionPath) && solutionPath != "*Undefined*", |
There was a problem hiding this comment.
If someone opens a single project rather than a solution, we should be able to use that as an "entrypoint" rather than the solution, right? Good fallback?
There was a problem hiding this comment.
Good point. I'll check and see what global properties VS sets in that case.
There was a problem hiding this comment.
VS lies when opened on a single project and reports a non existing solution path. For now I'll leave it as is and reconsider if it turns to be a common case.
| }) | ||
| .ToArray()); | ||
|
|
||
| var cacheItems = nodeToCacheItems.Values.SelectMany(i => i).ToHashSet(); |
There was a problem hiding this comment.
I'm not sure I follow everything here. Wouldn't this be union rather than intersection, so wouldn't this find all the plugins declared by any node? How does this help you find the one from all of them? And I'm not sure how I follow how this turns into an error.
Forgind
left a comment
There was a problem hiding this comment.
This is a little large to review too deeply, but I think I have a reasonable surface-level understanding now, thanks! I recognize that you wanted to imitate what is returned from an actual build (which clearly would allocate a lot), but I'm wondering if @ladipro might want to look, since he's been working to reduce (even temporary) allocations.
| CacheHit, | ||
| CacheMiss, | ||
| CacheNotApplicable, | ||
| CacheError |
There was a problem hiding this comment.
I'm slightly more inclined to return null on any failure case. The main question in my mind is whether you'd react differently to CacheMiss, CacheNotApplicable, and CacheError, but the right response, from what I can tell, is to pretend the cache doesn't exist and build normally for all three cases. May as well just have a single "build everything" return value.
| }) | ||
| .ToArray()); | ||
|
|
||
| var cacheItems = nodeToCacheItems.Values.SelectMany(i => i).ToHashSet(); |
There was a problem hiding this comment.
Neat! Didn't realize SelectMany also flattened. If I were writing it, I might have said nodeToCacheItems.Values.Aggregate((x, y) => x.Union(y));, but I don't think that's better (or worse) than what you currently have.
I was originally confused in thinking that a given node could declare multiple plugins, and it was ok as long as only one of those was declared by all nodes, but I see that was wrong. Makes more sense now.
ff9d2fa to
eca9219
Compare
|
No pipelines are associated with this pull request. |
|
/azp run |
Also cleanup needless double interface declarations
Merge had some failing preconditions that do not apply in this case
Co-authored-by: Forgind <Forgind@users.noreply.github.com>
This reverts commit 2d35e8c.
benvillalobos
left a comment
There was a problem hiding this comment.
Posting comments as I go. Haven't quite dug into the "meat and potatoes" yet.
| var buildParameters = _buildParametersPrototype.Clone(); | ||
|
|
||
| using (var buildManagerSession = new Helpers.BuildManagerSession(_env, _buildParametersPrototype)) | ||
| using (var buildManagerSession = new Helpers.BuildManagerSession(_env, buildParameters)) |
There was a problem hiding this comment.
why do we need to use a copy of the buildparameters here?
There was a problem hiding this comment.
Because the BuildManagerSession mutates it to add some loggers and other BM cleanup options to avoid impacting other tests.
rainersigwald
left a comment
There was a problem hiding this comment.
Haven't gotten through everything here but no objection. Merge away while I'm out :)
| <CopyNuGetImplementations>false</CopyNuGetImplementations> | ||
| <GenerateAssemblyInfo>false</GenerateAssemblyInfo> | ||
|
|
||
| <TargetFrameworks>netcoreapp2.1</TargetFrameworks> |
Forgind
left a comment
There was a problem hiding this comment.
Not 100% confident, but I don't think any of the new new code will run unless the user specifies it.
Ready for review. Start with the documentation which should describe all the big changes here.
Todos:
Plugin.EndBuildAsyncwhenPlugin.EndBuildAsyncalso throws an exceptionIssues addressed in future PRs:
MSBuildFileSystemto the plugin for both graph and non graph scenarios.Since it's a bigger PR you might consider using CodeFlow to make it easier to review, it has a chrome extension: https://www.1eswiki.com/wiki/CodeFlow_integration_with_GitHub_Pull_Requests (msft internal link)