cccheck: Cached analysis gets horrendously slow over time #423
Description
For database-based caching, cccheck uses Entity Framework with lazy-loading. This means that whenever Method.Assemblies is touched, it loads data for every single assembly associated with that method. An example of a piece of code that triggers this data load is here. The query that this expression generates is:
exec sp_executesql N'SELECT
[Extent2].[AssemblyId] AS [AssemblyId],
[Extent2].[Name] AS [Name],
[Extent2].[Created] AS [Created],
[Extent2].[IsBaseLine] AS [IsBaseLine],
[Extent2].[SourceControlInfo] AS [SourceControlInfo]
FROM [dbo].[AssemblyInfoMethods] AS [Extent1]
INNER JOIN [dbo].[AssemblyInfo] AS [Extent2] ON [Extent1].[AssemblyInfo_AssemblyId] = [Extent2].[AssemblyId]
WHERE [Extent1].[Method_Id] = @EntityKeyValue1',N'@EntityKeyValue1 bigint',@EntityKeyValue1=4
An assembly entry in the database represents a unique assembly being analyzed. Entries are not keyed by name; they appear to be keyed by some sort of hash, though I'm not sure. What I do know is that:
- Analyzing the built assemblies from a project multiple times does not result in new assembly entries.
- Rebuilding and analyzing the assemblies from a project multiple times results in new assembly entries for each rebuild.
Therefore, the number of records loaded for a method grows as O(M × N), where:
- M is the number of assemblies that a given method appears in, per build, and
- N is the number of times a project has been built for static analysis
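The growth described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: it uses an in-memory SQLite database rather than the real cache, the two tables are cut down to the columns the join needs, and the helper name `rows_loaded_for_method` is hypothetical. It assumes every rebuild registers a fresh set of assembly entries that all reference the same method, then runs a join of the same shape as the generated query and counts the rows that come back.

```python
import sqlite3


def rows_loaded_for_method(assemblies_per_build: int, rebuilds: int) -> int:
    """Count the rows the assembly join returns for one shared method.

    Hypothetical helper: the schema is a simplified stand-in for the
    AssemblyInfo / AssemblyInfoMethods tables named in the query above.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE AssemblyInfo (
            AssemblyId INTEGER PRIMARY KEY,
            Name TEXT
        );
        CREATE TABLE AssemblyInfoMethods (
            AssemblyInfo_AssemblyId INTEGER,
            Method_Id INTEGER
        );
        """
    )

    method_id = 4  # same key value as in the captured query

    # Assumption: each rebuild adds new assembly entries (same names, new rows),
    # and the shared method is linked to every one of them.
    for _ in range(rebuilds):
        for index in range(assemblies_per_build):
            cursor = conn.execute(
                "INSERT INTO AssemblyInfo (Name) VALUES (?)",
                (f"Assembly{index}",),
            )
            conn.execute(
                "INSERT INTO AssemblyInfoMethods VALUES (?, ?)",
                (cursor.lastrowid, method_id),
            )

    # Same join shape as the lazy-load query: every linked assembly is returned.
    rows = conn.execute(
        """
        SELECT e2.AssemblyId, e2.Name
        FROM AssemblyInfoMethods AS e1
        INNER JOIN AssemblyInfo AS e2
            ON e1.AssemblyInfo_AssemblyId = e2.AssemblyId
        WHERE e1.Method_Id = ?
        """,
        (method_id,),
    ).fetchall()
    conn.close()
    return len(rows)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, rows_loaded_for_method(assemblies_per_build=5, rebuilds=n))
```

Under these assumptions the row count is M × N by construction, so a single touch of the collection gets more expensive with every rebuild that is added to the cache.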
For example, System.Diagnostics.Contracts.ContractDeclarativeAssemblyAttribute.#ctor() is added to every assembly that gets statically checked, so analyzing it makes cccheck load a huge number of records. At my team's scale, with multiple analyses per day of projects containing multiple assemblies, that is about 10,000 records for each day since the cache was last cleaned.
This causes an enormous slowdown over time. A full build with a fresh cache takes my team about 15-20 minutes, but as the cache database grows this can reach 60-80 minutes. For comparison, a full build without any cache takes around 60 minutes.
Looking at the code in Clousot, I don't see any way to fix this performance hit without rewriting the caching layer from scratch.
@SergeyTeplyakov @hubuk Any ideas on how to make this work faster?