Optimized find all references and reduced memory usage in VS #8339
Conversation
This is ready for the most part. I added a new public lexing API and marked it as experimental; we really need a better API for lexing, and my hope is that this one will improve over time. It was needed so that find all references could get syntactic classification without using the existing API, since I only wanted classification for a small span of text. It isn't perfect, though, and will not be accurate for tokens that span multiple lines.
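For context on the existing line-based lexing, here is a minimal sketch of driving the public FCS tokenizer one line at a time (this is the older `FSharpSourceTokenizer` API, not the new experimental one from this PR; it assumes an FCS version where the lexer state exposes `Initial`):

```fsharp
open FSharp.Compiler.SourceCodeServices

// Tokenize a single line of F# source. Lexer state threads across lines,
// which is why tokens spanning multiple lines are hard to classify
// correctly from a single-line view.
let tokenizeLine (line: string) =
    let sourceTok = FSharpSourceTokenizer([], Some "snippet.fsx")
    let tokenizer = sourceTok.CreateLineTokenizer(line)
    let rec loop state acc =
        match tokenizer.ScanToken(state) with
        | Some tok, state -> loop state (tok :: acc)
        | None, _ -> List.rev acc
    loop FSharpTokenizerLexState.Initial []

tokenizeLine "let x = 1"
|> List.iter (fun t -> printfn "%s (%d-%d)" t.TokenName t.LeftColumn t.RightColumn)
```

This produces one `FSharpTokenInfo` per token on the line; classifying only a small span means discarding most of that work, which is part of the allocation problem described above.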
@dsyme I think this can be reviewed now. The big things are:
The code looks great
My question is really about the memory-mapped file. You say "it uses memory but not the process's memory". I don't understand this. My impression of memory-mapped files is that they are "mapped into the process's address space", and my mental model is that if an mmap is 1GB big then 1GB of process address space is used. The actual contents of the file may or may not be in physical memory, but that's true for anything in the process address space - the contents are only brought into physical memory as needed. The memory mapping does, however, consume virtual address space, which is limited to 4GB for VS.
So if that's correct, then this MemoryMappedFile burns VS devenv.exe address space? I thought if you wanted to get the data out of the process address space then you'd have to use an actual file on disk, like a temporary file?
> my mental model is that if an mmap is 1GB big then 1GB of process address space is used.
Now it could be that the above statement is somehow wrong for ViewStream over mmap files. If so could you include a link to definitive documentation about that? Or a sample that shows that you can create, say, 10x1GB mmap streams (using the combination of calls we are using here to create them) in a 32 bit process, and have them all live and accessible?
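One way to probe this empirically (a hypothetical sketch, not from the PR): compare the process's virtual memory size before and after mapping a view, since `VirtualMemorySize64` reflects reserved address space rather than physical memory in use.

```fsharp
open System.Diagnostics
open System.IO.MemoryMappedFiles

let virtualBytes () =
    Process.GetCurrentProcess().VirtualMemorySize64

let before = virtualBytes ()
use mmf = MemoryMappedFile.CreateNew(null, 1024L * 1024L * 256L) // 256 MB
use view = mmf.CreateViewStream() // mapping the view reserves address space
let after = virtualBytes ()
printfn "Virtual address space grew by ~%d MB" ((after - before) / (1024L * 1024L))
```

If the mental model above is right, the growth should be roughly the size of the mapped view even though little or no physical memory has been touched yet.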
```fsharp
[<Sealed>]
type DocumentCache<'Value when 'Value : not struct>() =
    let cache = new MemoryCache("fsharp-cache")
    let policy = CacheItemPolicy(SlidingExpiration = TimeSpan.FromSeconds 2.)
```
Why 2.0 seconds here?
With anything under 2 seconds, I believe, the caching stops working, meaning it won't actually cache the item. It's a really silly bug. So 2 seconds is really the minimum we can go, IIRC.
Can this be added as a comment?
Yea, makes sense.
Can we make it tunable via an environment variable, like a lot of other such settings? I think that's good practice in case we have to have customers do in-situ testing of a different setting.
We could expose this as a setting, but I really think we should not. What we have should just work without any tweaking. Adding more time could make it worse; remember, if we cache the same item again, it resets the sliding expiration timer back to zero.
I don't think this should be a setting.
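For reference, a minimal sketch of the sliding-expiration behavior being discussed (assuming `System.Runtime.Caching` is referenced; the PR's `DocumentCache` wraps this same mechanism): re-adding an item under the same key restarts its expiration window.

```fsharp
open System
open System.Runtime.Caching

let cache = new MemoryCache("example-cache")

let set (key: string) (value: obj) =
    // Each Set restarts the 2-second sliding window for this key.
    // Note: values much below ~2 seconds reportedly fail to cache at all,
    // which is why 2 seconds is treated as the minimum here.
    let policy = CacheItemPolicy(SlidingExpiration = TimeSpan.FromSeconds 2.)
    cache.Set(key, value, policy)

set "doc1" (box "parsed-results")
printfn "%b" (cache.Contains "doc1")
// After 2+ seconds with no access, the entry is evicted.
```

Because every access slides the window forward, a document being actively checked stays cached, while idle documents fall out quickly; that is the design intent behind not making the value tunable.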
Resolved review comment on vsintegration/src/FSharp.Editor/Classification/ClassificationService.fs (outdated).
Thank you for the feedback, @dsyme. I'll be looking over everything.
Memory-Mapped Files
This is what we are doing, with the exception of using it for IPC. I don't think it is even possible to share the information over IPC, because we give the MMF name a null value. Though, as found by @baronfel, Mono unfortunately does not allow a null name in its MMF implementation, while Desktop and Core do, so we might need to special-case that here. This is the API we use:
Regarding memory use, an MMF uses virtual memory, which could be backed by RAM or the page file; it should not consume the process's (devenv.exe) private memory, but it will consume its address space. While this does use memory, it lowers the memory pressure on the actual process.
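To illustrate the API shape in question (a sketch using only the BCL calls named above): `CreateNew` with a `null` name creates an anonymous map that cannot be opened by another process, which is why sharing it over IPC is not possible here.

```fsharp
open System.IO.MemoryMappedFiles

// An anonymous (null-named) memory-mapped file: private to this process.
use mmf = MemoryMappedFile.CreateNew(null, 4096L)

// Write through one view accessor...
use writer = mmf.CreateViewAccessor()
writer.Write(0L, 42L)

// ...and read the same backing memory through another.
use reader = mmf.CreateViewAccessor()
printfn "%d" (reader.ReadInt64 0L) // 42
```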
I think this isn't wrong.

```fsharp
open System
open System.IO.MemoryMappedFiles

let create1GB () =
    let size = 1024 * 1024 * 1024 // 1 GB
    let mmf = MemoryMappedFile.CreateNew(null, int64 size)
    let view = mmf.CreateViewStream()
    (mmf, view)

[<EntryPoint>]
let main argv =
    let tenMMF = Array.init 10 (fun _ -> create1GB ())
    Console.ReadLine() |> ignore
    Console.WriteLine(tenMMF)
    0
```

This will explode because of the address space used by the MMF views; it is over 2 GB of address space.
I've marked this as approved. I don't mind if the MMF is in the process address space if it's still a good way to store the data compactly outside the .NET heap (and we could always move it to a temp file?). I'll leave it for you to decide though.
…8339)

* Added ItemKey.fsi/fsi. Added blank SemanticClassification.fs/fsi.
* Raise disposed exception
* Re-worked semantic classification. Renamed ItemKeyReader to ItemKeyStore. Exposing ItemKeyStore/Builder
* Fixing build
* Storing semantic classification
* Caching semantic classification
* Wiring it up
* Need to fix lexing
* Added experimental lexing API to handle find all refs syntactic classification from allocating a lot
* Added System.Memory
* Using Span to check equality without allocating
* Allocate less
* Fixing build. Reducing more allocations and not using lex filter on lexing tokens.
* Remove langversion
* Fixed record find all refs
* Fixing test
* Partial match for active pattern
* Feedback changes
* Added comment on TcResolutionsExtensions
* Creating view accessor when needed in ItemKey. Fixed UnionCase find all refs.
* Added comment on warning
* Added Range.comparer. Moving opens to top of file
* More feedback changes
* Added comment on sliding expiration

The aim of this PR is to optimize find all references and reduce overall memory usage in VS, or potentially other editors depending on their needs.
To accomplish this, we need to stop storing full symbol information for every single file in every project in the incremental builder. That will break find all references and rename refactoring, though, so we have to do a bit of work.
The solution I propose here is as follows:
* A `SymbolKey` for our symbols.
* An `ItemKeyStore`: a memory-mapped file that contains a contiguous block of memory of ranges and a string (the symbol key), stored in the `IncrementalBuilder`.
* The `ItemKeyStore` can easily query for all locations of a given `Item`.
* The `Tokenizer` could be used, but it's quite complicated and allocates A LOT, which would slow find all references considerably and increase memory pressure; neither of which we want. Fixing it might be more trouble than it's worth at the moment.
* Both the storage of symbol keys and semantic classification in the incremental builder will be disabled by default.
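As a rough illustration of the storage layout described (all names here are hypothetical; the real `ItemKeyStore` writes into a memory-mapped file view rather than a `MemoryStream`), a contiguous blob of range-plus-key records makes find all references a linear scan that compares keys without building an index:

```fsharp
open System.IO
open System.Text

// Hypothetical simplified layout: for each entry, a (startLine, startCol,
// endLine, endCol) range followed by a length-prefixed UTF-8 symbol key.
let build (entries: (int * int * int * int) list) (keys: string list) =
    use ms = new MemoryStream()
    use w = new BinaryWriter(ms)
    for (sl, sc, el, ec), key in List.zip entries keys do
        w.Write sl; w.Write sc; w.Write el; w.Write ec
        let bytes = Encoding.UTF8.GetBytes key
        w.Write bytes.Length
        w.Write bytes
    ms.ToArray()

// Scan the blob for all ranges whose key matches.
let findAll (blob: byte[]) (key: string) =
    use r = new BinaryReader(new MemoryStream(blob))
    let results = ResizeArray()
    while r.BaseStream.Position < r.BaseStream.Length do
        let range = r.ReadInt32(), r.ReadInt32(), r.ReadInt32(), r.ReadInt32()
        let len = r.ReadInt32()
        let k = Encoding.UTF8.GetString(r.ReadBytes len)
        if k = key then results.Add range
    List.ofSeq results

let blob = build [ (1, 4, 1, 5); (3, 8, 3, 9) ] [ "val x"; "val x" ]
printfn "%A" (findAll blob "val x")
```

Keeping the keys in a flat byte blob (or an MMF view, as in the PR) rather than as .NET strings is what moves this data off the managed heap.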
This design isn't perfect; I would rather not store the `ItemKeyStore` and semantic classification in the incremental builder, but at the moment it is the path of least resistance. I think we can resolve that by intercepting check results, via a public API callback, while the `IncrementalBuilder` is checking files. That way it is up to the callback to determine what we do with the information, leaving it out of the incremental builder's responsibility. But I feel a little awkward doing that.

I will be porting over a lot of work that was done in a prototype, and this PR will be the real thing.
The prototype showed a significant memory reduction in VS, even without calling find all references, because full symbol information is no longer stored in memory. Find-all-references performance also improved significantly for large solutions.