Skip to content

Conversation

@nguerrera
Copy link
Contributor

@nguerrera nguerrera commented Aug 15, 2022

Summary

  • Make syntax trees deeply readonly in their TS typing.
  • Remove mutation of symbol tables contained in syntax trees in checker.
  • Reuse syntax trees for unchanged files in language server.
  • Reuse entire program when no files have changed in language server.
  • Fix other miscellaneous perf issues found while profiling the impact of this change:
    • Avoid repeat loading of config file
    • Avoid repeat creation of schema validators
    • Avoid unnecessary parse in findDocumentHighlight.
  • Add environment variables to log language server call timing and/or save .cpuprofile files to help with future perf work in this area.

Performance

I'm seeing approximately 2X less time spent in language server for typing the same sequence into petstore.cadl and navigating around with goto-definition, document-highlight:

image

I was hoping more than this. It turns out that we still lose a fair amount of time in I/O as the file system cache doesn't prevent us from hitting the disk walking directories, realpath'ing, and so on. We could go after that next. We currently have to process all of the imports before we can decide that a program is unchanged and that's where we incur this I/O. There's no reading of unchanged file content, but there is reading of unchanged directory structure.

Here is document-highlight cpu profile with a cache hit on the whole program. Almost all of the time is spent in realpath, stat, etc.:

image

I filed #894 to try to make the cache smarter to deal with this, but I don't think it's high priority. You will likely need a lot of files/imports to perceive it.

It also becomes apparent that parsing is already quite fast and the features that were only parsing and visiting the syntax tree for a single document (e.g. semantic-tokens, folding) didn't have much to gain for reasonably sized documents. The biggest win is on features like goto-definition, find-references, document-highlight that run on unchanged programs but need a full compilation.

Details

  • We used to inject symbols into symbol tables hanging off syntax trees to implement using statements including the implicit using Cadl and member references. To avoid this, we now do a copy-on-write to the symbol tables. This adds an extra map lookup for each call to resolveIdentifierInTable, where we first look for a copy of the table and then do the actual lookup. No noticeable slowdown was observed in profiling based on this. It does double the time in resolveIdentifierInTable as 1 lookup becomes two, but this does not make a significant difference overall and I think it's a worthwhile trade to get the reuse.

  • createProgram gains a new oldProgram argument from which we will reuse syntax trees if the host gave us identical SourceFile instances. We will also reuse the entire oldProgram and skip checking if we find that the compiler options and source file set are unchanged. Language server keeps track of last compilation for a given entry point to provide oldProgram. Over time, it's possible that we can reuse more of oldProgram in more cases.

  • CompilerHost gains a new optional parseCache that allows sharing syntax trees between programs even if oldProgram is not supplied or the oldProgram didn't have a particular source file. This is particularly useful for language server features that do not need an entire program but just the syntax tree of the current document. The language server also uses this.

  • ParseOptions (currently include comments or not) can now be specified on CompilerOptions. Language server uses this to always include comments so that features that need comments can share trees with features that don't.

Fix #623
Fix https://github.com/Azure/cadl-azure/issues/619

@azure-pipelines
Copy link

You can try these changes at https://cadlplayground.z22.web.core.windows.net/prs/868/

@nguerrera nguerrera force-pushed the reuse-parse-and-bind branch from 0142f0e to 71c8640 Compare August 15, 2022 18:37
@nguerrera nguerrera force-pushed the reuse-parse-and-bind branch from 4dc4d7b to aa84bb1 Compare August 22, 2022 18:48
@nguerrera nguerrera marked this pull request as ready for review August 22, 2022 19:04
@nguerrera nguerrera changed the title [WIP] Make syntax tree typing deeply readonly and reuse them in language server Make syntax tree typing deeply readonly and reuse them in language server Aug 22, 2022
}
}

// (nicholg) Why do we have both compile and createProgram?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo? should we just remove compile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, TODO. Didn't want to make a breaking change as part of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove the comment and file an issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #914 and removed the comment.

@@ -1,5 +1,6 @@
import type { JSONSchemaType as AjvJSONSchemaType } from "ajv";
import { Program } from "./program";
import { JSONSchemaValidator } from "./schema-validator";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import { JSONSchemaValidator } from "./schema-validator";
import { JSONSchemaValidator } from "./schema-validator.js";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer applicable as we define it in types.ts instead of importing it.

@nguerrera nguerrera merged commit 33c2c4d into microsoft:main Aug 24, 2022
@nguerrera nguerrera deleted the reuse-parse-and-bind branch August 24, 2022 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Some parts of syntax nodes are still typed writable

2 participants