Seperate out importing code by bagel897 · Pull Request #208 · lyz-code/autoimport

bagel897 · 2022-03-30T15:28:59Z

I'm working on creating a plugin for autoimport for pylsp and would find it useful if we could get a list of all possible imports without formatting the file so we can implement autoimport when you autocomplete, not on formatting. I split off the relevant parts off to the SourceCodeBase class and exposed two new functions - find_package and get_all_packages. I don't need find_package, but it was largely implemented and could lead to a better solution if get_all_packages doesn't work.

Checklist

Add search for modules to get_all_packages
Implement caching for get_all_packages.
Add test cases to all the changes you introduce
Update the documentation for the changes

lyz-code · 2022-03-31T07:50:03Z

Thanks @bageljrkhanofemus , thanks for starting the contribution.

As far as I can see right now it looks good. Can't wait to see it finished :)

Maybe you're interested in #39 too

bagel897 · 2022-04-01T05:25:12Z

So a little update, I'm working on trying to find out what modules are installed, and wanted to see how rope handles this part. So it turns out Rope has their own implementation (with caching and a similar featureset) here.

It has the same issue, it doesn't automatically find standard library modules
It can't handle "Resources" outside of the ropeproject

However, it can determine the python path(s) of the current project, which will make that part easier.
I think the best approach is to de-duplicate the work, use the rope autoimport code as a backend (if/once I can make it detect standard modules) and replace the actually finding imports here with that.

lyz-code · 2022-04-01T08:13:52Z

Hi @bageljrkhanofemus I've got some doubts, please bear with me as it's been some time since I dove into autoimports code:

I haven't used LSP before, why do you need to run autoimport at autocomplete? will that update the file when you try to autocomplete a variable?
I've skimmed through the code of the rope project, it looks good, but beside delegating in them the code I don't see the benefits of using their backend. Can you share your thoughts on the strong points of this approach?
As far as I remember, we currently know what modules are installed, what is the problem here?
I don't see any installation guide of rope, would it mean that we add them as a library dependency? or will the users need to install additional software to make autoimport work?

What I do see is a great idea is:

Build a cache of the installed packages and the project objects.
Build a tree of the project files so that we can implement relative imports

I'm very excited of your ideas and implementation, please keep me updated

bagel897 · 2022-04-01T13:39:24Z

Autoimport while writing code

So say I'm writing code, and want to include typing's List class in my code. The current workflow is

I type out the List manually
I get a undefined name error
I save, formatting the code, triggering a run of autoimport (automatically via pylsp-autoimport or manually via autoimport)

A better workflow would be to

Determine what modules are available (possibly with caching)
When the user is typing, they'll see the option for autocompleting List, with the description that it imports from typing
When the user selects it, it finds out what line of code it needs to place the import statement and does so, without saving the file.

How do I do that

Autoimport does store a list of all available imports for the project, typing, and common statements. However, for a PYTHONPATH module search, it uses importlib to determine which python files we can use. Since we can't assume autoimport is running from the same path as the project its working on, this is unsatsifactory for 2 reasons.

It may have issues if autoimport and the project have different modules available, which will almost certainly be the case
It can't give us available imports ahead of time (as far as I can tell)

So what the code needs to do is

find the pythonpath of the project (Rope does this by storing it in preferences)
find the list of modules in each python path and watch for changes (WIP)
Cache the list of available imports for each modules (both your extract package objects and rope can do this)
Return the correct import for the given name for postfixing, or all imports for autocompletion
cache the result (rope can mostly do this)

So rope has a couple of advantages for embedding in pylsp, namely:

rope is already an optional dependency of pylsp
rope is designed to be embedded and has been embedded in pylsp
it has the ability to watch for changes in the project (which can probably be expanded to external modules)
it has less dependencies than autoimport (it just depends on itself and python and its installable via pip)
It has information on project files and their locations relative to each other (though I'm not sure if it uses that currently).

Progress so far

I got rope to cache all the packages in my site-packages (took like 3 minutes)

I got autoimport to do something similar, but it seems slower

lyz-code · 2022-04-01T14:05:10Z

Thanks for the explanation it was cristal clear.

I see your point on the need of being able to run autoimport at autocompletion time, I'm thrilled to see the outcome of this PR, it looks like you're giving the package an awesome twist, thank you.

I'm curious too to know how does rope manage the cache and how does it watch the changes in the pythonpath.

We currently find the PYTHONPATH with pyprojroot, I don't know if it can fulfill your needs. I wouldn't have any problem ditching it in favour of rope.

Thanks again :)

bagel897 · 2022-04-01T18:19:18Z

While that is useful (especially since rope needs that anyway), it doesn't give us the full information
Our imports basically come from 2 sources

Items inside the project
a) we can find the project location with pyprojectroot
b) rope needs to know the project location
Items found from installed modules and python standard library which actually consists of multiple paths (IE: /usr/lib/python3.10, /usr/lib/python3.10/site-pacakges, local folders, venvs, etc). Rope uses a configuration file and appends it to sys.path to get this

Rope manages a cache for the project itself, watches for changes and updates that accordingly (rope's codebase is larger so I haven't read through the whole thing yet) and it stores its cache in a .ropeproject folder.
However, it's cache doesn't handle modules being removed/added.
I've got a basic layer where autoimport uses rope under the hood. It has a better module name parser and is faster at parsing the entire stdlib. I'm working on how to store/update the data for external modules in a convenient manner.

lyz-code · 2022-04-05T09:24:39Z

I've given it a little bit of thought and this is what I came out with:

To make the autocompletion quicker, we could split the reading of cache and the update. So when we use the autocomplete it will only fetch items from the cache, and when it saves it updates the cache. Another option would be to only read the cache when saving and autocompleting, and have a daemon in another process that is monitoring the file changes and update the cache then. An alternative to rope's system could be using inotify to monitor the file changes.

Right now the vim's integration with Ale runs autoimport in a temp file in tmp so it's difficult to know what project does that file belong to. After seeing other integrations (flakehell, pylint, mypy), there's a possibility to run ale with a cwd argument that specifies the project's root directory. So we can assume that we know the original file path.

bagel897 · 2022-04-05T16:41:12Z

inotify would only work when you're editing the file. Say I do this

Install tensorflow (large example module)
Start editing
Stop editing (and autoimport)
uninstall tensorflow or update tensorflow
Start editing again

I don't know a good way to watch for 4 besides hashing the directory or something.
Right now I'm working on the rope version. At a minimum, the AST parser is much quicker than the inspect based one and avoids importing modules.
As for embedding, I mean to embed rope directly to pylsp, but will still try to contribute any improvements here (as a CLI interface).

lyz-code · 2022-04-06T09:52:21Z

One way of monitoring 4 could be by package manager hooks, I'm currently using PDM, and it looks like it's easy to write a plugin that updates the cache on package changes. I will add it as an optional feature so as not to force the users to use PDM

lieryan · 2022-04-07T07:53:03Z

Simplest way is to path.stat() and check for filesystem metadata (stat_result), specifically the st_mtime (modification time) and st_size (file size), and maybe the st_ctime (creation time) too. That's how Git quickly detects file changes.

It's not fool-proof since mtime can be modified manually, but it'll be much faster than any methods that reads off the file content like hashing the whole directory and it should work for the vast majority of the time.

Compared to package manager hooks, it would not require modifying the virtualenv directory nor for the editor's plugin virtualenv to be the same as the project virtualenv.

Compared to inotify, saving file metadata would not require a deamon that needs to remain alive while the files/folder was being modified, so when using inotify, you will still need a way to find changes at startup since the daemon shuts down. inotify is faster than metadata scanning though, so for when you already have a persistent daemon (e.g. pylsp), inotify can be a good optimization.

bagel897 · 2022-04-07T16:47:25Z

Woah that's smart, I'll add it to my code for updating packages. That's going to be way faster than hashing anything.

bagel897 · 2022-05-25T12:19:08Z

Update:
I've been working on a rope autoimport backend, which analyzes all the available modules and will try to find names in them. In rope 1.1.1, it now works and can be used. I wrote code using that for pylsp for interactive. I wrote a basic wrapper for the rope version in here. However, it has several disadvantages:

It will have significantly more imports to choose from, since rope-autoimport scans all available packages. Unlike pylsp, the cli tool cannot choose
It will be slower, because it has to index all these modules
They have different goals and therefore performance needs - CLI knows whats missing ahead of time, pylsp doesn't.

The first two can be solved by scanning the pyproject.toml for dependencies and using those (which I will eventually implement in rope). However, the goal of that project is interactive autoimport in pylsp and I will not finish this PR since it is out of scope.

Thank you for your patience and help.

lyz-code · 2022-05-26T12:34:25Z

I'm so sad to hear that @bageljrkhanofemus , I completely understand though. Can you please add a link to your project so that if anyone want's to continue this initiative can take it as a reference?

bagel897 · 2022-05-26T13:06:32Z

Most of the work is in rope's new sqlite autoimport. Its documented in docs/library.rst and docstrings in source. Ideally, this will be moved to RTD. If there are bugs/features, file an issue and tag me.
This PR has a partially working implementation which someone could use as a base. I haven't touched it in a while but the API will not have changed.
Add a plugin to provide autoimport functionality python-lsp/python-lsp-server#199. Is another front-end for rope autoimport with some features like more advanced import sorting.

bageljr added 2 commits March 30, 2022 10:17

refactor code to allow standalone finding of imports

b49efa5

make find_packages public

865ee89

remove shadow variable

112ab60

bagel897 added 3 commits April 8, 2022 13:53

switch completely to rope backend

6472117

remove unnessecary services

09ec950

pass some tests

bec34e4

lyz-code mentioned this pull request May 12, 2022

Import objects by mimicking other modules #212

Closed

bagel897 added 2 commits May 25, 2022 17:40

use up-to-date rope

26f0fd4

relock, use sqlite

b64a042

bagel897 closed this May 25, 2022

lyz-code mentioned this pull request Jul 14, 2022

Search object in all available packages #39

Open

Conversation

bagel897 commented Mar 30, 2022

Checklist

Uh oh!

lyz-code commented Mar 31, 2022

Uh oh!

bagel897 commented Apr 1, 2022

Uh oh!

lyz-code commented Apr 1, 2022

Uh oh!

bagel897 commented Apr 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Autoimport while writing code

How do I do that

Progress so far

Uh oh!

lyz-code commented Apr 1, 2022

Uh oh!

bagel897 commented Apr 1, 2022

Uh oh!

lyz-code commented Apr 5, 2022

Uh oh!

bagel897 commented Apr 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lyz-code commented Apr 6, 2022

Uh oh!

lieryan commented Apr 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bagel897 commented Apr 7, 2022

Uh oh!

bagel897 commented May 25, 2022

Uh oh!

lyz-code commented May 26, 2022

Uh oh!

bagel897 commented May 26, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bagel897 commented Apr 1, 2022 •

edited

Loading

bagel897 commented Apr 5, 2022 •

edited

Loading

lieryan commented Apr 7, 2022 •

edited

Loading

bagel897 commented May 26, 2022 •

edited

Loading