repo-updater: Use URI field as fallback for repos.GetByName #3922
keegancsmith merged 9 commits into master
@chrismwendt see my comment on go-langserver in the description. Does that make sense? |
External tools (such as the browser extension, language extensions, editor extensions) compute the Sourcegraph repository name using the repository's hostname and path. For example `github.com/gorilla/mux`. However, if an administrator configures an external service with a non-default `repositoryPathPattern`, the name of a repository on Sourcegraph will not match the externally computed name.

This commit stores the default name in the unused URI field on the repo table. When the frontend looks up a repository by name, it will now fall back to the default name (the URI column) to find the repository. This makes most external tools just work. Additionally, the frontend has redirect logic: if a requested repository path does not equal the repository's name, the request is redirected.

There are cases where an extension relies on the name being the default name. For example, the Go language server uses the name as the default root Go package path. To fix this use-case we will need to extend the extension to pass on the `URI` field of the repo. However, things like fetching the file contents will still work, so if the root path is found in other ways (such as parsing a config file or an import package doc) it will still work.

The initial implementation of this feature returned errors indicating the correct name to use. However, that would require every client to handle that case. Instead, this implementation works as expected, and in a few cases requires some clients to support this model. Note: the clients that need updating are no more broken than before, and generally work better now.

The URI field was picked since this is what its original value was. Additionally, it is an unused field. I considered giving it a new name, but preferred consistency between the name in the database and the name in the code.
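To make the lookup order concrete, here is a minimal sketch of the fallback and redirect behaviour described above. It is an illustration only, not the code in this PR: the `Repo` struct, the in-memory `Store`, and the `RedirectTarget` helper are hypothetical stand-ins for the repo table and the frontend logic, and the repository name `gorilla/mux` assumes an admin configured a pattern that drops the hostname.

```go
package main

import (
	"errors"
	"fmt"
)

// Repo is a hypothetical, trimmed-down repository record.
type Repo struct {
	Name string // name on Sourcegraph, shaped by repositoryPathPattern
	URI  string // default name: code host hostname + path
}

// ErrRepoNotFound is returned when neither Name nor URI matches.
var ErrRepoNotFound = errors.New("repo not found")

// Store is a hypothetical in-memory stand-in for the repo table.
type Store struct {
	repos []Repo
}

// GetByName tries an exact match on Name first and only then
// falls back to matching the URI column.
func (s *Store) GetByName(name string) (*Repo, error) {
	for i := range s.repos {
		if s.repos[i].Name == name {
			return &s.repos[i], nil
		}
	}
	for i := range s.repos {
		if s.repos[i].URI == name {
			return &s.repos[i], nil
		}
	}
	return nil, ErrRepoNotFound
}

// RedirectTarget reports whether a request for `requested` should be
// redirected, and if so, to which repository name.
func RedirectTarget(requested string, r *Repo) (string, bool) {
	if requested != r.Name {
		return r.Name, true
	}
	return "", false
}

func main() {
	s := &Store{repos: []Repo{{Name: "gorilla/mux", URI: "github.com/gorilla/mux"}}}

	// An external tool computes the default name from hostname + path.
	r, err := s.GetByName("github.com/gorilla/mux")
	if err != nil {
		panic(err)
	}
	target, redirect := RedirectTarget("github.com/gorilla/mux", r)
	fmt.Println(r.Name, target, redirect)
}
```

Checking `Name` before `URI` means that a repository whose configured name happens to equal another repository's default name still wins the lookup; whether the real implementation resolves such collisions the same way is an assumption here.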
mrnugget
left a comment
Straightforward and easy to understand solution, I like it 👍
That being said, this is easy to understand on a technical level, but I had some trouble with the naming. Am I correct in assuming that the URI field contains something like the "original path" of a repository? (I also had to look up again whether URIs contain a scheme or not, and apparently I'm not the only one confused by the RFC 😄 )
I left two suggestions for changes and I also agree with @tsenart that we should have some tests for this, especially since tests would showcase what a URI looks like for different code hosts.
@mrnugget: What I looked at was https://github.com/sourcegraph/sourcegraph/blob/master/cmd/repo-updater/repos/testdata/sources/AWSCODECOMMIT/yielded-repos-have-authenticated-CloneURLs.yaml#L51 and https://github.com/sourcegraph/sourcegraph/blob/1c327a778408796499b6c89d11ea4d36b1fe4eca/pkg/conf/reposource/awscodecommit.go#L29:6 The names are bare names as far as I can see, without any hostname (e.g. stripe-go, not github.com/stripe/stripe-go) |
URI is indeed a bad name for this. However, we are using it for convenience since our repo table already had this column (which was also used incorrectly). I somewhat explained this in the description. I'm happy to add a DB migration adding a column and giving it a better name if you think that is appropriate. How about |
After reading up on URIs vs. URLs again I actually thought that URI is the correct name for it, in the sense of "this is the unique resource identifier, according to the code host".
I think "canonical name" is a great name. Does it need a migration? I don't know, you decide :) As a newcomer to the codebase and somewhat to the domain knowledge encoded in repo-updater I'm hesitant to decide between "code could be clearer" and "I just don't know the whole thing yet" |
I think that URI is a good name for what it should represent. |
(keegancsmith) Cleanup bitbucket code such that we don't need to
This comment was generated by todo based on a
The Go extension or go-langserver aren't broken because of this, are they? Currently, both the Go extension and go-langserver assume the Sourcegraph repository name is the same as the host+path of the clone URL (and go-langserver gets file contents from the Sourcegraph raw API). It doesn't seem like that use-case is supported. When you say "To fix this", does that mean "in order to support this in the future"? |
Less broken than before. We have always supported repositoryPathPattern, but if an admin used it the extensions would often break. Now they will work in more cases.
Yeah, to support this in future. Previously, if an admin used a different repositoryPathPattern, the assumption in go-langserver would fail. I am suggesting we can now always send host+path as an init option, which can then be used for root package path detection. Note: Before, things like the Raw API would have worked. It would have failed, though, to fetch dependencies via the Raw API, since the host+path lookup would have failed against Sourcegraph. With this PR, it would work though! |
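The init option suggestion could look roughly like the sketch below. This is purely illustrative: `InitOptions`, its `RepoURI` field, and `RootImportPath` are hypothetical names, not go-langserver's actual initialization options. The idea is only that the server prefers the host+path value when the client sends it and otherwise keeps today's behaviour of using the repository name.

```go
package main

import "fmt"

// InitOptions is a hypothetical set of initialization options
// a client could send to the language server.
type InitOptions struct {
	// RepoURI is the default host+path name of the repository,
	// e.g. "github.com/gorilla/mux". Empty if the client did not send it.
	RepoURI string
}

// RootImportPath picks the root Go package path: the host+path value
// from the init options when present, otherwise the repository name.
func RootImportPath(repoName string, opts InitOptions) string {
	if opts.RepoURI != "" {
		return opts.RepoURI
	}
	return repoName
}

func main() {
	// The admin-configured name no longer matches the import path,
	// so the URI from the init options is used instead.
	fmt.Println(RootImportPath("gorilla/mux", InitOptions{RepoURI: "github.com/gorilla/mux"}))
}
```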
Sweet ✨
Currently, these are the
Are you suggesting changing that? Coincidentally, I'm changing this in order to make go-langserver more LSP-conformant in sourcegraph/go-langserver#369 (but I'm also trying one other approach, PR pending). Want to chat about this?
Sweet again 😄 |
FYI when we added those |
Test plan: Tested locally. Once on dogfood we will do plenty of follow-up testing to ensure extensions and other integrations work.
Part of https://github.com/sourcegraph/sourcegraph/issues/462