-
Notifications
You must be signed in to change notification settings - Fork 395
introduce sysregistriesv2 #429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I had a close look at how we can proceed. With the current implementation of the I see three options:
I am in favor of 2), as we add sanity checks, parsing of URL's, change from string to |
|
I am for you option 2, gives us a way forward, as long as we don't have to update all packages at the same time. |
|
yeah #2 is fine to me as well. |
b1a645d to
8b72dff
Compare
|
@rhatdan @baude The latest commit is a concrete suggestion. I did not add any unit tests yet as I want to wait for your opinions and for an agreement on the design. The // FindRegistries returns all registries matching the specified image reference.
// In case of an unqualified image reference, all registries that are configured
// for unqualified image search are returned. In case of a fully qualified
// image reference, the registry with the longest prefix is returned. An empty
// array is returned in case no matching registry is found.
func FindRegistries(imgRef reference.Reference, registries []Registry) ([]Registry, error) |
8b72dff to
6088795
Compare
|
I like the direction you are headed ... good work |
Out of the three alternatives, I prefer 2. as well. But I’d rather have a single implementation, not two that will inevitably drift and differ. Either provide them from a single package, share the underlying implementation in an |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments on the code, but the two major questions are: (This may have been already discussed elsewhere; feel free to point me to that, I may have missed that or forgotten.)
Does this need to be coordinated with the moby/moby PR introducing similar functionality? Should this me merged independently, or only after the two arrive on some shared consensus?
Is there any chance at all of sharing the file between the two? Would we rely on the atomic-registries conversion? Or not even that?
The API change is not the big worry, we can deal with that. This completely changes the format of the configuration file, and affects users. Is everyone OK with that? @rhatdan @baude
Or do we need to still accept the old input and transparently convert?
|
|
||
| // RegURL represents a Registry's URl. This abstraction is required to | ||
| // unmarshal the non-primitive and non-local url.URL type when loading the | ||
| // toml config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s awkward to expose this implementation detail to callers. Could this be a private type, used in a private clone of the Registry type, used only for the unmarshaling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
|
||
| // String returns host:port/path of the RegURL's URL. | ||
| func (r *RegURL) String() string { | ||
| return r.URL.Host + r.URL.Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s extremely surprising that this throws away the rest of r.URL. Is that necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Do the String methods even need to exist? It may be better to force the caller to think about formatting than being surprising about it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's currently used to match an image reference against registries, but this doesn't need to be public at all.
| URL RegURL `toml:"url"` | ||
| Blocked bool `toml:"blocked"` | ||
| Insecure bool `toml:"insecure"` | ||
| Search bool `toml:"unqualified-search"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean/do? Please add documentation to the field. (Others could benefit from that as well, but for the new ones it’s especially necessary to record the intent and consensus.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In v1 of the API, search registries are used to allow unqualified image pulling. When pulling an unqualified image, the image will effectively be appended to all specified search registries. The first hit wins. The Search field does exactly that. If a registry has this switch set to true, then the registry (and it's mirrors) will be used to pull unqualified images.
| Blocked bool `toml:"blocked"` | ||
| Insecure bool `toml:"insecure"` | ||
| Search bool `toml:"unqualified-search"` | ||
| Prefix string `toml:"prefix"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean/do? Please add documentation to the field.
(And about url = "foo"; prefix = "bar"? Is that allowed? Reasonable?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add documentation. Sorry, in case that's confusing. Some discussion happened in IRC.
This is something similar to the moby/moby PR. The Prefix effectively allows to translate one namespace into another. So url = "foo"; prefix = "bar"; would be allowed.
Notice that specifying a prefix is not mandatory.
| // image reference, the registry with the longest prefix is returned. An empty | ||
| // array is returned in case no matching registry is found. | ||
| func FindRegistries(imgRef reference.Reference, registries []Registry) ([]Registry, error) { | ||
| domain := reference.Domain(imgRef.(reference.Named)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the function crashes with a non-reference.Named, why not make the input parameter a reference.Named in the first place?
I’m not sure this is going to actually work, though; c/image/docker/reference.Reference is not the Fedora /usr/bin/docker-like docker/docker/reference package which implements unqualified names as a first-class concept, its the upstream-like one where a reference always implicitly includes a domain. ParseNormalizedNamed will add a docker.io host name; ParseNamed will refuse an unqualified name.
reference.Parse does make it possible to create a domain-less value, but that value would be dangerous to pass to any other user in c/image which expects normalization to happen. And it incorrectly parses notlocalhost/busybox as domain=notlocalhost, path=busybox, instead of domain=unknown, path=notlocalhost/busybox.
[The search semantics is really problematic for security: with it, and a pull fedora command, the user never knows what image will be downloaded.]
If this needs to exist at all (which, sadly, seems to be the case), the imgRef should probably just be a string; reference.Named would only be created by the result of combining the domain and the input string.
And maybe the public API should not combine the unqualified/qualified cases at all; AFAICT the caller needs to treat them differently anyway (for unqualified, combine the search registry and the input; for qualified, use the input as specified, and only use the flags), so the caller can just as well call two different public API endpoints here (GetUnqualifiedSearchRegistries/GetRegistryForReference or something like that); then, the qualified case could return a *Registry instead of an array, and the caller would not need to worry about what happens when there is >1 element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the detailed feedback; very helpful input. I agree that splitting it up into two functions is a good way to tackle the problem.
| } | ||
| } | ||
| if prefixLen != 0 { | ||
| regs = append(regs, reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Non-blocking: The regs variable seems ultimately unnecessary; it has either zero or one member, so a return here would do just as well.)
| for _, reg := range registries { | ||
| // default to the URL if Prefix isn't specified | ||
| if reg.Prefix == "" { | ||
| reg.Prefix = reg.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function modifies its input, that needs to be clear from its name and description.
| if mir.Prefix != "" { | ||
| logrus.Warnf("specified prefix for mirror '%s' has no effect", mir.String()) | ||
| } | ||
| if mir.Insecure != reg.Insecure { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really undesirable? A private mirror on a private network is exactly the kind of thing many (sure, irresponsibly too many) users run with self-signed certificates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's undesirable. I have such a thing running 24/7 🤣 But to avoid accidents I think that it's worth throwing a warning.
| return fmt.Errorf("mirror '%s' of registry '%s' cannot be mirrored", mir.String(), reg.String()) | ||
| } | ||
| if mir.Prefix != "" { | ||
| logrus.Warnf("specified prefix for mirror '%s' has no effect", mir.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner to add a separate Mirror type than to try to handle these inconsistencies in behavior after the fact. It really isn’t a recursive structure, so let’s not declare it like one.
Already this consistency check is costing about as many lines as reusing the Registry type has saved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
It would be ideal to go in the same direction and share the configuration at best, but I do not see any realistic chance. The moby/moby PR is blocked by upstream. There have been multiple attempts over the past years to implement registry mirroring but all have been silenced with blinding grenades. The current one came far, but I am losing hope it will be merged. The current status is to wait until upstream has an agreement on what they actually want. Hence, I suggest paving the way forward here. We use mirroring already in production (for Docker) and backport the upstream PR. As we're working on offering crio as a tech preview, having the necessary pieces upstream here would be perfect.
It's just my 2 cents but I would not accept the old input anymore. The old API is based entirely on strings, so a clean cut and clean separation of v1 and v2 of the API may just avoid confusion. |
a5ba20f to
742f597
Compare
|
@mtrmac @baude I just updated the PR, with the following changes:
That's basically it. One thing I am not very sure about is the |
Again, I don’t care so much about the API; that affects maybe 10 people maintaining the Golang programs that import this package. I care about the file format as an end-user interface; the worst case here is ignoring, or rejecting, configuration files managed by someone’s config management system, and bringing a few (thousand?) 100-node clusters down. The file format and the API are not the same thing. @rhatdan @baude This is probably up to you (or does someone else own the |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // registries. | ||
| type Mirror struct { | ||
| // The mirror's URL | ||
| URL tomlURL `toml:"url"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This use of tomlURL is still visible to consumers (in that they have to say Mirror.URL.url - and is the private url field even visible?
I was thinking something like
// All existing docstrings [this is the API provided to Golang callers]
type Mirror struct {
URL url.URL
Insecure bool
}
// [this is the raw format of the config file]
type tomlMirror struct {
URL tomlURL
Insecure bool
}unmarshal into tomlMirror, and then initialize a Mirror from it. (Maybe the toml package supports a nicer way to achieve the same thing.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look into this.
| URL tomlURL `toml:"url"` | ||
| // If true, pulling from the registry will be blocked | ||
| Blocked bool `toml:"blocked"` | ||
| // If true, certs verification will be skipped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… and HTTP (non-TLS) connections will be allowed.
(for both Insecure members.)
| Search bool `toml:"unqualified-search"` | ||
| // If it prefixes an image, the registry will be used for pulling | ||
| // (defaults to the URL without the URI scheme) | ||
| Prefix string `toml:"prefix"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the Prefix used only for matching, or stripped? E.g. with URL=https://example.com/foo and Prefix=example.net/bar, is example.net/bar/baz downloaded from https://example.com/foo/bar/baz or from https://example.com/foo/baz ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Prefix is used only for matching. If we have URL=https://example.com/foo and Prefix=example.net/bar, then it depends on what the user of the API decides to do. I would suggest downloading baz:latest from https://example.com/foo/baz:latest. If we want to have the other mapping, we are free to configure URL=https://example.com/foo/bar and Prefix=example.net/bar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then it depends on what the user of the API decides to do
I’d much prefer having it explicitly defined. Having two different applications on the same machine interpret the same configuration file differently would not do.
Your example shows stripping the prefix, and you’re right that that gives the user more freedom. So, just explicitly document that, please.
| // if no prefix is specified default to specified URL without the URI scheme | ||
| for i, reg := range config.Registries { | ||
| if reg.Prefix == "" { | ||
| prefix := strings.TrimPrefix(reg.URL.url.String(), reg.URL.url.Scheme) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn’t this strip only http from http://example.com, leaving prefix set to ://example.com?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I will check this. Once we have an agreement on the design, I will make sure to add unit tests.
Still, the “insecure” name is not perfect, e.g. it’s pretty non-descriptive in what it does. I’m not sure that there is any other short and descriptive alternative, and the “insecure” naming is at least somewhat traditional for distribution/docker registries by now. I don’t feel strongly about this, up to @rhatdan / @baude again. |
420981e to
8eeadab
Compare
|
@mtrmac I have updated the PR.
@rhatdan @baude PTAL. To continue, we need to agree on how an upgrade path from v1 to v2 should look like regarding backward compat, etc. |
Thanks; looks good to me.
At least for I don’t have a strong opinion about I’m not entirely sure about forcing the caller to keep a |
|
I'm cool with this. It's V2 so immediate impact is minimal. @rhatdan final approval by you? |
That’s a name of the Go package, not of the config file. This proposes a completely different format of |
8eeadab to
72d6fe5
Compare
|
@rhatdan I have just added another commit adding the backwards compatibility to a v1 config. The commit includes more unit tests. @mtrmac @baude @rhatdan If I see it correctly, the last "open" topic is the signature of the API functions: Should the config get loaded for each call? Should we operate on a |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! The backward compatibility eliminates my only major worry.
| regs[reg.Prefix] = up | ||
| return | ||
| } | ||
| reg.Mirrors = []Mirror{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems to be a better fit with the rest of the intialization in makeRegistry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| reg.URL = url | ||
| reg.Prefix = stripURIScheme(url) | ||
| return reg, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be simpler, roughly like:
regs := make(map[string]*Registry)
getRegistry := func(s string) (*Registry, error) { // Note: _pointer_ to a long-lived object
prefix = …
reg, exists := regs[prefix]
if !exists {
reg := &Registry{}
// initialize reg
regs[prefix] = reg
}
return reg, nil
}
…
for _, search := range config.V1Registries.Search.Registries {
reg, err := getRegistry(search)
if err != nil {
return []Registry{}, err
}
reg.Search = true
}notably all of the || updates go away, and we don’t need to worry about what parts of reg have been discarded in the exists case of addRegistry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your great review. This one made the code much simpler and more readable.
| for _, search := range config.V1Registries.Search.Registries { | ||
| reg, err := makeRegistry(search) | ||
| if err != nil { | ||
| return []Registry{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Non-blocking nit: error cases like this could be return nil, err throughout.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
|
|
||
| // backwards compatibility for v1 configs | ||
| v2Registries, err := getV1Registries(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn’t this be v1Registries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch :)
| assert.NotNil(t, reg) | ||
| } | ||
|
|
||
| func TestV1BackwardsCompatibility(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also preserve the original, ~unchanged system_registries_test.go as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extended the v2 ones a bit, which test more cases now, so I don't think we need that.
All good points, let’s leave the load+filtering separate then.
I don’t feel strongly about this. It would be a tiny bit simpler for the caller to have an opaque “state”, but e.g. dumping the full configuration for debugging would be easier with the full data available. Either is fine with me. |
40e911e to
45e2af9
Compare
a81beb1 to
a0856d5
Compare
|
The PR is updated including @mtrmac's latest comments. I stashed the changes into one commit with a proper description. The CI has some hiccups from skopeo at the moment. |
|
👍 Thanks! The code LGTM.
Sorry about that, I left the two repos inconsistent overnight. I have restarted the tests now. @rhatdan / @baude can you ACK the new format, please? Also, RFC on the transition plan. This introduces a new subpackage, meaning that any consumer can migrate at its leisure, but also nothing is pushing them to migrate. Alternatively we could replace the existing |
|
@mtrmac Wonderful, thanks a lot for your great reviews. I really like the outcome. My plan would be to update libpod soon to make use of the new format in order to support mirrors. I suggest to use this as a first user, see how things are going and once that's done and "known to work", we can port other tools. What do you think? |
|
LGTM |
|
@vrothberg Please rebase one last time and I’ll merge this. |
a0856d5 to
f759d78
Compare
|
@mtrmac I just rebased on the latest master. Thanks! |
|
(The test failure is almost certainly unrelated; I may not be able to debug it today I’m afraid.) |
|
/test all |
|
@vrothberg Could you just amend you PR and push again to retrigger tests. |
Introduce sysregistriesv2, which changes the format of the `registries.conf` TOML configuration. Instead of having different lists to specify search registries, blocked and insecure ones, all data is encapsulated into one registry type. The registry type allows to specify a list of mirrors, which can be used in the endpoint lookup to serve, for instance, as pull through caches for the associated registry. An example configuration may look as follows: ```toml [[registry]] url = "https://registry.com" prefix = "another-registry.com" [[registry.mirror]] url = "http://registry-mirror.com" insecure = true ``` The upper example shows the configuration for `https://registry.com`. The prefix is used for matching images, and to translate one namespace to another. If `prefix="example.com/bar"`, `url="https://example.com/foo/bar"` and we pull from `example.com/bar/myimage:latest`, the image can effectively be pulled from `example.com/foo/bar/myimage:latest`. If no prefix is specified, it defaults to the specified URL. Ease migration from sysregistries v1 to v2 by also loading configurations in the v1 TOML format. Throw an error in case a config tries to mix both formats. This allows a smoother migration for developers, maintainers and the user, who can switch to the new config format once all tools have been updated to v2. Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
f759d78 to
633422b
Compare
|
@rhatdan Done. |
|
CI seems to be stable now. @mtrmac, can you approve this PR? |
Introduce sysregistriesv2, which changes the format of the
registries.confTOML configuration. Instead of having different liststo specify search registries, blocked and insecure ones, all data is
encapsulated into one registry type. The registry type allows to
specify a list of mirrors, which can be used in the endpoint lookup to
serve, for instance, as pull through caches for the associated registry.
An example configuration may look as follows:
The upper example shows the configuration for
https://registry.com.The prefix is used for matching images, and to translate one namespace
to another. If
prefix="example.com/bar",url="https://example.com/foo/bar"and we pull from
example.com/bar/myimage:latest, the image willeffectively be pulled from
example.com/foo/bar/myimage:latest. If noprefix is specified, it defaults to the specified URL.
Ease migration from sysregistries v1 to v2 by also loading
configurations in the v1 TOML format. Throw an error in case a config
tries to mix both formats. This allows a smoother migration for
developers, maintainers and the user, who can switch to the new config
format once all tools have been updated to v2.
Signed-off-by: Valentin Rothberg vrothberg@suse.com
Closes: #397