(feature): add direct image registry client Unpacker implementation#145
Codecov Report
Attention:
Additional details and impacted files
@@ Coverage Diff @@
## main operator-framework/catalogd#145 +/- ##
===========================================
- Coverage 87.02% 49.72% -37.30%
===========================================
Files 3 6 +3
Lines 131 366 +235
===========================================
+ Hits 114 182 +68
- Misses 10 163 +153
- Partials 7 21 +14
3c0fb59 to e3ec751
818e465 to 3aae479
Unpacker implementation
// TODO: Add garbage collection to remove any unused
// images/SHAs that exist in the cache

// TODO: Make asynchronous
stevekuznetsov left a comment
We don't have one (yet) but I think it's useful to consider writing code for controllers as if there were a long-lived deployment you are on the pager for. Every commit that merges to the codebase deploys, and you need to keep in mind that branch cuts for future releases can happen at any point.
The implication here is that if we were to cut a release of catalogd with this PR, the server would cache image layers forever and never clean them up. When adding new features to a controller, prefer to handle (minimally) the full lifecycle.
There are a couple of layers of indirection in the codebase, so I don't know how the basic intuition translates to this case, but writing the full lifecycle code in a controller context is fairly straightforward.
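To make the cleanup concern concrete, here is a minimal sketch of what a first pass at cache garbage collection might look like. It assumes, hypothetically, that the cache is a directory with one top-level entry per unpacked catalog and that the caller knows which names are still in use; the names `staleEntries` and `collectGarbage` are illustrative and not taken from this PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// staleEntries returns the cached names that are no longer marked as in use.
func staleEntries(cached []string, inUse map[string]bool) []string {
	var stale []string
	for _, name := range cached {
		if !inUse[name] {
			stale = append(stale, name)
		}
	}
	return stale
}

// collectGarbage removes every top-level entry of cacheDir that is not in
// inUse and returns the names it removed. The one-entry-per-catalog layout
// is an assumption made for this sketch.
func collectGarbage(cacheDir string, inUse map[string]bool) ([]string, error) {
	entries, err := os.ReadDir(cacheDir)
	if err != nil {
		return nil, fmt.Errorf("reading cache dir %q: %w", cacheDir, err)
	}
	cached := make([]string, 0, len(entries))
	for _, e := range entries {
		cached = append(cached, e.Name())
	}
	stale := staleEntries(cached, inUse)
	for _, name := range stale {
		if err := os.RemoveAll(filepath.Join(cacheDir, name)); err != nil {
			return nil, fmt.Errorf("removing %q: %w", name, err)
		}
	}
	return stale, nil
}

func main() {
	dir, err := os.MkdirTemp("", "unpack-cache")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Two cached catalogs, only one of which is still referenced.
	for _, name := range []string{"catalog-a", "catalog-b"} {
		if err := os.MkdirAll(filepath.Join(dir, name), 0o755); err != nil {
			panic(err)
		}
	}
	removed, err := collectGarbage(dir, map[string]bool{"catalog-a": true})
	fmt.Println(removed, err)
}
```

In a controller, a function like this would presumably run on a timer or at the end of a reconcile, with the in-use set built from the catalogs that currently exist.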
Maybe I'm misunderstanding, but I thought that is what feature gates are used for? As far as I understand it, the feature gate should signal that a feature is minimally functional (IMO this is - it does what it needs to even if not optimized) but is still a work in progress, is bound to change, and could need optimizations. This feature gate should be disabled by default and therefore this implication shouldn't happen unless the feature gate is explicitly enabled. If I am misunderstanding I am happy to add the garbage collection logic. My intention here was to simply get the basic implementation in and then iterate from there.
I don't think it would be too complex to do a minimal implementation of garbage collection. As I mentioned above, I don't mind adding it if I misunderstood the message sent by sticking this functionality behind a feature gate and that we could get these optimization type things in iteratively.
IMO, it is reasonable to deliver half-baked features behind a disabled-by-default feature gate. It would be good to have a design that enumerates the plan for how the feature is delivered and what the proposed maturity process is for the feature gate. Is there a brief or RFC for this? Seems big enough to warrant one.
I'm less interested in Google Docs than making the development easier. Since lifecycle is the bread and butter of what controllers do, it's usually useful to tackle it (minimally) whole-hog so that you don't make more refactor work for yourself in the future on code you just wrote.
I don't disagree. I'd guess that we may not all agree what our definitions of "minimally" are though. That's where a high-level plan could help achieve alignment on expectations.
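The disabled-by-default gating discussed in this thread can be sketched roughly as follows. This is a hand-rolled illustration, not the feature-gate library catalogd actually uses; the gate name `UnpackImageRegistryClient` comes from this PR's linked issues, while the `FeatureGates` type and the returned implementation names are hypothetical.

```go
package main

import "fmt"

// FeatureGates maps a gate name to whether it is enabled.
// A gate that is absent from the map is treated as disabled.
type FeatureGates map[string]bool

// Enabled reports whether the named gate has been explicitly turned on.
func (g FeatureGates) Enabled(name string) bool {
	return g[name]
}

// unpackGate is the gate name referenced by the issues linked from this PR.
const unpackGate = "UnpackImageRegistryClient"

// selectUnpacker returns a label for the unpacker to use. The labels are
// placeholders for whatever implementations the controller wires up.
func selectUnpacker(gates FeatureGates) string {
	if gates.Enabled(unpackGate) {
		return "image-registry-client"
	}
	return "default"
}

func main() {
	fmt.Println(selectUnpacker(FeatureGates{}))
	fmt.Println(selectUnpacker(FeatureGates{unpackGate: true}))
}
```

Because a missing key reads as `false`, the new code path stays unreachable unless someone opts in, which is the property the "half-baked behind a gate" argument relies on.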
There is not |
1c1ba1e to e0c4b15
image-registry: ## Setup in-cluster image registry
	./test/tools/imageregistry/registry.sh
Have you seen https://github.com/tilt-dev/ctlptl?
Since we're already tilt-enabled, perhaps this is a reasonable tool to evaluate for our cluster setup purposes, especially since it seems to support setting up image registries that are simultaneously accessible to the local dev environment and the in-cluster processes.
Can we spend 30 minutes now (time-boxed) evaluating this to see if it would be a drop-in replacement?
If it is a good candidate, it would be fine to integrate in a follow-up.
I have seen that. I played around with doing it the kind way and wasn't able to get that working well (there seemed to be resolution issues from the direct client, and it couldn't find the registry when using either of the names mentioned in the docs). My understanding is that ctlptl pretty much does the steps mentioned in the kind docs for you and would, in theory, have the same results, so I didn't give it a shot.
That being said, I'm happy to time-box an evaluation of it. I'll circle back to investigating this after addressing other things, since this would be a follow-up anyway.
I gave this a shot and ran into the same issues I had with the approach documented by kind. It just can't seem to resolve the registry URL. Maybe this is just an issue on my machine though? I haven't tested it on my Mac, but I've had kind-related troubles in the past on my work laptop running Fedora, so 🤷
Okay, sounds good. Maybe make a tracker for this, even if it's just a way to mentally bookmark this apparent gap in the ecosystem for cluster+registry tooling. It seems like progress is being made, so perhaps we can revisit again in 3-6 months.
metadata:
  name: my-selfsigned-ca
  namespace: catalogd-e2e
spec:
  isCA: true
I'm a little confused here. Is this a CA or is this the docker registry's server certificate, or both?
I think what we need is just a self-signed server keypair, and then clients just need the server cert to trust. But maybe I'm missing something.
My understanding is that it is both. The image registry uses the tls.crt and tls.key entries in the generated secret, and we inject the ca.crt entry into the necessary locations so that clients trust the cert used by the image registry. I tried a few different approaches to no avail, but that could just be due to my lack of experience with cert-manager (and certificate management in general).
Leaving as is for now. This should be addressed by https://github.com/operator-framework/catalogd/issues/188
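For readers puzzling over the "CA or server certificate, or both" question, the following sketch shows the "both" case using only the Go standard library: one self-signed certificate marked as a CA is served by a TLS endpoint, and a client that trusts that same certificate can connect. It does not use cert-manager; the hostname `registry.test` and all function names are hypothetical, and the test server merely stands in for the registry.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// newSelfSignedCA creates one self-signed certificate that is marked as a CA
// and is also usable as a server certificate for dnsName.
func newSelfSignedCA(dnsName string) (tls.Certificate, *x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: dnsName},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
		DNSNames:              []string{dnsName},
	}
	// Template and parent are the same certificate, so it signs itself.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	parsed, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, parsed, nil
}

// poolWith builds a trust store containing exactly the given certificates.
func poolWith(certs ...*x509.Certificate) *x509.CertPool {
	pool := x509.NewCertPool()
	for _, c := range certs {
		pool.AddCert(c)
	}
	return pool
}

// fetchWithTrust starts a TLS test server presenting serverCert and issues one
// request from a client that trusts only the certificates in trusted.
func fetchWithTrust(serverCert tls.Certificate, trusted *x509.CertPool, serverName string) error {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	srv.TLS = &tls.Config{Certificates: []tls.Certificate{serverCert}}
	srv.StartTLS()
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: trusted, ServerName: serverName},
	}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	cert, parsed, err := newSelfSignedCA("registry.test")
	if err != nil {
		panic(err)
	}
	fmt.Println("client trusting the cert:", fetchWithTrust(cert, poolWith(parsed), "registry.test"))
	fmt.Println("client with empty trust store:", fetchWithTrust(cert, poolWith(), "registry.test"))
}
```

With a self-signed certificate, the content a client needs to trust is the certificate itself, which is consistent with the reviewer's point that a plain self-signed server key pair might be enough.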
spec:
  containers:
  - name: registry
    image: registry:2
This is coming from Docker Hub, which is rate-limited, and can cause flakes. Can we find a registry container in quay (or some other better image registry)?
We could try to use Project Quay as the image registry, but that would require a good bit more effort to investigate how to configure it properly. This image is Apache 2.0 licensed, so if we are concerned about the flakes, we could probably pull from Docker Hub, re-tag, and push to a quay repo?
Another alternative is writing our own simple registry in Go and building+loading that image onto the kind cluster.
I think registry:2 is fine for now. I'm not seeing it obviously mirrored into another registry. Perhaps a follow-up issue and/or comment in the YAML about this for now?
Sounds good. Just to try it out, I tested what it would take to write a super barebones registry in Go. I was able to do it with:

package main

import (
	"log"
	"net/http"
	"os"

	"github.com/google/go-containerregistry/pkg/registry"
)

func main() {
	// Without a TLS certificate configured, serve plain HTTP.
	// ListenAndServe only returns on failure, so this branch never falls through.
	certFile, certEnvVarExists := os.LookupEnv("REGISTRY_HTTP_TLS_CERTIFICATE")
	if !certEnvVarExists {
		if err := http.ListenAndServe(":5000", registry.New()); err != nil {
			log.Fatal(err)
		}
	}

	// A certificate without its key is a configuration error.
	keyFile, keyEnvVarExists := os.LookupEnv("REGISTRY_HTTP_TLS_KEY")
	if !keyEnvVarExists {
		log.Fatal("environment variable REGISTRY_HTTP_TLS_CERTIFICATE specified but REGISTRY_HTTP_TLS_KEY was not")
	}

	// Serve HTTPS with the configured key pair and log requests.
	if err := http.ListenAndServeTLS(":5000", certFile, keyFile, registry.New(registry.Logger(log.Default()))); err != nil {
		log.Fatal(err)
	}
}

I built and loaded an image with this for e2e, and it passed when run locally. I won't push this up because it was a quick PoC, but I thought it would be worth sharing.
Follow up issue: https://github.com/operator-framework/catalogd/issues/189
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
f409103 to 9fea4bc
@@ -0,0 +1,97 @@
package main
Not sure this really matters, but I'd suggest (for a follow-up) moving the GC function into a separate package so that we can avoid a unit test in the main package.
joelanford left a comment
Left a few more comments about potential follow-ups. But LGTM.
It seems that the selected library does not support mirroring. What is the plan in this regard?
This is a known (and out-of-scope for now) feature that needs to be implemented. The RFC calls this out and says that we'll cover it in a future RFC.
	AuthNamespace string
}

const ConfigDirLabel = "operators.operatorframework.io.index.configs.v1"
Can probably switch to using ConfigsLocationLabel from here without creating a new constant.
	return fmt.Errorf("error parsing remote image %q config file: %w", imgRef.Name(), err)
}

dirToUnpack, ok := cfgFile.Config.Labels[ConfigDirLabel]
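The label lookup in the hunk above can be illustrated in isolation. This is a minimal sketch that works on a plain map rather than a real image config; the function name `configsDir` and the error returned for a missing label are assumptions, since the hunk does not show what the PR does when `ok` is false.

```go
package main

import "fmt"

// configDirLabel is the label value shown in the diff above.
const configDirLabel = "operators.operatorframework.io.index.configs.v1"

// configsDir returns the directory named by the configs label, or an error
// when the image does not carry the label. The error path is hypothetical.
func configsDir(imageName string, labels map[string]string) (string, error) {
	dir, ok := labels[configDirLabel]
	if !ok {
		return "", fmt.Errorf("image %q is missing required label %q", imageName, configDirLabel)
	}
	return dir, nil
}

func main() {
	labels := map[string]string{configDirLabel: "/configs"}
	fmt.Println(configsDir("example.com/catalog:latest", labels))
	fmt.Println(configsDir("example.com/not-a-catalog:latest", nil))
}
```

Keeping the lookup in a small function like this also makes it easy to swap in a shared constant, as suggested in the review comment about ConfigsLocationLabel.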
anik120 left a comment
LGTM
The most important follow-up to capture, for me, would be reassessing the two interfaces, Unpack and Store, and the use of a cache in Unpack in that context.
containers:
- name: manager
  volumeMounts:
  - mountPath: /etc/ssl/certs/
I'd prefer to have this context captured as a comment somewhere here (I'm not sure what the mechanics for using comments in kustomization files are). Issues aren't available to a reader of the code; comments are, even if the comment is just pointing to the GitHub issue.
Description
Unpacker implementations with one that communicates directly with an image registry
Unpacker implementation
Secrets in the catalogd-system namespace. This is so we can pull images that require authentication by using a pull secret
Motivation
UnpackImageRegistryClient feature-gate is enabled #163
Unpack implementation #182