Skip to content

Conversation

@owtaylor
Copy link
Contributor

org.opencontainers.image.ref.name values are allowed to be much more complex
than the code allowed for. Change the validation to match the spec. Since
ref-names can contain a :, split from the path on the first :, not the
last :. (Path containing a : were already considered invalid.)

fixes #294

@runcom
Copy link
Member

runcom commented Aug 11, 2017

Pretty much the same we're going to do here #318 (comment)

I'm fine merging this and have #318 rebased.

@runcom
Copy link
Member

runcom commented Aug 11, 2017

@mtrmac wdyt?

@owtaylor owtaylor force-pushed the oci-ref-name-validation branch 2 times, most recently from 90087b5 to 46b313d Compare August 11, 2017 13:33
@owtaylor
Copy link
Contributor Author

Will fix up the tests and update

org.opencontainers.image.ref.name values are allowed to be much more complex
than the code allowed for. Change the validation to match the spec. Since
ref-names can contain a :, split from the path on the first :, not the
last :. (Path containing a : were already considered invalid.)

fixes containers#294

Signed-off-by: Owen W. Taylor <otaylor@fishsoup.net>
@owtaylor owtaylor force-pushed the oci-ref-name-validation branch from 46b313d to 98f428e Compare August 11, 2017 15:35
@mtrmac
Copy link
Collaborator

mtrmac commented Aug 12, 2017

ACK, though it may be easier to wait for #318 instead of having Urvashi rebase (and e.g. re-add the forbidden colon in ValidatePolicyConfigurationScopes, due to the identity changes in her PR). @runcom I’ll leave the timing/coordination decision up to you if you don’t mind.

@owtaylor
Copy link
Contributor Author

In effect, this is really the same as #318 - #318 looks a lot more complicated because it renames 'tag' to 'image' and changes some details of the semantics, but the basic changes of:

A) Validating ref-names against the image spec
B) Splitting the scope on the first colon rather than the last colon

are the same. So I don't really think it makes sense to land both of them. If the details of #318 can be improved a bit, I'm happy if that is landed instead of this.

@cyphar
Copy link
Contributor

cyphar commented Aug 12, 2017

The problem with this is that now we can't support directories that have : in them. Really, we should be using URI-like #partial tags for tags so that the URI semantics make more sense (and we could use the native Go URI parsing for them).

@owtaylor
Copy link
Contributor Author

owtaylor commented Aug 12, 2017

That definitely is a good point. Paths with ':' were previously in a gray zone - if you added an explicit tag to them, writing "oci:$path:latest" rather than "oci:$path", then they were safe, but otherwise you could get surprised. And they were disallowed entirely in policy configuration scopes. With this patch (or the one in #318), they simply don't work at all, with no workarounds.

The nice thing about URI syntax is that arbitrary path would be allowed - the bad thing is that you need URI encoding to generate the reference for arbitrary paths - if your path has #?%^\[]() or non-ascii, it's going to need to be percent-encoded. I can imagine that a lot of users would get this wrong - just write "oci:$path#$tag" in a shell script or concatenate strings in some other language. And how do you even get it right in a shell script?

# is also not command line friendly since it requires quoting all OCI references passed on the command line so that # isn't interpreted as a comment.

An alternate to URI syntax would be using :: as the delimiter between path and ref-name (:: is not allowed in a ref-name). Then either leave paths with :: in the same gray-zone that paths with : were in before, or try to improve things by allowing <path>:: and treating it as a path without a ref-name. It's still necessary to disallow paths with '::' in them for policy configuration, but "oci:$path::" would be a safe way to build a reference without a ref-name - it would even be possible to require that a reference has a :: in it to catch faulty code ahead of time.

@cyphar
Copy link
Contributor

cyphar commented Aug 12, 2017

# is also not command line friendly since it requires quoting all OCI references passed on the command line so that # isn't interpreted as a comment.

That's actually not true, # in the middle of a "word" doesn't get evaluated as a comment. It needs to have a space in front of it. Though in ZSH it does get evaluated quite weirdly (I think it does some weird matching stuff).

An alternate to URI syntax would be using :: as the delimiter between path and ref-name

That might work. My reason for suggesting URI syntax is because : is a misuse of URI-like syntax (a mistake that Docker has helped perpetuate) and semantically that would be the best choice. But you're quite right that it makes path encoding a massive pain (though I would argue people are more likely to have : in a pathname than #).

@owtaylor
Copy link
Contributor Author

That's actually not true, # in the middle of a "word" doesn't get evaluated as a comment. It needs to have a space in front of it.

Huh - a feature of Bourne shell syntax that I never noticed in 20+ years... :-) You could use # without going full-on-URI-parsing, but making things look more like URI's, without parsing them as URIs' is probably not an improvement - at least oci:/foo/bar::baz doesn't look like an URI.

@mtrmac
Copy link
Collaborator

mtrmac commented Aug 26, 2017

The problem with this is that now we can't support directories that have : in them.

<rant>

Yeah, in-band signaling (special semantics characters inside a string) is always painful, if not always a bad idea.

Sometimes I really wish the UNIX ”Write programs to handle text streams, because that is a universal interface.” thing was recognized to be the quick hack suitable for 16-bit machines that it is, just like silently truncating lines longer than 80 characters, instead of some wise philosophy; then we would have no text formatting and parsing, and everything were well-typed data.

But that would force us to write GUIs with folder selection for directories, and the like, instead of CLIs; or force users to use well-typed programming languages; neither is practically possible given the number of programmers working on this, and the existing ecosystems the software needs to work with.

</rant>


Guessing without any data, most users of the text format will probably not be that careful to read the documentation in full detail, to always perfectly escape, etc.; so the practical effect is that the separator character/string is, for most users, forbidden, and the escape mechanisms, if any, will be used by few users—it might even be an unwelcome surprise to the naive shell users just concatenating strings.

Also, every image identity, or at least its namespaces, need to be representable in the policy configuration, and that policy configuration should not be confusing to users; so it’s not just a matter of allowing an API call with arbitrary inputs when the text format is not used for input, we need to be able to format even the API-created references in an unambiguous and well-readable way for the policy. URI encoding in the policy is… possible, but it would be nice to be able to do without.

So I’m leaning towards the plain : syntax, exactly because the container ecosystem, and containers/image, are already using it other places, and hoping that we will never have to support completely arbitrary paths. (See also some users of containers/image trying to parse the same string as transport:name and a name within some hard-coded default transport, and using whichever works; that’s, strictly speaking, completely ambiguous—but convenient enough that such code is being written.)

But this is a quite weakly held opinion; I might well be wrong and a critical user of :-containing paths (or #-containing paths, or whatever) might turn up tomorrow. Ultimately, this is a bet on OCI being new, so deployments of OCI hopefully can alter their deployments to work around whatever text restrictions we place—but only a guess, not something we can know for sure.

@owtaylor
Copy link
Contributor Author

If you ever want to support windows, then prohibiting : in paths is clearly a problem. But beyond that, I agree that users should be able to mostly work any issues with : in paths.

And I agree that users aren't going to read documentation carefully, so if there's a normal way to do it that mostly works (concatenate strings), and an extra hoop that is safe for all paths (escape, add a trailing separator) - you don't gain much ecosystem robustness.

Requiring a trailing separator for a ref_name-less reference - oci:<path>:: or oci:<path># (*) does add robustness, but at an expense in familiarity that may not be worth it.

(*) a separator that can't appear in a ref-name - you still have to be able to break oci:<path><separator><ref_name> apart reliably.

@mtrmac
Copy link
Collaborator

mtrmac commented Aug 28, 2017

(Shouldn’t this conversation happen mostly in #318?)

If you ever want to support windows, then prohibiting : in paths is clearly a problem.

Good point—but, actually, using Windows-reserved characters is somewhat desirable; UNIX allows almost anything in paths, but Windows does have a few reserved characters, and those are pretty unlikely to appear in the NAS paths and other kinds of pre-determined paths which can’t be changed. Of the Windows-reserved characters, : would be one of the more painful ones to use (it can only happen in a specific position, not in arbitrary file names, but oci:t:something could reasonably be both a directory ./t and something on the T: drive).

Requiring a trailing separator for a ref_name-less reference - oci:<path>:: or oci:<path># (*) does add robustness, but at an expense in familiarity that may not be worth it.

Yeah… OTOH #318, and #155 , are moving oci: more towards the repository-like use cases, so having the ”read the only image, from a repo, however it is named” case use a bit unusual syntax might be acceptable.

@mtrmac
Copy link
Collaborator

mtrmac commented Aug 31, 2017

This should have been fixed in #318; @owtaylor If I’m incorrect, please reopen.

@mtrmac mtrmac closed this Aug 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Valid oci reference names disallowed

4 participants