Add a generic abstraction for filesystem based stores #3364
fahedouch merged 3 commits into containerd:main
    if err != nil {
        log.L.Warn(err)
    } else {
        volEnts, err := volStore.List(false)
List() is expensive.
Replaced with the much lighter Count.
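To illustrate why a count is lighter than a full listing, here is a rough sketch. It assumes a hypothetical layout where each volume is a directory holding a `volume.json` file; `listVolumes`, `countVolumes`, `demoCount` and the layout itself are made up for this example and are not the code from this PR.

```go
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// volume is a hypothetical record used only for this sketch.
type volume struct {
	Name string `json:"name"`
}

// listVolumes reads and decodes every entry: one file read plus one JSON
// decode per volume.
func listVolumes(root string) ([]volume, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	vols := make([]volume, 0, len(entries))
	for _, e := range entries {
		data, err := os.ReadFile(filepath.Join(root, e.Name(), "volume.json"))
		if err != nil {
			return nil, err
		}
		var v volume
		if err := json.Unmarshal(data, &v); err != nil {
			return nil, err
		}
		vols = append(vols, v)
	}
	return vols, nil
}

// countVolumes only reads the directory listing; no volume file is opened.
func countVolumes(root string) (int, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// demoCount creates two throwaway volumes in a temp dir and returns the
// number of volumes seen by listVolumes and by countVolumes.
func demoCount() (int, int, error) {
	root, err := os.MkdirTemp("", "volumes")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(root)
	for _, name := range []string{"a", "b"} {
		dir := filepath.Join(root, name)
		if err := os.MkdirAll(dir, 0o700); err != nil {
			return 0, 0, err
		}
		data, err := json.Marshal(volume{Name: name})
		if err != nil {
			return 0, 0, err
		}
		if err := os.WriteFile(filepath.Join(dir, "volume.json"), data, 0o600); err != nil {
			return 0, 0, err
		}
	}
	vols, err := listVolumes(root)
	if err != nil {
		return 0, 0, err
	}
	n, err := countVolumes(root)
	return len(vols), n, err
}
```

Both functions agree on the number of volumes, but only the first one pays for reading and decoding each file.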
        return false, volGetErr
    }
    // FIXME: this is racy. See note in up_volume.go
    options.VolumeExists = volStore.Exists
There is now an Exists method, much cheaper than Get.
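As a minimal sketch of why an existence check can be cheaper than a full read, assuming entries map to paths on disk: a single stat call is enough, and nothing is read or decoded. The `exists` helper below is hypothetical and only illustrates the idea; it is not the method added by this PR.

```go
package main

import (
	"errors"
	"io/fs"
	"os"
)

// exists reports whether path is present, using only a stat call.
// A missing entry is a normal "false" answer, not an error.
func exists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	// Anything else (permissions, I/O failure) is reported to the caller.
	return false, err
}
```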
    if err != nil {
        return err
    }
    err = volStore.Lock()
Replacing all of that with a volStore.Prune(func) that does locking.
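The shape of that change can be sketched as follows. This is an in-memory stand-in where a `sync.Mutex` plays the role of the filesystem lock; `memStore` and this `Prune` signature are assumptions made for the example, not the actual volume store API.

```go
package main

import (
	"sort"
	"sync"
)

// memStore is a hypothetical in-memory stand-in for the volume store.
type memStore struct {
	mu      sync.Mutex
	entries map[string][]byte
}

// Prune removes every entry for which shouldRemove returns true and returns
// the removed names in sorted order. The lock is held for the whole
// operation, so callers no longer lock and unlock by hand.
func (s *memStore) Prune(shouldRemove func(name string) bool) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var removed []string
	for name := range s.entries {
		if shouldRemove(name) {
			delete(s.entries, name) // deleting while ranging over a map is safe in Go
			removed = append(removed, name)
		}
	}
	sort.Strings(removed)
	return removed
}
```

The caller only supplies the decision function; acquiring and releasing the lock stays inside the store.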
    @@ -1,41 +0,0 @@
    /*
Moved inside the store.
These restrictions stem from the filesystem implementation; conceptually, they are not restrictions we necessarily want to apply to identifiers.
Put otherwise: they are orthogonal concerns.
    @@ -0,0 +1,37 @@
    /*
This replaces the whole uber go mock thing for mountutil.
    copyFileContent("/etc/resolv.conf", resolvConfPath)

    etcHostsPath, err := hostsstore.AllocHostsFile(dataStore, m.globalOptions.Namespace, containerID)
    hs, err := hostsstore.New(dataStore, m.globalOptions.Namespace)
Write only once, instead of allocating first, then writing.
        return fmt.Errorf("identifier %q must match pattern %q: %w", s, AllowedIdentfierChars, errdefs.ErrInvalidArgument)
    }

    if err := validatePlatformSpecific(s); err != nil {
Platform validation of identifiers is a filesystem concern.
These are moved into the store.Store, as they are orthogonal concerns.
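As a rough illustration of what "a filesystem concern" means here, the sketch below validates a single path component rather than an identifier. `validatePathComponent` and its rules are assumptions made for this example, not the checks implemented in the store; for simplicity it applies the Windows device-name rule on every platform and only covers a subset of the reserved names.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedOnWindows is a subset of the device names Windows refuses as file names.
var reservedOnWindows = map[string]bool{"CON": true, "PRN": true, "AUX": true, "NUL": true}

// validatePathComponent rejects keys that cannot safely become a single
// file or directory name under the store root.
func validatePathComponent(s string) error {
	switch {
	case s == "" || s == "." || s == "..":
		return fmt.Errorf("invalid path component %q", s)
	case strings.ContainsAny(s, `/\`):
		return fmt.Errorf("path component %q must not contain a separator", s)
	case reservedOnWindows[strings.ToUpper(s)]:
		return fmt.Errorf("path component %q is a reserved device name", s)
	}
	return nil
}
```

None of these rules says anything about what a good identifier is; they only protect the on-disk layout, which is why they belong next to the code that touches the disk.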
Rebase broke some tests.
CI is green.
Rebased. Pending green CI.
My bad. Thanks for catching it. (CI failing on Hub being 500)
Big UP for this PR 👍, I have two points to raise here:
Thanks a lot @fahedouch, ack on splitting into more commits.
Signed-off-by: apostasie <spam_blackhole@farcloser.world>
Signed-off-by: apostasie <spam_blackhole@farcloser.world>
Signed-off-by: apostasie <spam_blackhole@farcloser.world>
Although our codebase uses flock when we manipulate the filesystem (normally...), we have had a relatively significant number of issues in the past with our various "file stores". They typically fall into the following categories:
Issues arising from these are generally heinous to debug: they usually manifest in a flaky way (they may only happen in high-concurrency situations or under pressure), or they bubble up in different places, making seemingly unrelated operations fail in cryptic ways.
Truth is, a concurrency-safe and atomic filesystem storage system is not trivial to achieve, even a very simple one like ours. Operating system crashes or restarts, and interrupted writes in general, are often overlooked as something that has to be thought about.
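To make the interrupted-write concern concrete, here is a minimal sketch of one common technique: write to a temporary file in the same directory, sync it, then rename it over the destination. `writeAtomic` and `demoWrite` are hypothetical helpers for this example, not the implementation from this PR, and the sketch leaves out locking and syncing the parent directory.

```go
package main

import (
	"os"
	"path/filepath"
)

// writeAtomic writes data to a temp file next to path and renames it into
// place, so a reader sees either the old content or the new content, never a
// partially written file.
func writeAtomic(path string, data []byte) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temp file if any later step fails.
	defer func() {
		if err != nil {
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to stable storage before the rename makes the file visible.
	if err = tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demoWrite overwrites the same file twice in a temp dir and returns what a
// reader sees afterwards.
func demoWrite() (string, error) {
	dir, err := os.MkdirTemp("", "store")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "hosts.json")
	for _, content := range []string{"first", "second"} {
		if err := writeAtomic(path, []byte(content)); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(path)
	return string(data), err
}
```

The temp file lives in the destination directory on purpose: a rename is only expected to be atomic within the same filesystem.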
While we have fixed a number of these issues, there are clearly more lingering, and we are bound to produce even more in the future short of having a stronger basis.
I do not think it is fair to assume that everyone will be very mindful and intentional with their locking, or that the filesystem is a good enough, "safe" abstraction...
Furthermore, and even when done right, these things are inherently repetitive (mutex / filesystem lock / path sanitization / atomic write) and we now have enough of these "filestores" to justify having a shared base:
The first commit of this PR introduces an interface and a reference (filesystem) implementation of a reusable Store component that every "store" can leverage and that provides:
Structs that leverage it are thus freed from having to deal with mutexes / file locks, or underlying implementations (e.g. the filesystem). They can rely on a simple API that is responsible for just that, and can focus solely on the abstraction they have to provide for their specific data.
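The division of labor described above can be sketched like this. Everything here is an assumption made for illustration: the `Store` interface below is a cut-down hypothetical one, `memStore` is an in-memory stand-in for the filesystem implementation, and `nameStore` is an invented consumer. None of these are the types from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical, minimal version of the shared abstraction.
// Exists, Get and Set are meant to be called inside WithLock.
type Store interface {
	WithLock(fn func() error) error
	Exists(key string) (bool, error)
	Get(key string) ([]byte, error)
	Set(key string, data []byte) error
}

// memStore is an in-memory stand-in; a mutex replaces the file lock.
type memStore struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMemStore() *memStore { return &memStore{data: map[string][]byte{}} }

func (m *memStore) WithLock(fn func() error) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	return fn()
}

func (m *memStore) Exists(key string) (bool, error) {
	_, ok := m.data[key]
	return ok, nil
}

func (m *memStore) Get(key string) ([]byte, error) {
	v, ok := m.data[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return append([]byte(nil), v...), nil // copy so callers cannot mutate stored data
}

func (m *memStore) Set(key string, data []byte) error {
	m.data[key] = append([]byte(nil), data...)
	return nil
}

// nameStore is a hypothetical consumer: it knows its own rules (a name maps
// to one ID) and nothing about mutexes, file locks, or paths.
type nameStore struct{ backend Store }

// Acquire reserves name for id, failing if the name is already taken. The
// check and the write happen under a single lock, so they cannot interleave
// with another Acquire.
func (n nameStore) Acquire(name, id string) error {
	return n.backend.WithLock(func() error {
		taken, err := n.backend.Exists(name)
		if err != nil {
			return err
		}
		if taken {
			return fmt.Errorf("name %q is already in use", name)
		}
		return n.backend.Set(name, []byte(id))
	})
}
```

Swapping `memStore` for a filesystem-backed implementation would not change `nameStore` at all, which is the point of the abstraction.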
Hopefully, if things go our way, we should never again see a bug report about hosed json files / broken systems on restart, or flakiness involving stores...
The second commit is a mere cleanup of the (useless) mock for the volume store
The third commit is a refactor of our various stores to leverage that:
Incidentally, the following known issues are getting fixed or partially fixed:
go.uber.org/mock#3325
Finally, I do appreciate this is a far-reaching PR, as it touches code in the critical path of basically everything.
To a large extent, though, this is mere refactoring and literally no novelty.