use more FDs, avoid race conditions on active fs #4043
ThomasWaldmann merged 12 commits into borgbackup:master
Codecov Report

Coverage diff, master vs. #4043:

| Metric | master | #4043 | +/- |
|---|---|---|---|
| Coverage | 84.29% | 84.39% | +0.1% |
| Files | 37 | 37 | |
| Lines | 9430 | 9467 | +37 |
| Branches | 1572 | 1573 | +1 |
| Hits | 7949 | 7990 | +41 |
| Misses | 1036 | 1034 | -2 |
| Partials | 445 | 443 | -2 |
this is what a borg create (py36) with the code of this PR currently looks like:

same with py37 (using the new fd-based scandir()):
the st param was only given at the root paths of the recursion, so we can just drop it and make the code simpler.
always open the file and then do all operations with the fd:
- fstat
- read
- get xattrs, acls, bsdflags

also:
- add and use OsOpen context manager
- add O_NONBLOCK, O_NOFOLLOW, O_NOCTTY (inspired by gnu tar)
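
A minimal sketch of the "open once, then work on the fd" pattern described above. The `open_fd` helper is a hypothetical stand-in for the PR's OsOpen context manager (whose real signature is not shown here), and `read_regular_file` is an illustrative name, not borg code. The flags are POSIX-only.

```python
import os
import stat
from contextlib import contextmanager

# Flags named in the commit message (POSIX-only).
FLAGS = os.O_RDONLY | os.O_NONBLOCK | os.O_NOFOLLOW | os.O_NOCTTY


@contextmanager
def open_fd(path, flags):
    """Hypothetical stand-in for OsOpen: yield an fd and always close it."""
    fd = os.open(path, flags)
    try:
        yield fd
    finally:
        os.close(fd)


def read_regular_file(path):
    """Open once, then fstat and read through the same fd."""
    with open_fd(path, FLAGS) as fd:
        st = os.fstat(fd)  # describes exactly the object we opened
        if not stat.S_ISREG(st.st_mode):
            raise ValueError("not a regular file: %r" % path)
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return st, b"".join(chunks)
```

Because fstat and read both go through the same fd, they refer to the same fs object even if the path is renamed or replaced in the meantime.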
we'll add/remove some args soon, so having many positional args would just be bad.
races via changing path components can be avoided by opening the parent directory and using the parent_fd + file_name combination with *at-style functions (openat, fstatat, ...) to access the directory's contents.
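
In Python, the *at-style calls are reached through the `dir_fd` parameter of `os.stat` and `os.open`. The following is a rough sketch of the parent_fd + file_name idea; `read_child` is a hypothetical helper for illustration, not a function from this PR.

```python
import os


def read_child(parent_fd, name):
    """Stat and read `name` relative to an already opened parent directory.

    Only the final component `name` is resolved here, so renaming or
    replacing a directory higher up in the path cannot redirect the access.
    """
    # fstatat(parent_fd, name, AT_SYMLINK_NOFOLLOW)
    st = os.stat(name, dir_fd=parent_fd, follow_symlinks=False)
    # openat(parent_fd, name, O_RDONLY | O_NOFOLLOW)
    fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=parent_fd)
    try:
        return st, os.read(fd, 65536)
    finally:
        os.close(fd)
```

The caller keeps one fd per directory level open while recursing and passes it down together with the bare file name.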
if scandir does not get a path, it can't prefix it to the filename in the direntries it returns, so dirent.path == dirent.name. thus, we only use dirent.name and construct the full path ourselves.
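
A small sketch of that behaviour, assuming Python 3.7+ (where `os.scandir` accepts a file descriptor). `list_dir_via_fd` is an illustrative name, not borg code.

```python
import os


def list_dir_via_fd(dir_path):
    """Return (name, entry.path, reconstructed full path) for each entry."""
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        with os.scandir(dir_fd) as it:
            # With an fd argument, entry.path carries no directory prefix,
            # so the full path has to be built from dir_path + entry.name.
            return sorted(
                (entry.name, entry.path, os.path.join(dir_path, entry.name))
                for entry in it
            )
    finally:
        os.close(dir_fd)
```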
acl_get:
- remove the assumption that having an FD means it is a regular file; we try to use FDs as much as possible.
- only get the default acl for directories; other fs objects are not expected to have a default acl.
- the path needs to be encoded also for the case when we have an fd: it is needed to get the default acl for directories.
- also, micro-opt: encode path later, not needed for ISLNK check.

acl_set:
- remove the "if False" branch. it is the same here: the fd-based api only supports access ACLs, but not default ACLs, so we always need to use the path-based api here.
for fd-based operations, we would have to open the file, but for char / block devices this has unwanted effects, even if we do not read from the device. thus, we use path (or dir_fd + name) based ops here.
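
A sketch of how device nodes can be inspected without ever opening them, using the dir_fd + name variant mentioned above. `describe_without_open` is a hypothetical helper, not part of the PR.

```python
import os
import stat


def describe_without_open(parent_fd, name):
    """Classify `name` via fstatat only; the item itself is never opened."""
    st = os.stat(name, dir_fd=parent_fd, follow_symlinks=False)
    if stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode):
        # Device numbers are available from stat alone.
        return ("device", os.major(st.st_rdev), os.minor(st.st_rdev))
    return ("other", None, None)
```

Only the parent directory is opened; the device node is just looked up by name, so no device driver open() handler runs.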
on linux, acls are based on xattrs, so do these close together:
1. listxattr -> keys (without acl related keys)
2. for all keys: getxattr
3. acl-related getxattr by acl library
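
Steps 1 and 2 could look roughly like this on Linux. This is only a sketch: `non_acl_keys` and `read_xattrs` are hypothetical names, borg uses its own xattr wrappers rather than `os.listxattr` / `os.getxattr`, and step 3 (the acl library call) is not shown.

```python
import os

# On Linux, POSIX ACLs are stored in these two xattrs.
ACL_KEYS = frozenset({"system.posix_acl_access", "system.posix_acl_default"})


def non_acl_keys(keys):
    """Step 1: drop the ACL-related keys; the acl library handles those."""
    return [k for k in keys if k not in ACL_KEYS]


def read_xattrs(fd):
    """Steps 1 and 2 on one fd (Linux-only: os.listxattr / os.getxattr)."""
    keys = non_acl_keys(os.listxattr(fd))
    return {k: os.getxattr(fd, k) for k in keys}
```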
scenario:
- x is a regular file
- borg does stat on x: it is a regular file
- so borg dispatches to process_file
- attack: x gets replaced by a symlink (mv symlink x)
- in process_file, borg opens x and must not follow the symlink nor continue processing it as a normal file, but rather fail in open() due to O_NOFOLLOW.
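
The scenario can be reproduced in a scratch directory with a few lines. This is an illustrative simulation, not the test added in this PR; `simulate_swap` is a hypothetical name.

```python
import os
import stat


def simulate_swap(workdir):
    """Replay the scenario; return the errno from open(), or None if it succeeded."""
    x = os.path.join(workdir, "x")
    target = os.path.join(workdir, "target")
    for p in (x, target):
        with open(p, "wb") as f:
            f.write(b"content")

    # stat says: regular file, so a file handler would be chosen.
    assert stat.S_ISREG(os.lstat(x).st_mode)

    # attack: mv symlink x
    link = os.path.join(workdir, "symlink")
    os.symlink(target, link)
    os.replace(link, x)

    # The handler's open() must fail instead of following the symlink.
    try:
        fd = os.open(x, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as e:
        return e.errno
    os.close(fd)
    return None
```

On Linux the returned errno is ELOOP; other platforms may report a different error code for the same condition.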
@verygreen ah, right. guess one cannot change the file type of an existing inode (except with a sector editor maybe? :-) ). is there any way / chance to kill the old inode and get a new one (with a different file type) with the same inode number? if not, considering this code is called once per file (fs item), I'll remove the file type check again.

(we also need to consider fs with synthetic inode numbers, like network fs, for that)

yes, technically inode numbers could be reused, so if you want to be 100% sure this did not happen, you need to look at inode number and generation, but the generation apparently is not exposed outside of the kernel for whatever reason. That said, inode reuse within a relatively short time should still be very rare, even on synthetic filesystems (especially there, since they typically just use a constantly increasing counter). Here is a real-world example of inode reuse: GoogleCloudPlatform/gcsfuse#57. On ext3/4 the fs tries to avoid reusing recently deleted inodes unless there really are no others left. Probably many other filesystems do this too.
we must avoid a handler processing a fs item of wrong file type, so check if it has changed.
also: added a test for this.
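
A minimal sketch of such a check, assuming the handler was chosen from an earlier stat result. `open_checked` is a hypothetical helper; the PR's actual check may differ in detail.

```python
import os
import stat


def open_checked(path, expected_st):
    """Open `path` and verify that its file type still matches expected_st."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        st = os.fstat(fd)
        # Compare only the file type bits (regular, directory, fifo, ...).
        if stat.S_IFMT(st.st_mode) != stat.S_IFMT(expected_st.st_mode):
            raise RuntimeError("file type of %r changed (race)" % path)
    except BaseException:
        os.close(fd)
        raise
    return fd, st
```

The caller is responsible for closing the returned fd once processing is done.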
guess this is mostly done, fixes issues mentioned in #908 and #1038.