Skip to content

Conversation

@strake
Copy link
Contributor

@strake strake commented May 8, 2018

No description provided.

@strake strake force-pushed the at branch 4 times, most recently from a6894f6 to a570f74 Compare May 8, 2018 23:44
@strake
Copy link
Contributor Author

strake commented May 21, 2018

ping

@hvr
Copy link
Member

hvr commented May 21, 2018

Sorry haven't yet had time to review this. I'm a bit worried about two things here: a) routing everything through openat(2) and thus replace open(2) and b) these are POSIX.1-2008 entry points

Are we sure all non-EOL'ed target platforms provide/support openat(2) and fdopendir(3)?

@strake
Copy link
Contributor Author

strake commented May 21, 2018

The package description specifies POSIX.1-2008.

Is somewhere a list of supported platforms of this package? I checked all supported Unix platforms of GHC and they all have both openat and fdopendir — indeed, both GNU libc on Linux and FreeBSD libc now call openat rather than open (not sure about OS X).

@hvr
Copy link
Member

hvr commented May 21, 2018

I only mentioned POSIX.1-2008 because this means these could be recent enough that some less mainstream platforms we support may not have supported them yet. And if that's the case, we'd need to consider conditional compilation & autoconf tests.

Unfortunately I don't have a comprehensive list of supported platforms but we also support e.g. AIX, Solaris, OpenBSD and Android

@strake
Copy link
Contributor Author

strake commented May 22, 2018

PTAL

@strake
Copy link
Contributor Author

strake commented Jul 3, 2018

ping

1 similar comment
@strake
Copy link
Contributor Author

strake commented Aug 19, 2018

ping

@hvr hvr added this to the 2.8.0.0 milestone Aug 19, 2018
@strake
Copy link
Contributor Author

strake commented Nov 23, 2019

@hvr ping

Copy link
Member

@hasufell hasufell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvr what is blocking this? This is required to fix dangerous bugs in the directory package: haskell/directory#97 and #134

module System.Posix.Directory.Common (
DirStream(..), CDir, CDirent, DirStreamOffset(..),
#ifdef HAVE_FDOPENDIR
openDirStreamFd,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this should be in its own module System.Posix.Directory.Fd, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

data {-# CTYPE "struct dirent" #-} CDirent

#ifdef HAVE_FDOPENDIR
-- | @openDirStreamFd fd@ calls @fdopendir@ to obtain a
Copy link
Member

@hasufell hasufell Jan 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should document that you must not close the file descriptor afterwards with a link to the POSIX documentation and that the file descriptor is closed when the stream is closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@hasufell
Copy link
Member

@Rufflewind @hvr ping

@hasufell
Copy link
Member

@hvr ping

@cartazio
Copy link

@chessai @hvr @Bodigrim This seems like a pretty decent patch, what are the support surface area concerns blocking it at the moment?

like, what OS's does the haskell Unix need to support, and whats missing to make this or something similar a good patch?

@hasufell
Copy link
Member

Guys, why is no one responding here? It's discouraging for contributors to wait 2 years for actionable input.

OpenMode(..),
OpenFileFlags(..), defaultFileFlags,
openFd, createFile,
openFd, openFdAt, createFile, createFileAt,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not openFdAt and createFileAt be guarded by #ifdef HAVE_OPENAT?

-> IO Fd
openFd = openFdAt Nothing

openFdAt :: Maybe Fd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a haddock comment for openFdAt.

createFile :: FilePath -> FileMode -> IO Fd
createFile = createFileAt Nothing

createFileAt :: Maybe Fd -> FilePath -> FileMode -> IO Fd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a haddock comment for createFileAt.

-> IO Fd
openFd = openFdAt Nothing

openFdAt :: Maybe Fd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a haddock comment for openFdAt.

createFile :: RawFilePath -> FileMode -> IO Fd
createFile = createFileAt Nothing

createFileAt :: Maybe Fd -> RawFilePath -> FileMode -> IO Fd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a haddock comment for createFileAt.

OpenMode(..),
OpenFileFlags(..), defaultFileFlags,
openFd, createFile,
openFd, openFdAt, createFile, createFileAt,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not openFdAt and createFileAt be guarded by #ifdef HAVE_OPENAT?

@cartazio
Copy link

thank you for getting this rolling @Bodigrim

@hasufell
Copy link
Member

hasufell commented Apr 14, 2020

Please double check this code on mac!

I have a similar code: https://github.com/hasufell/hpath/blob/06b5a46cf81371204870ad3ebf78af2e59fbbdd7/hpath-posix/src/System/Posix/RawFilePath/Directory/Traversals.hs#L179

And fdopendir usage here causes d_name to have the first char truncated on mac only:

https://travis-ci.org/github/hasufell/hpath/jobs/674795844#L1167-L1168

I'm not sure why. The use of unsafeCoerce seems safe (just converting to newtype).


EDIT: @cartazio debugged this and it appears there are two possible functions for fdopendir https://opensource.apple.com/source/Libc/Libc-1244.1.7/include/dirent.h.auto.html

One of them crashes hard and the above code will trigger it. The fix was to use:

foreign import capi unsafe "dirent.h fdopendir"
    c_fdopendir :: Posix.Fd -> IO (Ptr CDir)

@cartazio
Copy link

emphasizing what @hasufell says,

at least on OSX, a number of newer unix calls really need to be bound via capi (an implicit header aware c wrapper) or an explicit c call with the correct system headers included

reproducing the example also mentioned above that i cooked up and shared with julian:

foreign import capi unsafe "dirent.h fdopendir"
    c_fdopendir :: Posix.Fd -> IO (Ptr CDir)

the header part of the capi syntax is crucial for this to compile and link correctly! (its actually impossible to bind the correct underlying system call symbol directly)

@l29ah
Copy link

l29ah commented Aug 31, 2020

wtf just merge it already

@hasufell
Copy link
Member

wtf just merge it already

The code is broken. See above.

@l29ah
Copy link

l29ah commented Aug 31, 2020

The code works on linux and doesn't break anything. If it's bugged on some other platform, it could be addressed separately as a bugfix. Meanwhile, MacOS doesn't even strive to be POSIX these days it seems: https://www.quora.com/Is-Mac-OS-X-more-POSIX-compliant-than-Linux/answer/Madars-Oz

@cartazio
Copy link

Julian and I figured out how to correctly bind some of this stuff on OS X. But incorrectly binding it will make programs flat out crash and silently corrupt data.

A pr that does the capi indirection would be safe to merge. But this pr is flat out wrong. Someone prepare a pr that does the capi stuff and then we can make it work

@cartazio
Copy link

cartazio commented Sep 1, 2020

Cc @Bodigrim and victor

@Bodigrim
Copy link
Contributor

Bodigrim commented Nov 14, 2020

@l29ah currently the only OS forbidden to use unix package is Windows:

unix/unix.cabal

Lines 67 to 71 in f7b956c

if os(windows)
-- This package currently supports neither Cygwin nor MinGW,
-- therefore os(windows) is effectively not supported.
build-depends: unbuildable<0
buildable: False

So even if MacOS is not POSIX compliant, we cannot just silently break it.

@l29ah
Copy link

l29ah commented Nov 15, 2020

Also the function definition can be put under a conditional if you don't want to create a false hope for Mac users.

@cartazio
Copy link

Bodigrim: it looks like the binding is capi based now. @hasufell what was the min version of mac we found that supported this call?

@Bodigrim
Copy link
Contributor

@l29ah Providing conditional API is an option, but so far unix avoided it, sticking to a role of a common denominator for all non-Windows platforms. One could position unix as a wrapper for POSIX - and woe to systems which are not, but I'd say that such approach would greatly diminish its utility.

@hs-viktor
Copy link
Contributor

I've rebased this PR on master and pushed two fixup commits to address the CApiFFI issue for MacOS and missing #if guards for openat functions.

All CI targets now pass. It just remains to do a final closer review, but we should be able to merge this (perhaps squashed) soon barring unexpected issues.

Copy link
Contributor

@hs-viktor hs-viktor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost done, but I am concerned about the conditionally defined functions. I don't see how a dependent application or library can safely make use of these functions, in the sense that it will at least reliably compile on all platforms. Do they need to also run configure and use HAVE_OPENAT, ...?

There should likely be some way to discover whether these are available at runtime. So some sort of stub implementation needs to exist, and some way to test whether the function is expected to work, or whether the caller should take an alternative code path (or give up).

Perhaps just simple boolean predicates that are exported along with these?

Unless we've reached the point where the only supported Posix platforms all have openat, et. al. and we just fail to build unix otherwise. So that these are then always available.

#ifdef HAVE_FDOPENDIR
-- | @unsafeOpenDirStreamFd fd@ calls @fdopendir@ to obtain a directory stream
-- for @fd@. @fd@ must not be otherwise used after this; see
-- <https://pubs.opengroup.org/onlinepubs/9699919799/functions/fdopendir.html POSIX>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth mentioning that if an exception is thrown here, closing the file descriptor remains the responsibility of the caller, who may need to use something like bracketOnError to handle this possibility.

Alternatively, perhaps this function should arrange to close the descriptor if c_fdopendir fails. This way the descriptor is never leaked. I'm inclined to recommend this approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bodigrim I see you've approved the MR, but perhaps you did not notice the above issue/question amidst the flurry of activity to resolve the easier obstacles. I do think we still need to do something here...

@hs-viktor
Copy link
Contributor

hs-viktor commented Feb 15, 2021

There should likely be some way to discover whether these are available at runtime. So some sort of stub implementation needs to exist, and some way to test whether the function is expected to work, or whether the caller should take an alternative code path (or give up).

Perhaps just simple boolean predicates that are exported along with these?

One simple way to handle these is to export a Maybe T where T is the type of the optionally implemented function, and not export the function itself. A caller can then write:

 case getOpenFdAt of
  Just f -> f ...
  Nothing -> ...

which, with appropriate inlining, will optimise out the test and take just the appropriate branch.

@hs-viktor
Copy link
Contributor

The openat function was added in:

  • Linux 2.6.16 with glibc 2.4 (Mar 2006)
  • FreeBSD 8.0 (Nov 2009)
  • OpenBSD 5.0 (Nov 2011)
  • MacOS 10.10 (Oct 2014)
  • NetBSD 7.0 (Aug 2015)

Has it been established long enough that we can just require it unconditionally?

@hs-viktor
Copy link
Contributor

The timeline for fdopendir seems to be:

  • glibc 2.4
  • FreeBSD 8.0
  • OpenBSD 5.0
  • MacOS 10.10
  • NetBSD 6.0

Pretty much the same, but slightly older with NetBSD. So if we can require openat outright, we could do the same with fdopendir and dispense with the CPP conditions.

@Bodigrim
Copy link
Contributor

Has it been established long enough that we can just require it unconditionally?

Yes, I believe 5+ years are enough. It is better to enable it unconditionally now than have a breaking switch from Maybe T to T in future, or pay for Maybe infinitely long.

@cartazio
Copy link

cartazio commented Feb 15, 2021 via email

@hs-viktor
Copy link
Contributor

The latest commit drops the #if guards. Builds on all the platforms on which run CI. I also added missing haddocks. Please review.

Separately from this MR, I really dislike the way that the file open flags are handled. We only support a fixed subset, hard-code as a list of booleans in a structure. This is IMHO awful. It screams out for pattern synonyms and a single numeric flag field, that users can set to values that are not yet known to the library, if they know what they're doing.

I should open a separate issue...

Copy link
Contributor

@hs-viktor hs-viktor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disposition of file descriptor passed to fdopendir when the call fails remains as yet unaddressed. Should it be automatically closed (I think so), but otherwise callers need to know that they have to catch the exception and do something with the descriptor.

@Bodigrim
Copy link
Contributor

I'm in favor of closing it automatically.

@vdukhovni
Copy link
Contributor

I'm in favor of closing it automatically.

OK, I'll arrange to close it with c_close, but it is perhaps worth noting that if any of the descriptors being passed into the library are being used in some other thread with threadWaitRead or threadWaitWrite, then they have to be closed with closeFdWith, rather than just c_close, so that the IO Manager can do appropriate cleanup.

Nothing in unix uses the IO Manager interfaces, but we perhaps need to make it clear to users that they need to be careful with passing such descriptors into unix calls that might close the descriptor. Not likely to be an issue with a directory handle, so I think it is fairly safe to not worry too much with fdopendir, but something that may warrant a closer look elsewhere...

strake and others added 3 commits February 16, 2021 02:39
This way the caller never ends up owning the fd after the call.
Otherwise, cleanup of either the DirStream or else the file-descriptor
would be difficult to implement correctly.
@hs-viktor
Copy link
Contributor

Final cut:

  • unsafeOpenDirStreamFd input descriptor never left unpainted
  • squashed
  • changelog updated

Looking at the surrounding code makes me wish for a time-machine to address other issues that will require some care, but not in this PR. :-(

@hs-viktor hs-viktor dismissed their stale review February 16, 2021 07:48

Issues addressed.

@hs-viktor hs-viktor requested a review from Bodigrim February 16, 2021 07:49
Copy link
Contributor

@Bodigrim Bodigrim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

way to prevent openDirStream from following symlinks

8 participants