Open
Description
For the purpose of collecting input/output statistics, it is important to know which "device" these stats pertain to, so as not to mix, for example, stats for a local NVMe drive, an NFS-attached drive, an S3 filesystem, or an in-memory buffer reader.
I suggest adding this API to InputStream:
/// \brief An opaque unique id for the device underlying this stream.
///
/// Any implementation is free to fill those bytes as it sees fit,
/// but it should be able to uniquely identify each "device"
/// (for example, a specific local drive, or a specific remote network
/// filesystem).
///
/// A suggested format is "<kind>:<bytes>" where "<kind>"
/// is a short string representing the backend kind
/// (for example "local", "s3"...) and "<bytes>" is a
/// backend-dependent string of bytes (for example a
/// `dev_t` for a POSIX local file).
///
/// This is not required to be printable nor human-readable,
/// and may contain NUL characters.
virtual std::string device_id() const = 0;

Reporter: Antoine Pitrou / @pitrou
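As a rough illustration of the suggested "<kind>:<bytes>" format, a local POSIX backend could build its id from the `dev_t` returned by `stat()`. This is only a sketch: `MakeDeviceId` and `LocalDeviceId` are hypothetical helper names, not existing Arrow functions, and returning an empty string on failure is an assumption made to keep the example short.

```cpp
#include <sys/stat.h>  // POSIX stat() and dev_t

#include <cstddef>
#include <string>

// Hypothetical helper: builds an id in the suggested "<kind>:<bytes>" format.
// The payload is appended by length because it may contain NUL bytes.
std::string MakeDeviceId(const std::string& kind, const void* bytes,
                         std::size_t size) {
  std::string id = kind;
  id.push_back(':');
  id.append(static_cast<const char*>(bytes), size);
  return id;
}

// Hypothetical helper: device id for a local POSIX file, using the raw bytes
// of the dev_t reported by stat(). Returns an empty string if stat() fails
// (an assumption for this sketch; real code would report the error).
std::string LocalDeviceId(const std::string& path) {
  struct stat st;
  if (stat(path.c_str(), &st) != 0) {
    return std::string();
  }
  return MakeDeviceId("local", &st.st_dev, sizeof(st.st_dev));
}
```

Two files on the same local filesystem would then yield the same id, so I/O statistics could be aggregated in a map keyed by this string. Since the id is opaque and possibly non-printable, callers should only compare it for equality or hash it.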
Related issues:
- [C++] Implement a read range process without caching (is related to)
Note: This issue was originally created as ARROW-17917. Please see the migration documentation for further details.