Skip to content

Add a cleaned up child_processes API that cleans the API up for child process spawning, like how fs/promises cleaned up fs #54799

@dead-claudia

Description

@dead-claudia

What is the problem this feature will solve?

child_process is, in my experience, one of the most commonly wrapped APIs by far. It's especially common to wrap it in promise environments.

It all stems from a few major issues:

  1. Child processes have several one-time events, and none of them map cleanly to just one promise or even one promise sequence.
  2. It's non-trivial to create a wrapper that interops with streams more seamlessly.
  3. It's inherently error-prone to use directly, especially when it comes to shell script invocation on Windows.

What is the feature you are proposing to solve the problem?

I'm thinking of the following API, in child_process/promises:

  • result = await promises.exec(command, args?, options?) to spawn a normal command
  • result = await promises.system(command, args?, options?) to spawn a shell command
  • result = await promises.fork(command, args?, options?) to spawn a child with an IPC channel

Options and arguments:

  • command is what to run.
    • exec: the binary to run, may be a file: URL
    • system: the shell script to run
    • fork: the Node script to run, may be a file: URL
  • args is an array of arguments to pass to the script or command and it defaults to the empty array.
    • Everything is fully escaped as needed, matching cross-spawn's behavior.
  • Most of the usual options properties still work, with the same defaults:
    • options.detached
    • options.cwd
    • options.env
    • options.argv0
    • options.uid
    • options.gid
  • options.signal is now an object, where keys are the signal names and values are AbortSignals and async iterables that can trigger them.
  • options.ref determines whether the process starts out ref'd.
  • options.execPath for system and fork and represents the path to use. Unlike in child_process.spawn, this is not a complete command. Defaults:
    • system: "sh" in *nix, process.env.COMSPEC || "cmd.exe" on Windows
    • fork: "node"
  • options.execArgv provides the list of arguments to pass before passing the script. Defaults:
    • system: ["-c"] on *nix, ["/d", "/s", "/c"] on Windows
    • fork: []
  • Set options.pathLookup to true (default) to use the system path to locate the target binary, false to resolve it based on the current working directory. On Unix-like systems, true also enables interpreters to work.
    • For system and fork, this is always set to true and cannot be configured.
    • On Windows, this also enables %PathExt% traversal.
    • This is useful independently of shell access. Plus, Linux does lookupPath: true natively with execve.
  • options.fds is an object where each numerical index corresponds to a descriptor to set in the child. This is not necessarily an array, though one could be passed. Default is {0: 0, 1: 1, 2: 2} to inherit those descriptors. Possible entry values:
    • "close": explicitly close FD, cannot be used for FD 0/1/2
    • "null": connect to the system null device
    • MessagePort instance (fork only): open an IPC port
      • These ports are transferred, so the internal code can just send directly to the linked port.
      • When the child's IPC channel closes, the linked MessagePort on the other side of the channel is also closed in the same way it is for workers where one end closes.
    • Numeric FD, fs/promises file handle, net.Socket, etc: Pass a given file descriptor directly
    • readableStream: Expose a writable pipe and read from it using the given stream
      • Use BufferReader to read from buffers and strings
    • writableStream: Expose a readable pipe and write into it using the given stream
      • Use BufferWriter to write into buffers
  • Set options.fds.inherit to true to inherit all FDs not specified in options.fds. Default is false, in which FDs 0/1/2 are opened to the null device and all others are closed.

The return value is a Promise that settles when the child terminates.

  • It resolves to an empty object if the command exited normally with a code of 0.
  • It throws an error with exitCode and signalCode properties if it exited with any other code.
  • pid = await handle.spawned resolves with the PID on spawn and rejects on spawn error.

Additional classes in stream:

  • writer = new BufferReader(target | max)
    • Extends stream.Writable
    • Pass either a target buffer source to fill or a max byte length
    • writer.bytesWritten: Get the number of bytes written
    • writer.consume(): Reset the write state and return the previously written buffer data. If it's not writing to an external target, it's possible to avoid the buffer copy.
    • writer.toString(encoding?) is sugar for writer.consume().toString(encoding?)
    • It's recommended to implement this as an alternate stream mode to reduce memory usage and vastly improve performance.
  • reader = new BufferWriter(source, encoding?)
    • Extends stream.Readable
    • Pass a source string (with optional encoding) or buffer source to read from
    • reader.bytesRead: Get the number of bytes read
    • Readable.from(buffer | string) should return instances of this instead
    • It's recommended to implement this as an alternate stream mode to reduce memory usage and vastly improve performance.
  • duplex.reader(), duplex.writer(): Return the read or write half of a duplex stream, sharing the same internal state

And in process:

  • port = process.ipc(n=3): Get a (cached) MessagePort for a given IPC descriptor, throwing if it's not a valid descriptor.
    • Having multiple IPC ports can be useful for delimiting control messages from normal messages.
  • result = await process.inspectFD(n) accepts an FD and returns its type and read/write state.
    • File: {kind: "file", readable, writable}
      • On *nix, readable and writable can be determined via fcntl(F_GETFL, fd) on *nix (it's been in the POSIX standard for a couple decades)
      • On Windows, readable and writable can be determined via two calls to ReOpenFile or one call to NtQueryObject with class ObjectBasicInformation. (They say it can change, but it may be possible to get a stability promise out of them since the page hasn't been modified in over 6 years.)
    • Socket: {kind: "socket", readable, writable, type: "stream-client" | "stream-server" | "dgram"}
    • Terminal stream: {kind: "tty", readable, writable, rows, columns}
      • This can be tested and extracted in one ioctl syscall on Linux
    • IPC port: {kind: "ipc", readable, writable}
    • Other: {kind: "unknown"}
      • *nix pipes are reported as being of this type
    • This is useful in conjunction with systemd for verifying that a given FD is actually open before attempting to use it.

Things I'm intentionally leaving out:

  • options.serialization - it's always "advanced". This both brings it to close parity with other MessagePort-related APIs, and it speeds up message sending since it's already of the correct input format.
  • options.timeout - just do signal: {SIGTERM: AbortSignal.timeout(ms)}.
  • options.windowsHide - that behavior is just always on as that's what people would generally just expect.
  • options.windowsVerbatimArguments - just use system and string concatenation.
  • Per-FD "inherit" constants in stdio - you can just use the descriptor numbers themselves for that.
  • "pipe" - use a passthrough stream for that.
  • options.encoding - that's a per-descriptor setting now.
  • A global "close" event - it's better to track that per-stream anyways. Plus, it's one of those incredibly error-prone points.

For a summary in the form of TypeScript definitions:

declare module "child_process/promises" {
    export interface WaitError extends Error {
        exitCode: number | undefined
        signalCode: string | undefined
    }

    type SignalSource =
        | AbortSignal
        | AsyncIterable<void>

    type SignalMap = {[Signal in NodeJS.Signals]?: SignalSource}

    type FdSource =
        | "close"
        | "null"
        | MessagePort
        | number
        | import("node:net").Socket
        | import("node:fs").FileHandle
        | Readable | Writable

    interface FdMap {
        [fd: number]: FdSource
        inherit?: boolean
    }

    interface Options {
        detached?: boolean
        cwd?: string
        env?: Record<string, string>
        argv0?: string
        uid?: number
        gid?: number
        signal: SignalMap
        ref?: boolean
        pathLookup?: boolean
        execPath?: string
        execArgv?: string
        fds: FdMap
    }

    interface ProcessHandle extends Promise<void> {
        readonly spawned: Promise<number>
    }

    export function exec(command: string | URL, options: Options): ProcessHandle
    export function exec(command: string | URL, args?: string[], options?: Options): ProcessHandle
    export function system(command: string, options: Options): ProcessHandle
    export function system(command: string, args?: string[], options?: Options): ProcessHandle
    export function fork(command: string | URL, options: Options): ProcessHandle
    export function fork(command: string | URL, args?: string[], options?: Options): ProcessHandle
}

declare module "stream" {
    declare class BufferReader extends Readable {
        constructor(source: BufferSource)
        constructor(source: string, encoding?: NodeJS.BufferEncoding)
        readonly bytesRead: number
    }

    declare class BufferWriter extends Writable {
        constructor(target: BufferSource)
        constructor(maxBytes: numbers)
        readonly bytesWritten: number
        consume(): Buffer
        toString(encoding?: NodeJS.BufferEncoding): string
    }

    interface Duplex {
        reader(): Readable
        writer(): Writable
    }
}

declare module "process" {
    export type InspectFDResult =
        | {kind: "file", readable: boolean, writable: boolean}
        | {kind: "socket", readable: true, writable: true, type: "stream-client" | "stream-server" | "dgram"}
        | {kind: "tty", readable: boolean, writable: boolean, rows: number, columns: number}
        | {kind: "ipc", readable: false, writable: false}
        | {kind: "unknown", readable: false, writable: false}

    export function ipc(fd?: number): MessagePort
    export function inspectFD(fd?: number): Promise<InspectFDResult>
}

What alternatives have you considered?

I considered:

  • Simple handles returned from a promise that resolves on spawn. It could have methods are .ref(), .unref(), .pid, .wait(), and .raise(signal?). The main problem is this, for the common case, requires await start(...).then(h => h.wait()).
  • Captuing stderr in the error message. You may want to pass it through (very common), and it could be extremely long. There are ways to work around this, but it's simpler to just not capture it.
  • handle.ipc as a single port. I don't see why one can't have multiple IPC ports, and it also simplifies the API and the implementation.
  • Something like Execa. This is just too complicated to justify the effort.
  • The status quo. It's consistently very awkward, hence the feature request.

Metadata

Metadata

Assignees

No one assigned

    Labels

    child_processIssues and PRs related to the child_process subsystem.feature requestIssues that request new features to be added to Node.js.

    Type

    No type

    Projects

    Status

    Awaiting Triage

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions