add Stream.split(collection, count) ⇒ {splitted_values, rest_of_stream} #2922

@sunaku

Hello,

Please add a Stream.split(collection, count) function that returns {splitted_values, rest_of_stream}, so that I can continue consuming a stream from where I left off instead of starting over from the beginning each time I want to consume more of it. For example:

{[1, 2], rest_of_stream} = Stream.split(1..5, 2)
[3, 4, 5] = Enum.to_list(rest_of_stream)

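In particular, this would let me consume a file stream piecemeal, rather than having to read through the entire file in one pass (which is all the Stream module API currently supports). In the session below, taking the header with Enum.take/2 does not advance the stream, so the next enumeration starts again from the first line: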

$ iex
Erlang/OTP 17 [erts-6.1] [source] [64-bit] [smp:2:2] [async-threads:10] [kernel-poll:false]

Interactive Elixir (1.0.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> stream = File.stream!("orders.csv") |> Stream.map(&String.rstrip/1)
#Stream<[enum: %File.Stream{line_or_bytes: :line,
  modes: [:raw, :read_ahead, :binary], path: "orders.csv", raw: true},
 funs: [#Function<45.29647706/1 in Stream.map/2>]]>
iex(2)> stream |> Enum.each(&IO.inspect/1)
"id,ship_to,net_amount"
"123,:NC,100.00"
"124,:OK,35.50"
"125,:TX,24.00"
"126,:TX,44.80"
"127,:NC,25.00"
"128,:MA,10.00"
"129,:CA,102.00"
"120,:NC,50.00"
:ok
iex(3)> header = stream |> Enum.take(1)
["id,ship_to,net_amount"]
iex(4)> body = stream |> Stream.map(&( String.split(&1, ",") ))
#Stream<[enum: %File.Stream{line_or_bytes: :line,
  modes: [:raw, :read_ahead, :binary], path: "orders.csv", raw: true},
 funs: [#Function<45.29647706/1 in Stream.map/2>,
  #Function<45.29647706/1 in Stream.map/2>]]>
iex(5)> body |> Enum.each(&IO.inspect/1)
["id", "ship_to", "net_amount"]   # <== here is the problem! the stream got reset
["123", ":NC", "100.00"]
["124", ":OK", "35.50"]
["125", ":TX", "24.00"]
["126", ":TX", "44.80"]
["127", ":NC", "25.00"]
["128", ":MA", "10.00"]
["129", ":CA", "102.00"]
["120", ":NC", "50.00"]
:ok
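
To make the request concrete, here is a rough sketch of how such a function could be built today on top of the Enumerable protocol's suspend mechanism. The module name StreamSplit and its helpers are hypothetical and not part of Elixir; the sketch assumes the underlying enumerable supports {:suspend, acc} in Enumerable.reduce/3, as ranges, lists, and Stream pipelines do.

```elixir
defmodule StreamSplit do
  # Hypothetical helper module, not part of Elixir's standard library.

  # Returns {first_count_values, rest}, where rest is an enumerable that
  # resumes from the suspended position instead of restarting the source.
  def split(enum, count) when is_integer(count) and count >= 0 do
    # Suspend after every element so values can be pulled one at a time.
    step = fn item, _acc -> {:suspend, item} end
    cont = &Enumerable.reduce(enum, &1, step)
    take(cont, count, [])
  end

  # Enough values taken: wrap the remaining continuation lazily.
  defp take(cont, 0, acc), do: {Enum.reverse(acc), rest(cont)}

  defp take(cont, n, acc) do
    case cont.({:cont, nil}) do
      {:suspended, item, next} -> take(next, n - 1, [item | acc])
      # Source ended before count values were available.
      {_done_or_halted, _} -> {Enum.reverse(acc), []}
    end
  end

  defp rest(cont) do
    Stream.resource(
      fn -> cont end,
      fn cont ->
        case cont.({:cont, nil}) do
          {:suspended, item, next} -> {[item], next}
          {_done_or_halted, _} -> {:halt, nil}
        end
      end,
      fn
        # Source already finished; nothing left to release.
        nil -> :ok
        # Consumer stopped early; halt the source so it can clean up.
        cont -> cont.({:halt, nil})
      end
    )
  end
end

{[1, 2], rest_of_stream} = StreamSplit.split(1..5, 2)
[3, 4, 5] = Enum.to_list(rest_of_stream)
```

With something like this, the header and body of the CSV above could be taken as {[header], body} = StreamSplit.split(stream, 1). One caveat of this sketch: for sources that hold a resource, such as a file stream, the returned rest is single-pass, and the resource is only released once rest is consumed or halted.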

See also issue #2515 for a related discussion about a more general form of this feature.

Thanks for your consideration.
