Description
We are using Shrine to handle large CSV uploads that are then stream-processed. We have a progress meter for this which works off the underlying IO object's #pos values. For local files, this works perfectly. Once we went into our Staging environment with S3 as the storage engine, using Down under the hood, it all broke. It seems that after the first 1K of data, Down::ChunkedIO#pos starts returning values much, much higher than they should be - far beyond the end of the file.
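For context, the progress figure is computed roughly like this (a minimal sketch; the method name and variables here are illustrative rather than our actual code):

# Minimal sketch of the #pos-based progress calculation (illustrative only).
# io is the underlying IO object; total_size is the byte size reported for the file.
def progress_percentage(io, total_size)
  return 0.0 if total_size.zero?

  # #pos is expected to be the number of bytes consumed so far,
  # so the result should never exceed 100%.
  (io.pos.to_f / total_size * 100.0).round(1)
end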
For a particular test file of only 3669 bytes comprising around 55 CSV rows plus header, the size reported by the IO object was consistently correct. However, inside the CSV row iterator, the results of #pos were:
0
1024
1024
1024
1024
1024
1024
1024
1024
1024
1024
1024
1024
1024
3736
6268
8732
11134
13466
15730
17923
20045
22103
24087
26017
27888
29698
31455
33155
34794
36363
37878
39313
40687
41998
43249
44431
45549
46598
47581
48498
49349
50137
50861
51519
52117
52656
53138
53562
53924
54220
54465
54647
54774
54840
54840
The start offset is 0. The 1024 offset was presumed to be a chunk size used by the CSV parser, but when I rewound to zero and read 1024 bytes, I actually got a rather strange 1057 bytes back, perfectly aligned to a row end, instead. In any event, #pos then sits at 1024 for a while, and once the CSV parsing seems to have gone past that first "chunk" - be it 1024 or 1057 bytes - the positions reported become, as you can see, very wrong.
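For reference, the rewind check mentioned above amounted to something like this (a sketch; the exact read call we used may have differed slightly):

# Sketch of the rewind-and-read check (illustrative only).
io_obj.rewind
chunk = io_obj.read(1024)   # ask for the first 1024 bytes
puts chunk.bytesize         # came back as 1057, ending exactly on a row boundary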
The position listing above was generated with no rewinding or other shenanigans (the rewind check was a separate, one-off experiment); in pseudocode we have:
# shrine_file is our Shrine subclass instance representing the S3 object.
# The encoding specifier is typically UTF-8.
#
# Inside the block, io_obj is the Down::ChunkedIO instance.

require "csv"

options = { headers: true, header_converters: :symbol, liberal_parsing: true }

shrine_file.open(encoding: encoding_specifier) do |io_obj|
  csv = CSV.new(io_obj, **options)

  csv.each do |row|
    puts io_obj.pos
  end
end
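One way to narrow this down further, bypassing CSV entirely, might be something along these lines (a sketch only; CHUNK is an arbitrary read size chosen for illustration, and shrine_file / encoding_specifier are the same as above):

# Read the Down::ChunkedIO directly in fixed-size chunks and compare #pos
# against the number of bytes actually consumed. If #pos drifts here as well,
# the CSV parser is not a factor.
CHUNK = 512

shrine_file.open(encoding: encoding_specifier) do |io_obj|
  consumed = 0

  while (data = io_obj.read(CHUNK))
    consumed += data.bytesize
    puts "pos=#{io_obj.pos} consumed=#{consumed}"
  end
end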