For a use case at work, we need to skip an entire section in a `FileScanner` based on information in the section header block. The fastest way to do this is to seek the underlying stream if it supports seeking (we know ours will). `FileScanner` caches the header for the current section, and we cannot modify this cache externally. All we're left with is manually shoveling blocks until the next section header (unless we come up with some scheme that opens a file pointer, allocates a scanner, and somehow closes the scanner whenever we want to skip ahead).
We've looked into how to do this and are willing to open a PR that exposes a `skip_section` method on `FileScanner`. `skip_section` would check whether the underlying stream is seekable and, if so, seek to the correct offset. If not, it would shovel blocks until it reaches a `SectionHeaderBlock`. In that case we need to cache the section header block so that `_read_next_block` (or `__iter__` directly) can return it, because we'll have already read the header and the stream is not seekable.
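To make the idea concrete, here is a rough, self-contained sketch of the seek-or-shovel logic. It does not use the library's real classes or on-disk format: `ToyScanner`, `pack_block`, `pack_section`, `read_skipping`, and the one-byte-type framing are all made up for illustration, and the attribute names (`_remaining`, `_pending_header`) are just assumptions about how the state could be tracked.

```python
import io
import struct

HEADER, DATA = b"H", b"D"  # toy block types, not the real format


def pack_block(kind, payload):
    """Toy framing: 1-byte type, 4-byte big-endian payload length, payload."""
    return kind + struct.pack(">I", len(payload)) + payload


def pack_section(name, packets):
    """Header block that records the byte length of the data blocks after it."""
    body = b"".join(pack_block(DATA, p) for p in packets)
    return pack_block(HEADER, struct.pack(">Q", len(body)) + name) + body


class ToyScanner:
    """Hypothetical stand-in for FileScanner; only the skip logic matters."""

    def __init__(self, stream):
        self.stream = stream
        self.current_section = None   # name from the most recent header read
        self._remaining = 0           # unread bytes left in the current section
        self._pending_header = None   # header consumed while shoveling

    def __iter__(self):
        while True:
            block = self._read_next_block()
            if block is None:
                return
            yield block

    def _read_next_block(self):
        # Hand back a header that skip_section() already pulled off the stream.
        if self._pending_header is not None:
            block, self._pending_header = self._pending_header, None
            return block
        return self._read_from_stream()

    def _read_from_stream(self):
        head = self.stream.read(5)
        if len(head) < 5:
            return None  # end of stream
        kind = head[:1]
        (length,) = struct.unpack(">I", head[1:])
        payload = self.stream.read(length)
        if kind == HEADER:
            (self._remaining,) = struct.unpack(">Q", payload[:8])
            self.current_section = payload[8:]
        else:
            self._remaining -= 5 + length
        return kind, payload

    def skip_section(self):
        """Skip whatever is left of the current section."""
        self._pending_header = None  # a second skip drops the cached header too
        if self.stream.seekable():
            # Fast path: jump straight past the rest of the section.
            self.stream.seek(self._remaining, io.SEEK_CUR)
            self._remaining = 0
            return
        # Slow path: shovel blocks until the next header (or end of stream)
        # and cache that header so iteration can still return it.
        while True:
            block = self._read_from_stream()
            if block is None or block[0] == HEADER:
                self._pending_header = block
                return


def read_skipping(stream, unwanted):
    """Collect data payloads, skipping sections whose name equals `unwanted`."""
    scanner = ToyScanner(stream)
    kept = []
    for kind, payload in scanner:
        if kind == HEADER and payload[8:] == unwanted:
            scanner.skip_section()
        elif kind == DATA:
            kept.append(payload)
    return kept
```

One wrinkle in this sketch: on the non-seekable path, `current_section` already refers to the next section by the time `skip_section` returns, even though the iterator has not yet yielded that header. A real implementation would need to decide whether that is acceptable.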
Does this seem reasonable? Is this a feature that you'd be willing to support?