Description
Platform: Ubuntu 23.04 (Lunar Lobster)
PyArrow version: pyarrow 14.0.1, pyarrow-hotfix 0.5
Python version: Python 3.11.4 (main, Jun 9 2023, 07:59:55) [GCC 12.3.0] on linux
I have a very large single-column CSV file (about 63 million rows). I was hoping to create a lazy file streamer that reads one entry from the CSV file at a time. I know each entry in my file is 12 characters long, so I tried setting the block size to 13 (+1 for \n) with the pyarrow.csv.open_csv function.
import pyarrow as pa
import pyarrow.csv as csv

# Each row is 12 chars plus "\n", so block_size=13 should give one row per block.
# "file" is the path to the CSV described above.
c_options = csv.ConvertOptions(column_types={"dne": pa.float32()})
r_options = csv.ReadOptions(skip_rows_after_names=8200, use_threads=True,
                            column_names=["dne"], block_size=13)
stream = csv.open_csv(file, convert_options=c_options, read_options=r_options)
This code functions as expected, but when I change the skip_rows_after_names parameter of the read options to 8300, I start getting segmentation faults inside the open_csv function. How do I fix this (or am I using it wrong)? I want to be able to use only a portion of the file (e.g., from row 98885 to row 111200), as sketched below.
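In case it helps clarify the goal, here is a minimal workaround sketch (not a fix for the crash itself): keep the default block_size and slice the desired row range out of the streamed batches. The "file" variable is a placeholder for the CSV path, and the row bounds are the ones from the question.

import pyarrow as pa
import pyarrow.csv as csv

start, stop = 98885, 111200  # desired row range (from the question)

c_options = csv.ConvertOptions(column_types={"dne": pa.float32()})
r_options = csv.ReadOptions(column_names=["dne"])  # default block_size

rows_seen = 0
chunks = []
reader = csv.open_csv(file, convert_options=c_options, read_options=r_options)
for batch in reader:  # streams one RecordBatch at a time
    batch_start = rows_seen
    rows_seen += batch.num_rows
    if rows_seen <= start:  # batch ends before the range begins
        continue
    lo = max(start - batch_start, 0)
    hi = min(stop - batch_start, batch.num_rows)
    chunks.append(batch.slice(lo, hi - lo))
    if rows_seen >= stop:  # range fully covered
        break

subset = pa.Table.from_batches(chunks)  # rows start..stop as one table

This avoids the tiny block_size entirely, at the cost of reading (and discarding) the rows before the range instead of skipping them.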
I was able to reproduce this error on another computer with the exact same platform and versions. The file was created with:
import random

# "i" and FILE_LEN (the total row count, about 63 million) come from the surrounding script
with open(f"feature_{i}.csv", "w+") as f:
    for i in range(FILE_LEN):
        n = random.uniform(-0.5, 0.5)
        nn = str(n)[:12]
        f.write(f"{nn}\n")
Component(s)
Python