Closed
Description
I saved a file using pandas to_parquet method, but can't read it back in. Here's the full stack trace:
Traceback (most recent call last):
  File "src/data/CLXP_pull.py", line 214, in <module>
    main()
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "src/data/CLXP_pull.py", line 188, in main
    results[fullname] = pd.read_parquet(os.path.join(project_dir, "data", "raw", fullname+".parquet"), engine="pyarrow")
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/pandas/io/parquet.py", line 257, in read_parquet
    return impl.read(path, columns=columns, **kwargs)
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/pandas/io/parquet.py", line 130, in read
    **kwargs).to_pandas()
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/pyarrow/parquet.py", line 939, in read_table
    pf = ParquetFile(source, metadata=metadata)
  File "/Users/mm51929/projects/2018/03-advisor-recruiting/pyenv/lib/python3.6/site-packages/pyarrow/parquet.py", line 64, in __init__
    self.reader.open(source, metadata=metadata)
  File "_parquet.pyx", line 651, in pyarrow._parquet.ParquetReader.open
  File "error.pxi", line 79, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Arrow error: IOError: [Errno 22] Invalid argument

Any ideas what could cause this? The file itself is 3.6 GB.
I'm running pandas==0.22.0.
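
One way to narrow this down is to check whether the file on disk is still framed like a Parquet file, which helps separate a truncated or corrupt write from a problem in the reader. The sketch below is not a confirmed diagnosis of this error. It relies only on the Parquet layout (a 4-byte "PAR1" magic at the start, and a 4-byte little-endian footer length followed by "PAR1" at the end) and reads just a few bytes, so the file size should not matter. The function name check_parquet_framing and the returned keys are hypothetical, chosen for illustration.

```python
import os
import struct
import sys

PARQUET_MAGIC = b"PAR1"  # magic bytes at both ends of a Parquet file


def check_parquet_framing(path):
    """Hypothetical helper: inspect only the first 4 and last 8 bytes of a file."""
    size = os.path.getsize(path)
    result = {"size": size, "head_ok": False, "tail_ok": False,
              "footer_len": None, "footer_fits": False}
    if size < 12:  # too small to hold both magics plus the footer length
        return result
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)  # last 8 bytes: footer length + magic
        tail = f.read(8)
    footer_len = struct.unpack("<I", tail[:4])[0]
    result["head_ok"] = head == PARQUET_MAGIC
    result["tail_ok"] = tail[4:] == PARQUET_MAGIC
    result["footer_len"] = footer_len
    # The footer, both magics, and the length field must fit inside the file.
    result["footer_fits"] = footer_len + 12 <= size
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    print(check_parquet_framing(sys.argv[1]))
```

If all three checks pass, the file was probably written completely, and the failure is more likely in how the reader opens or reads it. If any check fails, the write itself is the first thing to look at.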
Reporter: Andy Reagan
Assignee: Wes McKinney / @wesm
Note: This issue was originally created as ARROW-2654. Please see the migration documentation for further details.