
Unable to read a single parquet file from a Delta table with LastUpdated=2023-10-08T20%3A54%3A33.250Z in the path #7877

@rspears74

Describe the bug

I am trying to read Parquet files from a Delta table. The files are Snappy-compressed, and the table has three partition columns, so its folder structure looks something like this:

table
-- Col1=ABC
---- Col2=123
------ Col3=abc
-------- part-0000-xxxx-asdf.c00.snappy.parquet
-- Col1=XYZ
---- Col2=789
------ Col3=xyz
-------- part-0000-xyzj-qplg.c00.snappy.parquet
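For readers unfamiliar with this layout, each directory level encodes one partition column as `name=value`. A minimal sketch, using only the standard library, of how those values could be recovered from a file path; the helper name `partition_values` is hypothetical and is not part of DataFusion or Delta:

```rust
use std::path::Path;

/// Collect `key=value` directory components from a Hive-style partitioned
/// file path. Hypothetical helper for illustration only.
fn partition_values(file_path: &str) -> Vec<(String, String)> {
    Path::new(file_path)
        .parent() // drop the file name, keep the directories
        .into_iter()
        .flat_map(|dir| dir.components())
        .filter_map(|component| component.as_os_str().to_str())
        .filter_map(|segment| segment.split_once('='))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}

fn main() {
    let file = "table/Col1=ABC/Col2=123/Col3=abc/part-0000-xxxx-asdf.c00.snappy.parquet";
    for (column, value) in partition_values(file) {
        println!("{column} = {value}");
    }
}
```

Directory components without an `=` (such as `table`) are skipped, so only the partition columns are returned.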

I was originally trying to read a list of files (I need to be able to read an arbitrary list of files), but debugging the issue led me to try reading a single Parquet file. My code for this is as follows:

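use datafusion::error::DataFusionError;
use datafusion::prelude::{ParquetReadOptions, SessionContext};

// `path` is the full path to one file, e.g.
// "/Users/me/Downloads/table/Col1=ABC/Col2=123/Col3=abc/part-0000-xxxx-asdf.c00.snappy.parquet"
async fn read_parquet(path: String) -> Result<(), DataFusionError> {
    let session = SessionContext::new();
    let _df = session.read_parquet(path, ParquetReadOptions::default()).await?;
    Ok(())
}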

When I run this, I get:

thread 'main' panicked at src/main.rs:7:35:
called `Result::unwrap()` on an `Err` value: ObjectStore(NotFound { path: "/Users/me/Downloads/table/Col1=ABC/Col2=123/Col3=abc/part-0000-xxxx-asdf.c00.snappy.parquet", source: Os { code: 2, kind: NotFound, message: "No such file or directory" } })

However, if I run the same code but pass only the directory containing the file ("/Users/me/Downloads/table/Col1=ABC/Col2=123/Col3=abc") instead of the full path to the Parquet file, I get no such error and the Parquet file is read successfully.

I'm not sure if I'm doing something wrong, or if this is some kind of bug.
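The title mentions a partition directory whose name contains percent-encoded characters (`%3A`). One thing that may be worth ruling out is whether the string handed to the reader still matches the name on disk if some layer percent-decodes it. The sketch below uses only the standard library; `percent_decode` is a hypothetical helper, and whether DataFusion actually decodes the path in this case is an assumption, not something confirmed here.

```rust
use std::path::Path;

/// Convert one ASCII hex digit to its numeric value.
fn hex_val(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|d| d as u8)
}

/// Decode `%XX` escapes in a path segment; malformed escapes are kept as-is.
/// Hypothetical helper for illustration only.
fn percent_decode(segment: &str) -> String {
    let bytes = segment.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

fn main() {
    // Directory name taken from the issue title.
    let raw = "LastUpdated=2023-10-08T20%3A54%3A33.250Z";
    let decoded = percent_decode(raw);
    println!("raw:     {raw}");
    println!("decoded: {decoded}");
    // If only one of these exists on disk, the two spellings are not interchangeable.
    println!("raw exists:     {}", Path::new(raw).exists());
    println!("decoded exists: {}", Path::new(&decoded).exists());
}
```

Running the same check against the real table directory would show which spelling of the name is actually present on disk.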

To Reproduce

Try to read a single parquet file from a local, partitioned Delta table, using SessionContext::read_parquet.

Expected behavior

I expect the file to be read into DataFusion as a DataFrame.

Additional context

No response

Labels: bug