Inconsistency in from_parquet and read_parquet wrt. file/glob list #26

@dsevilla

Description

What happens?

As can be seen in https://github.com/duckdb/duckdb-python/blob/main/src/duckdb_py/duckdb_python.cpp#L707, both from_parquet and read_parquet accept as their first parameter either a single string (a file path or glob) or a list of file paths/globs.

In the corresponding Python type stubs, https://github.com/duckdb/duckdb-python/blob/main/duckdb/__init__.pyi#L348, both functions declare that parameter as just 'str'.

I know these files are auto-generated, but they also include instructions on how to mark manual tweaks made after generation. I'll submit the changes as a patch so that maintainers can review them.

To Reproduce

duckdb.read_parquet([parquet1, parquet2], ...)
duckdb.from_parquet([parquet1, parquet2], ...)

Both calls work at runtime, but type checkers and IDEs do not recognize the list argument as valid.
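
As a rough illustration of the kind of annotation that would cover both call styles, here is a minimal sketch. The alias `ParquetSource` and the helper `normalize_sources` are hypothetical names used only for this example and are not part of duckdb. An actual fix would instead widen the first parameter of read_parquet and from_parquet in the stub file.

```python
from typing import List, Union

# Hypothetical alias (not a duckdb name): one path/glob, or a list of them.
ParquetSource = Union[str, List[str]]


def normalize_sources(file_glob: ParquetSource) -> List[str]:
    """Return the argument as a list of path/glob strings.

    Illustrative only: it shows that a single annotation can describe
    both the single-string and the list call styles.
    """
    if isinstance(file_glob, str):
        return [file_glob]
    return list(file_glob)


# Both call styles type-check against the same signature.
single = normalize_sources("data/*.parquet")
several = normalize_sources(["a.parquet", "b.parquet"])
```

With a parameter annotated this way, a type checker would accept both a string and a list of strings as the first argument.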

OS:

OSX, Linux

DuckDB Package Version:

latest

Python Version:

all

Full Name:

Diego Sevilla Ruiz

Affiliation:

University of Murcia

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
