This repository was archived by the owner on Jul 25, 2022. It is now read-only.

Support custom TableProvider #45

@jychen7

Background

I would like to use datafusion-python to query Bigtable. In Rust, datafusion-bigtable already implements BigtableDataSource as a custom TableProvider.

Problem

I tried adding register_table in #46 and exposing a Python BigtableTable from datafusion-bigtable in datafusion-contrib/datafusion-bigtable#3.

The problem is how to convert a Python BigtableTable into a Python Table. Alternatively, how can a Rust TableProvider be serialized to, and deserialized from, some Python object?

classDiagram
    BigtableTable_Python <|-- PyBigtableTable_Rust
    Table_Python <|-- PyTable_Rust
    TableProvider_Rust <|-- BigtableDatasource_Rust
    TableProvider_Rust <|-- ListingTable_Rust
    ListingTable_Rust <|-- CSV
    ListingTable_Rust <|-- Parquet
    ListingTable_Rust <|-- JSON
    ListingTable_Rust <|-- Avro
    class BigtableTable_Python{
    }
    class PyBigtableTable_Rust{
        table: TableProvider_Rust
    }
    class Table_Python{
    }
    class PyTable_Rust{
        table: TableProvider_Rust
    }

The following is a non-working example: bigtable.table() returns a Rust TableProvider, which has no corresponding Python object.

import pytest
from datafusion import ExecutionContext
from datafusion._internal import Table as DatafusionTable
from datafusion_bigtable import BigtableTable

@pytest.fixture
def df_table():
    bigtable = BigtableTable(
        project="emulator",
        xxx
    )
    return DatafusionTable(bigtable.table())
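
To make the mismatch concrete, here is a minimal pure-Python model of the class diagram above. It is only a sketch: every class in it is a hypothetical stand-in written for illustration, not the real datafusion or datafusion_bigtable API, and the `scan` method and its return value are invented. It shows that the two wrappers hold the same kind of provider yet are unrelated types from Python's point of view.

```python
from abc import ABC, abstractmethod


class TableProvider(ABC):
    """Hypothetical stand-in for the Rust TableProvider trait."""

    @abstractmethod
    def scan(self) -> list:
        ...


class BigtableDatasource(TableProvider):
    """Hypothetical stand-in for the Rust BigtableDataSource."""

    def scan(self) -> list:
        # Invented sample row, for illustration only.
        return [{"row_key": "r1", "value": 1}]


class Table:
    """Models PyTable: the wrapper owned by datafusion-python."""

    def __init__(self, provider: TableProvider):
        self._provider = provider


class BigtableTable:
    """Models PyBigtableTable: the wrapper owned by datafusion-bigtable."""

    def __init__(self):
        self._provider = BigtableDatasource()

    def table(self) -> TableProvider:
        # In this model the provider is an ordinary Python object.
        # In the issue, the Rust value has no Python counterpart.
        return self._provider


class ExecutionContext:
    """Models a context whose register_table only accepts Table."""

    def __init__(self):
        self._tables = {}

    def register_table(self, name: str, table: Table) -> None:
        if not isinstance(table, Table):
            raise TypeError("expected Table, got " + type(table).__name__)
        self._tables[name] = table

    def scan(self, name: str) -> list:
        return self._tables[name]._provider.scan()


ctx = ExecutionContext()
bigtable = BigtableTable()

# The two wrappers are unrelated types, so this is rejected.
try:
    ctx.register_table("bt", bigtable)
except TypeError as exc:
    print("rejected:", exc)

# The conversion the issue asks for: re-wrap the shared provider.
ctx.register_table("bt", Table(bigtable.table()))
print(ctx.scan("bt"))
```

In this model the re-wrapping step succeeds only because the provider is itself a Python object. The open question in this issue is what plays that role when the provider is a Rust value living in a separately built extension module.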
