58 changes: 58 additions & 0 deletions xarray_sql/repl.py
@@ -0,0 +1,58 @@
"""Simple SQL REPL for xarray-sql.

Run with: python -m xarray_sql.repl

Starts with a demo "air" table (xarray tutorial dataset). Type SQL and see
results. Commands: .quit or .exit to exit.
"""

import sys

# Enable up/down arrow history for input() (Unix/Mac built-in; Windows: pip install pyreadline3)
try:
    import readline  # noqa: F401
Owner

This package was deprecated in 2022. Check out https://pypi.org/project/gnureadline/.

Owner

I like the idea of using a library to handle this instead of rolling it ourselves.

Owner

Let's add this dependency as an extra in pyproject.toml.
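
To illustrate how the import could stay optional while no longer being tied to one package, here is a minimal sketch that tries a list of candidate modules in order. The helper name `enable_line_editing` and the candidate order are hypothetical, not part of this PR. The sketch also assumes that `gnureadline` hooks into `input()` on import the way the stdlib `readline` module does, which is not verified here.

```python
import importlib


def enable_line_editing(candidates=("readline", "gnureadline")):
    """Import the first available readline-compatible module.

    Returns the name of the module that was imported, or None if none of
    the candidates is installed. The stdlib readline module hooks into
    input() on import; the other candidates are assumed to do the same.
    """
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not installed on this platform; try the next candidate.
            continue
        return name
    return None
```

The REPL could call this once at startup and keep working without history when it returns `None`, which matches the current `try`/`except ImportError` behavior.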

except ImportError:
    pass

import xarray as xr

from .sql import XarrayContext

MAX_DISPLAY_ROWS = 100


def main():
    ctx = XarrayContext()
    # Demo table: streaming path (no read_all); requires _native to be built
    print("Loading demo table 'air' (xarray tutorial air_temperature)...")
    air = xr.tutorial.open_dataset("air_temperature").chunk({"time": 240})
    ctx.from_dataset("air", air)
    print("Ready. Type SQL or .quit / .exit to exit.\n")

    while True:
        try:
            line = input("xarray-sql> ").strip()
        except EOFError:
            print()
            break

        if not line:
            continue
        if line in (".quit", ".exit"):
            break

        try:
            result = ctx.sql(line).to_pandas()
Owner

I recommend using native datafusion dataframes over pandas. I think all we'd need to do here is to call collect() instead of to_pandas().

Owner

I believe that datafusion naturally truncates long lines.

            display = result.head(MAX_DISPLAY_ROWS)
            print(display.to_string())
            if len(result) > MAX_DISPLAY_ROWS:
                print(f"... ({len(result) - MAX_DISPLAY_ROWS} more rows)")
        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)
Comment on lines +50 to +51
Owner

I'm torn on whether we should swallow and log errors or exit on error. For now, this seems fine.
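
If the choice between the two policies needs to stay open, one option is to put it behind a flag. The sketch below is only an illustration: `run_repl`, its `execute` callable, and the `exit_on_error` parameter are hypothetical names, and the query engine is replaced by whatever callable the caller passes in.

```python
import sys


def run_repl(lines, execute, exit_on_error=False, err=sys.stderr):
    """Feed each non-empty line to `execute` under one error policy.

    Returns the number of statements that ran successfully. With
    exit_on_error=False, failures are reported to `err` and the loop
    continues; with exit_on_error=True, the first failure is re-raised.
    """
    succeeded = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line in (".quit", ".exit"):
            break
        try:
            execute(line)
        except Exception as e:
            if exit_on_error:
                raise
            # Swallow-and-log policy: report the error and keep going.
            print(f"Error: {e}", file=err)
            continue
        succeeded += 1
    return succeeded
```

The default keeps the swallow-and-log behavior from this PR; a stricter mode for scripted use would pass `exit_on_error=True`.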

        print()

    sys.exit(0)


if __name__ == "__main__":
    main()