feat: initial SQL REPL for xarray-sql #118
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| """Simple SQL REPL for xarray-sql. | ||
| | ||
| Run with: python -m xarray_sql.repl | ||
| | ||
| Starts with a demo "air" table (xarray tutorial dataset). Type SQL and see | ||
| results. Commands: .quit or .exit to exit. | ||
| """ | ||
| | ||
| import sys | ||
| | ||
| # Enable up/down arrow history for input() (Unix/Mac built-in; Windows: pip install pyreadline3) | ||
| try: | ||
|     import readline  # noqa: F401 | ||
| except ImportError: | ||
|     pass | ||
| | ||
| import xarray as xr | ||
| | ||
| from .sql import XarrayContext | ||
| | ||
| MAX_DISPLAY_ROWS = 100 | ||
| | ||
| | ||
| def main(): | ||
|     ctx = XarrayContext() | ||
|     # Demo table: streaming path (no read_all); requires _native to be built | ||
|     print("Loading demo table 'air' (xarray tutorial air_temperature)...") | ||
|     air = xr.tutorial.open_dataset("air_temperature").chunk({"time": 240}) | ||
|     ctx.from_dataset("air", air) | ||
|     print("Ready. Type SQL or .quit / .exit to exit.\n") | ||
| | ||
|     while True: | ||
|         try: | ||
|             line = input("xarray-sql> ").strip() | ||
|         except EOFError: | ||
|             print() | ||
|             break | ||
| | ||
|         if not line: | ||
|             continue | ||
|         if line in (".quit", ".exit"): | ||
|             break | ||
| | ||
|         try: | ||
|             result = ctx.sql(line).to_pandas() | |
|
Owner
I recommend using native datafusion dataframes over pandas. I think all we'd need to do here is to call
Owner
I believe that datafusion naturally truncates long lines. |
||
|             display = result.head(MAX_DISPLAY_ROWS) | ||
|             print(display.to_string()) | ||
|             if len(result) > MAX_DISPLAY_ROWS: | ||
|                 print(f"... ({len(result) - MAX_DISPLAY_ROWS} more rows)") | ||
|         except Exception as e: | ||
|             print(f"Error: {e}", file=sys.stderr) | |
|
Comment on lines +50 to +51
Owner
I'm torn on whether we should swallow and log errors or exit upon error. For now, this seems fine. |
||
|         print() | ||
| | ||
|     sys.exit(0) | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     main() | ||
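The review comment on lines +50 to +51 weighs swallowing and logging errors against exiting on the first error. A minimal sketch of making that choice a switch is below. The names `run_repl`, `execute`, and `fail_fast` are hypothetical and not part of this PR, and `sqlite3` stands in for `XarrayContext` so the sketch runs without the native extension.

```python
import sqlite3
import sys


def run_repl(lines, execute, fail_fast=False, err=sys.stderr):
    """Feed each line to `execute` and return a process-style exit code.

    fail_fast=False: report the error and keep going (the PR's behaviour).
    fail_fast=True: stop at the first error and return 1.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line in (".quit", ".exit"):
            break
        try:
            for row in execute(line):
                print(row)
        except Exception as e:  # broad on purpose: any query failure
            print(f"Error: {e}", file=err)
            if fail_fast:
                return 1
    return 0


# Stand-in engine (assumption): sqlite3 instead of XarrayContext.
conn = sqlite3.connect(":memory:")


def execute(sql):
    return conn.execute(sql).fetchall()


queries = ["SELECT 1", "SELECT * FROM missing_table", "SELECT 2", ".quit"]
lenient = run_repl(queries, execute)                  # continues past the error
strict = run_repl(queries, execute, fail_fast=True)   # stops at the error
```

Taking the input lines as an iterable rather than calling `input()` directly also makes the loop easy to drive from a test or a script file.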
This package was deprecated in 2022. Check out https://pypi.org/project/gnureadline/.
I like the idea of using a library to handle this instead of rolling it ourselves.
Let's add this dep as an extra in the pyproject.toml.
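For the readline comments above, one possible shape for an optional line-editing dependency is a fallback import chain. This is only a sketch: `load_line_editor` is a hypothetical helper, and it assumes that importing `gnureadline` enables history for `input()` the same way the stdlib `readline` module does.

```python
import importlib


def load_line_editor(candidates=("readline", "gnureadline")):
    """Return the first importable module from `candidates`, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


editor = load_line_editor()
if editor is None:
    # This is the path users would hit without the optional extra installed.
    print("No readline module found; input() history is disabled.")
```

If the dependency does become an extra in `pyproject.toml`, the REPL still starts without it and only loses arrow-key history.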