refactor(snowflake): use snowflake-connector-python directly, drop ibis dependency#2268
Conversation
…is dependency The snowflake connector now uses snowflake-connector-python with the pyarrow result format, eliminating the ibis layer. Highlights - `connector/snowflake.py`: cursor.execute + cursor.fetch_arrow_all returns a native PyArrow Table; `dry_run` uses cursor.describe. - `model/data_source.py::get_snowflake_connection`: returns `snowflake.connector.SnowflakeConnection`; supports key-pair auth. - `pyproject.toml`: snowflake extra -> `snowflake-connector-python[pandas]>=3.10`. Tests - `tests/connectors/test_snowflake.py` exercises both password and key-pair auth code paths via mocked connections. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughReplaces ibis Snowflake integration with a native snowflake-connector-python implementation: adds SnowflakeConnector, a make_snowflake_connection factory, updates registry and data-source wiring, adds a snowflake test recipe and marker, and adds comprehensive mocked unit tests. ChangesNative Snowflake Connector
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/wren/src/wren/model/data_source.py (1)
403-426:⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoffCritical missing functionality:
get_snowflake_connectionlacksstatement_timeoutparameter handling.The method at lines 403-426 duplicates connection parameter building logic from
_build_connection_paramsinconnector/snowflake.py, but critically omits the statement_timeout extraction (lines 32-38 in connector/snowflake.py). This parameter is converted toSTATEMENT_TIMEOUT_IN_SECONDSin the Snowflake session_parameters, and its absence means statement_timeout will be silently ignored when connections are created through this code path.Additionally, the private_key validation differs: this method checks truthiness (
and info.private_key), while the connector version checks explicitly (and connection_info.private_key is not None), potentially handling empty strings differently.Consolidate both implementations to use the same logic, or ensure
get_snowflake_connectionincludes the statement_timeout session parameter handling.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/model/data_source.py` around lines 403 - 426, The get_snowflake_connection function builds connection params but omits translating statement_timeout into the Snowflake session parameter and uses a looser private_key truthiness check; update get_snowflake_connection to mirror _build_connection_params in connector/snowflake.py: check private_key with "is not None" (not truthiness) when deciding between private_key vs password, and if info.statement_timeout (or equivalent field) is present, add session_parameters with STATEMENT_TIMEOUT_IN_SECONDS set to int(info.statement_timeout) (or convert seconds appropriately) before calling snowflake.connector.connect; also preserve merging of info.kwargs and include warehouse handling exactly as in the connector implementation so both code paths behave identically.
🧹 Nitpick comments (2)
core/wren/src/wren/connector/snowflake.py (1)
50-67: 💤 Low valueConsider wrapping additional Snowflake exception types.
The
querymethod only catchesProgrammingError(SQL syntax/semantic errors). Other exceptions such asDatabaseError,OperationalError, orInterfaceError(network issues, connection failures, etc.) will propagate unwrapped.While this may be intentional to let infrastructure errors bubble up with full detail, consider whether wrapping these with appropriate
ErrorCodevalues would improve error handling consistency.Example: Broader exception handling
def query(self, sql: str, limit: int | None = None) -> pa.Table: try: with self.connection.cursor() as cursor: cursor.execute(sql) arrow_table = cursor.fetch_arrow_all() except snowflake.connector.errors.ProgrammingError as e: raise WrenError( ErrorCode.INVALID_SQL, str(e), phase=ErrorPhase.SQL_EXECUTION, metadata={DIALECT_SQL: sql}, ) from e + except snowflake.connector.errors.DatabaseError as e: + raise WrenError( + ErrorCode.SQL_EXECUTION, + str(e), + phase=ErrorPhase.SQL_EXECUTION, + metadata={DIALECT_SQL: sql}, + ) from e🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/connector/snowflake.py` around lines 50 - 67, The query method only wraps snowflake.connector.errors.ProgrammingError; extend exception handling in query to also catch other Snowflake exceptions (e.g., snowflake.connector.errors.DatabaseError, OperationalError, InterfaceError) and wrap them in WrenError with an appropriate ErrorCode (e.g., ErrorCode.DATABASE or ErrorCode.CONNECTION) and ErrorPhase (e.g., ErrorPhase.CONNECTION or ErrorPhase.SQL_EXECUTION) while preserving the original exception as the cause; keep the existing ProgrammingError handling as-is, and ensure each WrenError includes metadata like {DIALECT_SQL: sql} so callers get consistent, informative errors from query and don't leak uncaught connector exceptions.core/wren/pyproject.toml (1)
49-49: Consider specifying an upper bound for snowflake-connector-python to prevent unexpected major version upgrades.The dependency
snowflake-connector-python[pandas]>=3.10is valid—version 3.10 and all versions through the current stable release (4.5.0) are stable with no known compatibility issues. However, the unbounded constraint>=3.10allows for future major version jumps that could introduce breaking changes. Consider using a constraint like>=3.10,<5.0for more predictable dependency resolution.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/pyproject.toml` at line 49, Update the dependency constraint for snowflake-connector-python to add an upper bound to avoid unintentional breaking major upgrades: replace the unbounded requirement "snowflake-connector-python[pandas]>=3.10" with a bounded range such as "snowflake-connector-python[pandas]>=3.10,<5.0" (or another appropriate major upper bound) so dependency resolution remains predictable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/wren/src/wren/connector/snowflake.py`:
- Around line 24-25: The password check in the Snowflake connector can raise
AttributeError if SnowflakeConnectionInfo lacks a password attribute; update the
branch in the connector (the elif that currently tests connection_info.password
is not None and sets params["password"] in snowflake.py) to first verify
hasattr(connection_info, "password") and that it's not None before calling
get_secret_value(), mirroring how private_key and warehouse are guarded.
- Around line 32-38: The code currently does int(statement_timeout) in the
Snowflake connector without validating the input; update the block handling
statement_timeout in the function that builds params so that you validate or
safely convert the value before setting session_parameters: check that
statement_timeout is numeric (or attempt int() inside a try/except catching
ValueError/TypeError) and raise or log a clear error if conversion fails,
otherwise set session_parameters["STATEMENT_TIMEOUT_IN_SECONDS"] to the
validated integer; reference the existing variables statement_timeout, params,
and session_parameters when applying the fix.
---
Outside diff comments:
In `@core/wren/src/wren/model/data_source.py`:
- Around line 403-426: The get_snowflake_connection function builds connection
params but omits translating statement_timeout into the Snowflake session
parameter and uses a looser private_key truthiness check; update
get_snowflake_connection to mirror _build_connection_params in
connector/snowflake.py: check private_key with "is not None" (not truthiness)
when deciding between private_key vs password, and if info.statement_timeout (or
equivalent field) is present, add session_parameters with
STATEMENT_TIMEOUT_IN_SECONDS set to int(info.statement_timeout) (or convert
seconds appropriately) before calling snowflake.connector.connect; also preserve
merging of info.kwargs and include warehouse handling exactly as in the
connector implementation so both code paths behave identically.
---
Nitpick comments:
In `@core/wren/pyproject.toml`:
- Line 49: Update the dependency constraint for snowflake-connector-python to
add an upper bound to avoid unintentional breaking major upgrades: replace the
unbounded requirement "snowflake-connector-python[pandas]>=3.10" with a bounded
range such as "snowflake-connector-python[pandas]>=3.10,<5.0" (or another
appropriate major upper bound) so dependency resolution remains predictable.
In `@core/wren/src/wren/connector/snowflake.py`:
- Around line 50-67: The query method only wraps
snowflake.connector.errors.ProgrammingError; extend exception handling in query
to also catch other Snowflake exceptions (e.g.,
snowflake.connector.errors.DatabaseError, OperationalError, InterfaceError) and
wrap them in WrenError with an appropriate ErrorCode (e.g., ErrorCode.DATABASE
or ErrorCode.CONNECTION) and ErrorPhase (e.g., ErrorPhase.CONNECTION or
ErrorPhase.SQL_EXECUTION) while preserving the original exception as the cause;
keep the existing ProgrammingError handling as-is, and ensure each WrenError
includes metadata like {DIALECT_SQL: sql} so callers get consistent, informative
errors from query and don't leak uncaught connector exceptions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6054d1a4-d63a-46c8-ae4d-2d89abdad4ae
⛔ Files ignored due to path filters (1)
core/wren/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
core/wren/justfilecore/wren/pyproject.tomlcore/wren/src/wren/connector/factory.pycore/wren/src/wren/connector/snowflake.pycore/wren/src/wren/model/data_source.pycore/wren/tests/conftest.pycore/wren/tests/connectors/test_snowflake.py
Delegate to _make_snowflake_connection so the connection-param build logic stays in one place and statement_timeout is mapped to STATEMENT_TIMEOUT_IN_SECONDS session parameter on this code path too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
core/wren/src/wren/model/data_source.py (1)
403-406: ⚡ Quick winAvoid importing the private
_make_snowflake_connectionsymbol across modules.The leading underscore on
_make_snowflake_connectionsignals module-private API, yet it's being consumed from another module here. Either rename it to a public symbol (e.g.,make_snowflake_connection) inwren.connector.snowflakeand import that, or expose a small public factory wrapper that this method can call. This keeps the public/private contract honest and avoids future maintenance friction if the helper's signature is changed under the assumption it has no external callers.♻️ Proposed change at the call site (paired with renaming in
wren/connector/snowflake.py)`@staticmethod` def get_snowflake_connection(info: SnowflakeConnectionInfo): - from wren.connector.snowflake import _make_snowflake_connection # noqa: PLC0415 - - return _make_snowflake_connection(info) + from wren.connector.snowflake import make_snowflake_connection # noqa: PLC0415 + + return make_snowflake_connection(info)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/model/data_source.py` around lines 403 - 406, The call site is importing a private symbol _make_snowflake_connection from wren.connector.snowflake; either rename that private helper to a public API (e.g., make_snowflake_connection) in wren.connector.snowflake and update this function get_snowflake_connection to import and call make_snowflake_connection, or add a small public factory function (e.g., make_snowflake_connection) in wren.connector.snowflake that delegates to the underscored helper and import that here; update the import line in get_snowflake_connection to reference the new public symbol and run tests/linters to ensure no other modules rely on the private name.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@core/wren/src/wren/model/data_source.py`:
- Around line 403-406: The call site is importing a private symbol
_make_snowflake_connection from wren.connector.snowflake; either rename that
private helper to a public API (e.g., make_snowflake_connection) in
wren.connector.snowflake and update this function get_snowflake_connection to
import and call make_snowflake_connection, or add a small public factory
function (e.g., make_snowflake_connection) in wren.connector.snowflake that
delegates to the underscored helper and import that here; update the import line
in get_snowflake_connection to reference the new public symbol and run
tests/linters to ensure no other modules rely on the private name.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4427e161-eb74-4113-a390-33551183e934
📒 Files selected for processing (1)
core/wren/src/wren/model/data_source.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Responses to remaining CodeRabbit review-body items:
|
Ibis is no longer a load-bearing dependency for the bulk of connectors after the recent direct-connector refactors (canner/postgres/mysql/mssql/ trino/clickhouse/athena dropped ibis in #2313, snowflake in #2268); the intro now only credits Apache DataFusion as the engine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The snowflake connector now uses
snowflake-connector-pythonwith thepyarrow result format, eliminating the ibis layer.
Highlights
connector/snowflake.py(new):cursor.execute+cursor.fetch_arrow_allreturns a native PyArrow Table;
dry_runusescursor.describe(parses + plans without executing). Both password and key-pair
(
private_key) auth are supported.statement_timeoutfromkwargsis lifted intosession_parameters.STATEMENT_TIMEOUT_IN_SECONDS.connector/factory.py: snowflake is now routed towren.connector.snowflakeand removed from the ibis-needing set.model/data_source.py::get_snowflake_connection: returns asnowflake.connector.SnowflakeConnectioninstead of an ibis backend.pyproject.toml: snowflake extra ->snowflake-connector-python[pandas]>=3.10(dropsibis-framework[snowflake]).Tests
tests/connectors/test_snowflake.pycovers both password and key-pairauth code paths, statement-timeout extraction, arrow-result slicing,
None-result handling,
ProgrammingError->WrenErrormapping for bothqueryanddry_run, and idempotent close — all via mockedsnowflake.connector(no live account required).snowflakepytest marker andjust test-snowflakerecipe.Test plan
just test-snowflake-> 16 passedjust lint-> cleanSummary by CodeRabbit
New Features
Dependencies
Tests
Chores