Skip to content

refactor(snowflake): use snowflake-connector-python directly, drop ibis dependency#2268

Merged
douenergy merged 3 commits into
mainfrom
refactor/snowflake-native-driver
May 20, 2026
Merged

refactor(snowflake): use snowflake-connector-python directly, drop ibis dependency#2268
douenergy merged 3 commits into
mainfrom
refactor/snowflake-native-driver

Conversation

@goldmedal
Copy link
Copy Markdown
Collaborator

@goldmedal goldmedal commented May 14, 2026

The snowflake connector now uses snowflake-connector-python with the
pyarrow result format, eliminating the ibis layer.

Highlights

  • connector/snowflake.py (new): cursor.execute + cursor.fetch_arrow_all
    returns a native PyArrow Table; dry_run uses cursor.describe
    (parses + plans without executing). Both password and key-pair
    (private_key) auth are supported. statement_timeout from
    kwargs is lifted into session_parameters.STATEMENT_TIMEOUT_IN_SECONDS.
  • connector/factory.py: snowflake is now routed to
    wren.connector.snowflake and removed from the ibis-needing set.
  • model/data_source.py::get_snowflake_connection: returns a
    snowflake.connector.SnowflakeConnection instead of an ibis backend.
  • pyproject.toml: snowflake extra ->
    snowflake-connector-python[pandas]>=3.10 (drops ibis-framework[snowflake]).

Tests

  • tests/connectors/test_snowflake.py covers both password and key-pair
    auth code paths, statement-timeout extraction, arrow-result slicing,
    None-result handling, ProgrammingError -> WrenError mapping for both
    query and dry_run, and idempotent close — all via mocked
    snowflake.connector (no live account required).
  • New snowflake pytest marker and just test-snowflake recipe.

Test plan

  • just test-snowflake -> 16 passed
  • just lint -> clean

Summary by CodeRabbit

  • New Features

    • Native Snowflake connector with password & private-key auth, warehouse support, statement-timeout handling, query execution, dry-run, and safe close behavior.
  • Dependencies

    • Snowflake optional extra switched to the official Snowflake connector (pandas-enabled).
  • Tests

    • Added comprehensive Snowflake connector tests covering auth modes, query/dry-run behavior, error wrapping, and connection lifecycle.
  • Chores

    • Added a dedicated test recipe to run Snowflake connector tests.

Review Change Stack

…is dependency

The snowflake connector now uses snowflake-connector-python with the
pyarrow result format, eliminating the ibis layer.

Highlights
- `connector/snowflake.py`: cursor.execute + cursor.fetch_arrow_all
  returns a native PyArrow Table; `dry_run` uses cursor.describe.
- `model/data_source.py::get_snowflake_connection`: returns
  `snowflake.connector.SnowflakeConnection`; supports key-pair auth.
- `pyproject.toml`: snowflake extra ->
  `snowflake-connector-python[pandas]>=3.10`.

Tests
- `tests/connectors/test_snowflake.py` exercises both password and
  key-pair auth code paths via mocked connections.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code core labels May 14, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c240f3ff-4d47-4deb-91df-ce8180ac83c7

📥 Commits

Reviewing files that changed from the base of the PR and between 885ee90 and 6c50531.

📒 Files selected for processing (2)
  • core/wren/src/wren/connector/snowflake.py
  • core/wren/src/wren/model/data_source.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/wren/src/wren/model/data_source.py
  • core/wren/src/wren/connector/snowflake.py

Walkthrough

Replaces ibis Snowflake integration with a native snowflake-connector-python implementation: adds SnowflakeConnector, a make_snowflake_connection factory, updates registry and data-source wiring, adds a snowflake test recipe and marker, and adds comprehensive mocked unit tests.

Changes

Native Snowflake Connector

Layer / File(s) Summary
Build and test infrastructure setup
core/wren/pyproject.toml, core/wren/justfile, core/wren/tests/conftest.py
Project optional dependency changed to snowflake-connector-python[pandas]>=3.10, test-snowflake just recipe added, and snowflake pytest marker registered.
Native Snowflake connector implementation
core/wren/src/wren/connector/snowflake.py
Adds _build_connection_params (password vs private-key auth, optional warehouse, statement_timeoutsession_parameters.STATEMENT_TIMEOUT_IN_SECONDS), make_snowflake_connection, SnowflakeConnector (methods: __init__, query, dry_run, close) with ProgrammingError → WrenError wrapping, and create_connector.
Connector factory registration
core/wren/src/wren/connector/factory.py
Routes DataSource.snowflake to wren.connector.snowflake and removes it from _NEEDS_DATA_SOURCE, so create_connector is called with only connection_info.
Data source library integration
core/wren/src/wren/model/data_source.py
DataSourceExtension.get_snowflake_connection now delegates to wren.connector.snowflake.make_snowflake_connection instead of building params and calling ibis.snowflake.connect.
Snowflake connector test suite
core/wren/tests/connectors/test_snowflake.py
Adds mocked unit tests for connection parameter building (auth precedence, warehouse, statement_timeout mapping), connector initialization, query execution (Arrow results, None → empty table, limit slicing), error wrapping for query and dry-run, and idempotent close.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • yichieh-lu

Poem

🐰 A native Snowflake path so bright,
No ibis chains to hold you tight,
Connector speaks and arrows sing,
Errors wrapped, tests on the wing,
Hoppity code—now ready for flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main refactoring: replacing ibis-based Snowflake integration with direct snowflake-connector-python usage.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/snowflake-native-driver

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/wren/src/wren/model/data_source.py (1)

403-426: ⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Critical missing functionality: get_snowflake_connection lacks statement_timeout parameter handling.

The method at lines 403-426 duplicates connection parameter building logic from _build_connection_params in connector/snowflake.py, but critically omits the statement_timeout extraction (lines 32-38 in connector/snowflake.py). This parameter is converted to STATEMENT_TIMEOUT_IN_SECONDS in the Snowflake session_parameters, and its absence means statement_timeout will be silently ignored when connections are created through this code path.

Additionally, the private_key validation differs: this method checks truthiness (and info.private_key), while the connector version checks explicitly (and connection_info.private_key is not None), potentially handling empty strings differently.

Consolidate both implementations to use the same logic, or ensure get_snowflake_connection includes the statement_timeout session parameter handling.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/model/data_source.py` around lines 403 - 426, The
get_snowflake_connection function builds connection params but omits translating
statement_timeout into the Snowflake session parameter and uses a looser
private_key truthiness check; update get_snowflake_connection to mirror
_build_connection_params in connector/snowflake.py: check private_key with "is
not None" (not truthiness) when deciding between private_key vs password, and if
info.statement_timeout (or equivalent field) is present, add session_parameters
with STATEMENT_TIMEOUT_IN_SECONDS set to int(info.statement_timeout) (or convert
seconds appropriately) before calling snowflake.connector.connect; also preserve
merging of info.kwargs and include warehouse handling exactly as in the
connector implementation so both code paths behave identically.
🧹 Nitpick comments (2)
core/wren/src/wren/connector/snowflake.py (1)

50-67: 💤 Low value

Consider wrapping additional Snowflake exception types.

The query method only catches ProgrammingError (SQL syntax/semantic errors). Other exceptions such as DatabaseError, OperationalError, or InterfaceError (network issues, connection failures, etc.) will propagate unwrapped.

While this may be intentional to let infrastructure errors bubble up with full detail, consider whether wrapping these with appropriate ErrorCode values would improve error handling consistency.

Example: Broader exception handling
     def query(self, sql: str, limit: int | None = None) -> pa.Table:
         try:
             with self.connection.cursor() as cursor:
                 cursor.execute(sql)
                 arrow_table = cursor.fetch_arrow_all()
         except snowflake.connector.errors.ProgrammingError as e:
             raise WrenError(
                 ErrorCode.INVALID_SQL,
                 str(e),
                 phase=ErrorPhase.SQL_EXECUTION,
                 metadata={DIALECT_SQL: sql},
             ) from e
+        except snowflake.connector.errors.DatabaseError as e:
+            raise WrenError(
+                ErrorCode.SQL_EXECUTION,
+                str(e),
+                phase=ErrorPhase.SQL_EXECUTION,
+                metadata={DIALECT_SQL: sql},
+            ) from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/connector/snowflake.py` around lines 50 - 67, The query
method only wraps snowflake.connector.errors.ProgrammingError; extend exception
handling in query to also catch other Snowflake exceptions (e.g.,
snowflake.connector.errors.DatabaseError, OperationalError, InterfaceError) and
wrap them in WrenError with an appropriate ErrorCode (e.g., ErrorCode.DATABASE
or ErrorCode.CONNECTION) and ErrorPhase (e.g., ErrorPhase.CONNECTION or
ErrorPhase.SQL_EXECUTION) while preserving the original exception as the cause;
keep the existing ProgrammingError handling as-is, and ensure each WrenError
includes metadata like {DIALECT_SQL: sql} so callers get consistent, informative
errors from query and don't leak uncaught connector exceptions.
core/wren/pyproject.toml (1)

49-49: Consider specifying an upper bound for snowflake-connector-python to prevent unexpected major version upgrades.

The dependency snowflake-connector-python[pandas]>=3.10 is valid—version 3.10 and all versions through the current stable release (4.5.0) are stable with no known compatibility issues. However, the unbounded constraint >=3.10 allows for future major version jumps that could introduce breaking changes. Consider using a constraint like >=3.10,<5.0 for more predictable dependency resolution.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/pyproject.toml` at line 49, Update the dependency constraint for
snowflake-connector-python to add an upper bound to avoid unintentional breaking
major upgrades: replace the unbounded requirement
"snowflake-connector-python[pandas]>=3.10" with a bounded range such as
"snowflake-connector-python[pandas]>=3.10,<5.0" (or another appropriate major
upper bound) so dependency resolution remains predictable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren/src/wren/connector/snowflake.py`:
- Around line 24-25: The password check in the Snowflake connector can raise
AttributeError if SnowflakeConnectionInfo lacks a password attribute; update the
branch in the connector (the elif that currently tests connection_info.password
is not None and sets params["password"] in snowflake.py) to first verify
hasattr(connection_info, "password") and that it's not None before calling
get_secret_value(), mirroring how private_key and warehouse are guarded.
- Around line 32-38: The code currently does int(statement_timeout) in the
Snowflake connector without validating the input; update the block handling
statement_timeout in the function that builds params so that you validate or
safely convert the value before setting session_parameters: check that
statement_timeout is numeric (or attempt int() inside a try/except catching
ValueError/TypeError) and raise or log a clear error if conversion fails,
otherwise set session_parameters["STATEMENT_TIMEOUT_IN_SECONDS"] to the
validated integer; reference the existing variables statement_timeout, params,
and session_parameters when applying the fix.

---

Outside diff comments:
In `@core/wren/src/wren/model/data_source.py`:
- Around line 403-426: The get_snowflake_connection function builds connection
params but omits translating statement_timeout into the Snowflake session
parameter and uses a looser private_key truthiness check; update
get_snowflake_connection to mirror _build_connection_params in
connector/snowflake.py: check private_key with "is not None" (not truthiness)
when deciding between private_key vs password, and if info.statement_timeout (or
equivalent field) is present, add session_parameters with
STATEMENT_TIMEOUT_IN_SECONDS set to int(info.statement_timeout) (or convert
seconds appropriately) before calling snowflake.connector.connect; also preserve
merging of info.kwargs and include warehouse handling exactly as in the
connector implementation so both code paths behave identically.

---

Nitpick comments:
In `@core/wren/pyproject.toml`:
- Line 49: Update the dependency constraint for snowflake-connector-python to
add an upper bound to avoid unintentional breaking major upgrades: replace the
unbounded requirement "snowflake-connector-python[pandas]>=3.10" with a bounded
range such as "snowflake-connector-python[pandas]>=3.10,<5.0" (or another
appropriate major upper bound) so dependency resolution remains predictable.

In `@core/wren/src/wren/connector/snowflake.py`:
- Around line 50-67: The query method only wraps
snowflake.connector.errors.ProgrammingError; extend exception handling in query
to also catch other Snowflake exceptions (e.g.,
snowflake.connector.errors.DatabaseError, OperationalError, InterfaceError) and
wrap them in WrenError with an appropriate ErrorCode (e.g., ErrorCode.DATABASE
or ErrorCode.CONNECTION) and ErrorPhase (e.g., ErrorPhase.CONNECTION or
ErrorPhase.SQL_EXECUTION) while preserving the original exception as the cause;
keep the existing ProgrammingError handling as-is, and ensure each WrenError
includes metadata like {DIALECT_SQL: sql} so callers get consistent, informative
errors from query and don't leak uncaught connector exceptions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6054d1a4-d63a-46c8-ae4d-2d89abdad4ae

📥 Commits

Reviewing files that changed from the base of the PR and between 544ecab and 9f950e3.

⛔ Files ignored due to path filters (1)
  • core/wren/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • core/wren/justfile
  • core/wren/pyproject.toml
  • core/wren/src/wren/connector/factory.py
  • core/wren/src/wren/connector/snowflake.py
  • core/wren/src/wren/model/data_source.py
  • core/wren/tests/conftest.py
  • core/wren/tests/connectors/test_snowflake.py

Comment thread core/wren/src/wren/connector/snowflake.py
Comment thread core/wren/src/wren/connector/snowflake.py
Delegate to _make_snowflake_connection so the connection-param build
logic stays in one place and statement_timeout is mapped to
STATEMENT_TIMEOUT_IN_SECONDS session parameter on this code path too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
core/wren/src/wren/model/data_source.py (1)

403-406: ⚡ Quick win

Avoid importing the private _make_snowflake_connection symbol across modules.

The leading underscore on _make_snowflake_connection signals module-private API, yet it's being consumed from another module here. Either rename it to a public symbol (e.g., make_snowflake_connection) in wren.connector.snowflake and import that, or expose a small public factory wrapper that this method can call. This keeps the public/private contract honest and avoids future maintenance friction if the helper's signature is changed under the assumption it has no external callers.

♻️ Proposed change at the call site (paired with renaming in wren/connector/snowflake.py)
     `@staticmethod`
     def get_snowflake_connection(info: SnowflakeConnectionInfo):
-        from wren.connector.snowflake import _make_snowflake_connection  # noqa: PLC0415
-
-        return _make_snowflake_connection(info)
+        from wren.connector.snowflake import make_snowflake_connection  # noqa: PLC0415
+
+        return make_snowflake_connection(info)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/model/data_source.py` around lines 403 - 406, The call
site is importing a private symbol _make_snowflake_connection from
wren.connector.snowflake; either rename that private helper to a public API
(e.g., make_snowflake_connection) in wren.connector.snowflake and update this
function get_snowflake_connection to import and call make_snowflake_connection,
or add a small public factory function (e.g., make_snowflake_connection) in
wren.connector.snowflake that delegates to the underscored helper and import
that here; update the import line in get_snowflake_connection to reference the
new public symbol and run tests/linters to ensure no other modules rely on the
private name.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@core/wren/src/wren/model/data_source.py`:
- Around line 403-406: The call site is importing a private symbol
_make_snowflake_connection from wren.connector.snowflake; either rename that
private helper to a public API (e.g., make_snowflake_connection) in
wren.connector.snowflake and update this function get_snowflake_connection to
import and call make_snowflake_connection, or add a small public factory
function (e.g., make_snowflake_connection) in wren.connector.snowflake that
delegates to the underscored helper and import that here; update the import line
in get_snowflake_connection to reference the new public symbol and run
tests/linters to ensure no other modules rely on the private name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4427e161-eb74-4113-a390-33551183e934

📥 Commits

Reviewing files that changed from the base of the PR and between 9f950e3 and 885ee90.

📒 Files selected for processing (1)
  • core/wren/src/wren/model/data_source.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@goldmedal
Copy link
Copy Markdown
Collaborator Author

Responses to remaining CodeRabbit review-body items:

  • Duplicate connection-building logic (data_source.py:403-426): Addressed in 885ee90 (delegated to make_snowflake_connection) and 6c50531 (renamed factory to public).
  • Wrap more exception types (snowflake.py:50-67): Intentional. Connection/network errors are bubbled raw so callers can distinguish them from SQL errors; only ProgrammingError corresponds to user-supplied SQL.
  • Upper bound on snowflake-connector-python (pyproject.toml:49): Consistent with other unbounded extras in this file. We'd rather react to a real incompatibility than pin defensively.

@goldmedal goldmedal requested a review from douenergy May 14, 2026 05:49
@douenergy douenergy merged commit eaedc8d into main May 20, 2026
10 checks passed
goldmedal added a commit that referenced this pull request May 22, 2026
Ibis is no longer a load-bearing dependency for the bulk of connectors
after the recent direct-connector refactors (canner/postgres/mysql/mssql/
trino/clickhouse/athena dropped ibis in #2313, snowflake in #2268); the
intro now only credits Apache DataFusion as the engine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core dependencies Pull requests that update a dependency file python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants