Skip to content

Conversation

@Joffreybvn
Copy link
Contributor

This PR implements support for fetchone() in the ODBCHook and the Databricks SQL Hook. Fixing the bug described in #32319 (comment). It also refactors and improves a bit the docstring.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member

potiuk commented Dec 11, 2023

The static check handled in #36162

"""Transform the databricks Row objects into a JSON-serializable list of rows."""
if result is not None:
"""Transform the databricks Row objects into JSON-serializable lists."""
if isinstance(result, list):
Copy link
Contributor

@utkarsharma2 utkarsharma2 Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statement = f"DESCRIBE TABLE {table.name};"
hook.run(statement, parameters=parameters, handler=lambda x: x.fetchall())

with this code hook.run() will return [[val1, val2, val3],[val1, val2, val3], [val1, val2, val3]] instead of [row(), row(), row()] isn't this a breaking change??

Copy link
Contributor

@utkarsharma2 utkarsharma2 Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible solution is to inherit from row class and introduce deserialize and serialize method that way we can deal with this in a similar way as dataframes.

@potiuk
Copy link
Member

potiuk commented Dec 13, 2023

@Joffreybvn -> what's your take ?

@Joffreybvn
Copy link
Contributor Author

Joffreybvn commented Dec 13, 2023

Just to clarify, the support for fetchall - with the breaking change - was implemented in #32319 .

But yes, it is a breaking change. For ODBC, the Row became a NamedTuple - which is a very 'soft' breaking change. For Databricks, there is no NamedTuple because the 'make_serializable' method from #31780, which was turning Row into tuples already, was simply moved down from the Operator into the Hook.

Thus, I can make it a NamedTuple too. Adding serialize and deserialize was also discussed in the first PR, and led to this ADR. If Databricks wants to comply with it, it should return a tuple / namedtuple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants