Skip to content

Conversation

@VladaZakharova
Copy link
Contributor


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member

potiuk commented Oct 23, 2022

Needs conflict resolving after string normalization in providers.

@quentin-sommer
Copy link
Contributor

quentin-sommer commented Dec 7, 2022

Hi, I tracked an issue I have to this pr:
I'm using airflow-2.3.4

def _check_schema_fields(self, table_resource):

(In airflow/providers/google/cloud/transfers/gcs_to_bigquery.py)

is executed and try to infer a schema on my newline delimited json file. It obviously doesn't work, I end up with column names containing {".

I'm trying to understand why the function is called on all the data types without checking if source_format is 'CSV'. Maybe I'm just using it wrong? Hope you can help

I could provide the schema_fields every time to make the operator bypass the check but it was convenient to let bigquery detect the schema.

ping @VladaZakharova and @potiuk

@VladaZakharova
Copy link
Contributor Author

@quentin-sommer
Hi Team,
I am working on preparing a PR with a fix for this issue and will share all the updates.

@VladaZakharova
Copy link
Contributor Author

@quentin-sommer
Hi Team :)
Please, check the PR that I have created for the fix to this issue:
#28284

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants