-
Notifications
You must be signed in to change notification settings - Fork 688
feat(bigquery): move the bigquery backend back into the main ibis repo #4797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
362feca to
0a44e88
Compare
4d93724 to
9ad71ad
Compare
a9f1703 to
c5b2f6f
Compare
3a634f0 to
58a8347
Compare
|
@cpcloud why do we view this as a good thing? |
|
@jreback Good question, thanks for bringing it up. The primary reason is to prevent the maintenance burden that comes along with a separate repo. I give a more detailed answer to your question here (ibis-project/ibis-bigquery#151). In short, many of the things we thought would be good about having a separate repo in practice increase maintenance work or have a negligible effect on the amount of maintenance work. |
|
@cpcloud sure is this a general change in policy though? or a specific one off for BQ? eg what about a lot of the other google variants or mssql for example |
d834c60 to
bd24e9f
Compare
e9d38e8 to
d0d7624
Compare
f2a7c12 to
bdd83fc
Compare
|
@tswast Friendly ping! Any thoughts on this PR? |
bdd83fc to
c2d767c
Compare
Codecov Report
@@ Coverage Diff @@
## master #4797 +/- ##
==========================================
- Coverage 92.88% 87.73% -5.16%
==========================================
Files 192 204 +12
Lines 21731 22830 +1099
Branches 3011 3124 +113
==========================================
- Hits 20185 20030 -155
- Misses 1129 2389 +1260
+ Partials 417 411 -6
|
9891fcf to
659ba53
Compare
tswast
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BQ changes LGTM. I like the "snapshot" structure in the tests.
c6dc72c to
60eab8f
Compare
| def fetch_from_cursor(self, cursor, schema): | ||
| query = cursor.query | ||
| df = query.to_arrow().to_pandas(timestamp_as_object=True) | ||
| query_result = query.result() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tswast Can you take a look at this block of code and say whether this is expected behavior?
The use case is reading from bigquery-public-data.hacker_news.comments, but having ibis-gbq be the billing project.
Without this workaround, the storage API creates a read session in the data project (bigquery-public-data), which causes queries to fail when using the pyarrow functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't expect this to be necessary. If the query succeeded, then I assume the billing project is being set correctly in the client constructor
ibis/ibis/backends/bigquery/__init__.py
Line 169 in 01bd402
| project=new_backend.billing_project, |
ibis/ibis/backends/bigquery/__init__.py
Line 236 in 01bd402
| stmt, job_config=job_config, project=self.billing_project |
I've filed googleapis/python-bigquery#1422 to investigate this further, but I think it's fine to keep this workaround if there really is a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I think there really is a bug. The project from "client" is used instead of the project from the QueryJob.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay. Thanks Tim. I'll keep this comment unresolved so the link is easier to find.
a9a2568 to
747e808
Compare
|
Ok, I'm going to merge this in and fix any issues with the CI. Thanks all for the help reviewing, great to see this back in the main repo! |
747e808 to
8d75e33
Compare
8d75e33 to
4c16755
Compare
This PR moves bigquery back into the main ibis repo.
Still working through the failing tests, though many are fixed.Tests are passing.TODOs:
Possible follow ups: