[BEAM-9146] Integrate GCP Video Intelligence functionality for Python SDK #10764
|
Thank you @EDjur. I was hoping that we can implement most of the functionality in tfx_bsl, and add cloud AI services by defining new endpoints. We had a discussion on dev list [1] related to this. |
|
I see. I signed up to the dev mailing list just last week so must've missed that discussion. What does that mean in regards to this ticket? I agree that it would be nice to avoid duplicated behaviour across libraries. |
|
@EDjur - Sorry for the miscommunication. Decisions like this should have been reflected in JIRA as well. I feel bad that you spent time on this and we are changing direction after your PR is out. I think what it means for this ticket is that:
@kamilwu may have other thoughts, since he was also working on this. I would like to hear his opinion. For this PR, let's try to re-use as much as possible. |
|
Thanks @EDjur.
The Video Intelligence API is quite different. We should treat it as a fully managed machine learning service. There is no machine learning model that could be trained and deployed. Instead, videos are annotated using a pre-trained model provided by Google. All a user has to do is enable the API and make a request (using the REST API or a client library). The only input is a GCS path to a video file. As a result, there is little behavior that could be shared. If you have any questions regarding tfx_bsl and how it connects with Beam, just let me know, as I'm currently working to provide support for remote inference (an ability to call into services) in tfx_bsl. |
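To make the "enable the API and make a request" point concrete, here is a minimal sketch of what a REST request for label detection might look like. It only builds the request and does not send it. The bucket path and the access token are placeholders, and the helper name `build_annotate_request` is made up for illustration.

```python
import json
import urllib.request

# Public REST endpoint for starting a video annotation job.
ANNOTATE_URL = "https://videointelligence.googleapis.com/v1/videos:annotate"


def build_annotate_request(input_uri, features, access_token):
    """Build (but do not send) a videos:annotate request.

    Hypothetical helper for illustration; the token is assumed to be an
    OAuth 2.0 access token obtained elsewhere.
    """
    body = json.dumps({"inputUri": input_uri, "features": list(features)})
    return urllib.request.Request(
        ANNOTATE_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_annotate_request(
    "gs://my-bucket/my-video.mp4", ["LABEL_DETECTION"], "ACCESS_TOKEN")
```

Passing `request` to `urllib.request.urlopen` would submit the job. The service then returns a long-running operation that has to be polled for the annotation results, which is why there is no model lifecycle to share with tfx_bsl.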
|
@EDjur Because you've added a new dependency, we have to specify it in setup.py file ( |
|
Thank you @kamilwu for the comment. That makes sense to me. Related question, what is a good location for this family of transforms? |
I agree |
…nnotate_video. Added requirement in setup.py
|
Thanks for the feedback! I've adjusted the code to conform to py23 standards and added the extra arguments to annotate_video. I also added the |
|
Got one question regarding the naming. Is |
I think it is a good name; it reflects the underlying service as it is. It is not very different from the other GCP io modules whose names mirror the GCP services.
aaltay
left a comment
LGTM. I will wait for @kamilwu to complete his review.
|
LGTM. |
I started work on https://issues.apache.org/jira/browse/BEAM-9247 to integrate the vision API in a similar PTransform. However, here the naming gets more complex. Consider e.g. One solution for the What's your take? @aaltay |
I do not have a good idea. Users could solve this with something like the following: So, maybe it is not a big issue. I agree, it would be good if we can avoid this state. Related, in the io folder we have modules named We can take the io example and change the names by appending ml, your example will look like: What do you think? |
|
retest this please |
|
retest this please |
|
@EDjur unfortunately your trigger phrases won't run the tests; there are Jenkins restrictions that won't let you do it |
Gotcha. Thanks for triggering them! |
|
retest this please |
|
I'm having a small discussion regarding the Perhaps this is something we should look at integrating into this PR before merging (or after, considering the use case seems small). |
|
retest this please |
|
Thanks for retesting! Test failures look unrelated to this PR. |
|
Test error seems to be related to: https://github.com/apache/beam/pull/10856/files |
…context as part of element tuple
|
retest this please |
|
retest this please |
|
retest this please |
|
retest this please |
|
:sdks:python:test-suites:tox:pycommon:docs task is failing -- There might be pydocs issues in the change. |
|
Seems yapf doesn't format docstrings 🤷‍♂️ Should be fixed now. |
|
retest this please |
|
@EDjur -- Thank you, merged this. Could we update 'google-cloud-videointelligence>=1.8.0,<=1.12.1' to allow the recently released 1.13.0 version as well? Do we need anything special? |
|
Looks fine to update! The 1.13.0 release essentially added two new enum values to types.Feature, but since these are specified by the user, all should be well. I also manually reran all the tests on version 1.13.0 without mocking the videointelligence client and they ran as expected. Cheers! |
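As a rough illustration of why the pin has to change before 1.13.0 can be installed, the sketch below compares dotted version strings against an inclusive range. It is a simplification: real tools follow PEP 440 rather than plain tuple comparison, the helper names are made up, and the new upper bound of 1.13.0 is an assumption rather than the value that was actually committed.

```python
def parse_version(version):
    """Turn '1.12.1' into (1, 12, 1). Simplified: numeric parts only."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version <= upper (inclusive bounds)."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


# The pin quoted above excludes the new release...
allowed_by_old_pin = in_range("1.13.0", "1.8.0", "1.12.1")
# ...while a widened upper bound (assumed here) admits it.
allowed_by_new_pin = in_range("1.13.0", "1.8.0", "1.13.0")
```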
This PR relates to https://issues.apache.org/jira/browse/BEAM-9146 and integrates GCP Video Intelligence functionality as part of a PTransform that implicitly accepts a PCollection of GCS URIs or raw bytes.
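One way to picture the "GCS URIs or raw bytes" behaviour is a small dispatch on the element type. The sketch below is not the code from this PR. It assumes the keyword names `input_uri` and `input_content` accepted by `annotate_video` in the 1.x `google-cloud-videointelligence` client, uses plain strings in place of the library's feature enum, and the helper name `build_annotate_kwargs` is hypothetical.

```python
def build_annotate_kwargs(element, features):
    """Map one pipeline element to keyword arguments for annotate_video.

    Hypothetical helper: str elements are treated as GCS URIs and
    bytes elements as raw video content.
    """
    if isinstance(element, bytes):
        return {"input_content": element, "features": features}
    if isinstance(element, str):
        return {"input_uri": element, "features": features}
    raise TypeError(
        "Expected a GCS URI (str) or raw video bytes, got %s"
        % type(element).__name__)


# Strings stand in for the client library's feature enum values.
features = ["LABEL_DETECTION"]
uri_kwargs = build_annotate_kwargs("gs://my-bucket/my-video.mp4", features)
bytes_kwargs = build_annotate_kwargs(b"\x00\x01raw-video-bytes", features)
```

Inside a transform, a per-element function could then call something like `client.annotate_video(**kwargs)` and emit the result, so users never have to say which kind of input they are passing.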