@gopidesupavan
Member

This PR adds the Amazon Comprehend document classifier: docs, operator, sensor, trigger, waiter, unit tests, and a system test.

Manually tested in Breeze with:

- wait_for_completion=False with a Sensor
- deferrable=True
- wait_for_completion=True

For the system test, I used two documents from AWS samples and created multiple copies of each, since the classifier requires a minimum of 10 training documents per label. With this limited number of labels and documents, I've observed that training the classifier takes at most 10 to 15 minutes. This is the minimum setup I was able to get running, so it can be executed in the daily system test suite.
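
The copying step can be sketched as below. This is only an illustration under stated assumptions, not the system test's code: `replicate_samples`, the label names, the file names, and the `(label, file name)` row layout are all hypothetical. Only the minimum of 10 documents per label comes from the description above.

```python
import shutil
import tempfile
from pathlib import Path

MIN_DOCS_PER_LABEL = 10  # minimum per label, as stated in the PR description


def replicate_samples(samples, out_dir, copies=MIN_DOCS_PER_LABEL):
    """Copy each labelled sample `copies` times and return (label, file name) rows.

    `samples` maps a label to the path of its single source document.
    Hypothetical helper; the row layout is an assumption for illustration.
    """
    if copies < MIN_DOCS_PER_LABEL:
        raise ValueError(f"need at least {MIN_DOCS_PER_LABEL} documents per label")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for label, source in samples.items():
        source = Path(source)
        for index in range(copies):
            # Give every copy a distinct name so each counts as its own document.
            target = out_dir / f"{source.stem}_{index}{source.suffix}"
            shutil.copyfile(source, target)
            rows.append((label, target.name))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        # Two placeholder source documents, one per (hypothetical) label.
        first = tmp_path / "sample_a.txt"
        second = tmp_path / "sample_b.txt"
        first.write_text("first sample document")
        second.write_text("second sample document")
        rows = replicate_samples({"label_a": first, "label_b": second}, tmp_path / "train")
        print(len(rows))  # 2 labels x 10 copies
```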

@gopidesupavan changed the title from "Add comprehend document classifier" to "Add Amazon Comprehend Document Classifier" on Jun 17, 2024
Contributor

@vincbeck left a comment

Just 2 comments but overall fantastic work!!

@gopidesupavan
Member Author

> Just 2 comments but overall fantastic work!!

Thank you 😄
