Add BridgeTower model #20775
Add bridgetower.pr review8
Fixes for BridgeTowerVisionEmbeddings
NielsRogge
left a comment
Thanks a lot for working on this and addressing all comments!
There are still 2 comments which seem to be unaddressed, after that good for me to merge.
|
@NielsRogge Our PR keeps failing at tests/pipelines/test_pipelines_automatic_speech_recognition.py::AutomaticSpeechRecognitionPipelineTests::test_return_timestamps_in_preprocess. Could you please help us check whether this is caused by BridgeTower or by something else? |
Synchronize
…add_bridgetower_model
Synchronize with HF
…add_bridgetower_model
|
@abhiwand @tileintel Thanks for addressing all of the comments! On Monday there were two PRs merged into main which added |
…cessing_common.py
|
@amyeroberts We have updated test_image_processing_bridgetower.py as you suggested. Thanks for the suggestion. |
|
Thanks again for your contribution! |
|
@sgugger Thank you for merging this PR. May I ask when the BridgeTower model will go into HuggingFace's production, and which release that will be? |
|
The next release will be in roughly a month (given that the last release was yesterday). |
|
Thank you @sgugger for letting us know.
What does this PR do?
This PR implements a HuggingFace Transformers version of BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning from the paper https://arxiv.org/abs/2206.08657.pdf
This paper has been accepted to https://aaai.org/Conferences/AAAI-23/
The model's pre-trained checkpoints and configurations have been released here:
https://huggingface.co/BridgeTower under:
The following heads have been implemented:
@amyeroberts @NielsRogge @ArthurZucker could you please assist with review and feedback.
@philschmid