[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor #10594
Conversation
```diff
 class BatchFeature(UserDict):
     r"""
-    Holds the output of the :meth:`~transformers.PreTrainedFeatureExtractor.pad` and feature extractor specific
+    Holds the output of the :meth:`~transformers.PreTrainedSequenceFeatureExtractor.pad` and feature extractor specific
```
@NielsRogge, here you can just overwrite the docstring so that it reads:
of the :meth:`....` or the :meth:`~transformers.PreTrainedImageFeatureExtractor.pad`
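For context, here is a minimal sketch of what a `UserDict`-based `BatchFeature` looks like. This is illustrative only, not the actual transformers implementation (which adds tensor conversion, device movement, and more):

```python
from collections import UserDict


class BatchFeature(UserDict):
    """Minimal sketch: a dict-like container for feature extractor outputs.

    Subclassing UserDict means the object behaves like a plain dict of
    model inputs while remaining easy to extend with extra methods.
    """


# behaves like a plain dict of model inputs
batch = BatchFeature({"input_values": [[0.1, 0.2, 0.3]]})
```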
```diff
     Examples::

-        # We can't instantiate directly the base class `PreTrainedFeatureExtractor` so let's show the examples on a
+        # We can't instantiate directly the base class `PreTrainedSequenceFeatureExtractor` so let's show the examples on a
```
Examples for ImageFeatureExtractor saving/loading can be appended to the Examples here
LysandreJik
left a comment
Looks good! I like the API. Something I would change to streamline it with our tokenizers is to have the `pad` and `_pad` methods defined in the superclass, but raise a `NotImplementedError` if they're not implemented, similarly to `tokenize()` in the tokenization utils base.
Also, here you imported `PaddingStrategy`, which is great so as not to duplicate existing objects; how would we manage this for vision models, since the padding/cropping strategies would probably be different (i.e., largest instead of longest, as @NielsRogge was mentioning)?
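The suggested superclass pattern could look roughly like this. This is a sketch with illustrative class names and a toy `pad` implementation, not the actual transformers code:

```python
class FeatureExtractionMixin:
    """Hypothetical base class: `pad` exists on the superclass but
    raises until a subclass provides an implementation, mirroring how
    `tokenize()` behaves in the tokenization utils base."""

    def pad(self, processed_features, padding=True):
        raise NotImplementedError(
            f"`pad` is not implemented for {self.__class__.__name__}"
        )


class SequenceFeatureExtractor(FeatureExtractionMixin):
    """Toy subclass providing a trivial `pad` implementation."""

    def pad(self, processed_features, padding=True):
        # pad every feature list in the dict to the longest length with zeros
        max_len = max(len(v) for v in processed_features.values())
        return {
            k: v + [0] * (max_len - len(v))
            for k, v in processed_features.items()
        }
```

With this layout, calling `pad` on the base class fails loudly, while any concrete subclass simply overrides it.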
sgugger
left a comment
Looks good to me, thanks for working on this!
As I said offline, I don't like the very long names for the new classes and the module names so we should strive to find something easier.
…ggingface#10594)

* save first version
* finish refactor
* finish refactor
* correct naming
* correct naming
* shorter names
* Update src/transformers/feature_extraction_common_utils.py

  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change name
* finish

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
What does this PR do?
This PR refactors the class `PreTrainedFeatureExtractor`. The following changes are done to move functionality that is shared between sequence and image feature extractors into a separate file. This should unblock the PRs of DETR, VIT, and CLIP:

* `PreTrainedFeatureExtractor` is renamed to `PreTrainedSequenceFeatureExtractor` because it implicitly assumed that it will treat only sequential inputs (a.k.a. a sequence of float values or a sequence of float vectors). `PreTrainedFeatureExtractor` was too general.
* The saving/loading functionality that is shared between sequence and image feature extractors is moved into `FeatureExtractorSavingUtilsMixin`.
* `BatchFeature` is moved from `feature_extraction_sequence_utils.py` to `feature_extraction_common_utils.py`, to be used by the `PreTrainedImageFeatureExtractor` class as well.

The following things were assumed before applying the changes:
* The vision model feature extractor classes are assumed to be called `VITFeatureExtractor`, .... IMO, feature extractor is also a fitting name for image recognition (see: https://en.wikipedia.org/wiki/Feature_extraction), so it is assumed that for image-text or image-only models there will be a `PreTrainedImageFeatureExtractor` and a `VITFeatureExtractor` (and maybe a `VITTokenizer` & `VITProcessor` as well, but not necessarily). For vision-text models that do require both a tokenizer and a feature extractor, such as CLIP, it is assumed that the classes `CLIPFeatureExtractor` and `CLIPTokenizer` are wrapped into a `CLIPProcessor` class, similar to `Wav2Vec2Processor`. I think this is the most important assumption taken here, so we should make sure we are on the same page @LysandreJik @sgugger @patil-suraj @NielsRogge.
* It is assumed that image feature extractors do not need a dedicated batch class, such as `BatchImageFeature` or `BatchImage`, but can just use `BatchFeature`. From looking at the code in @NielsRogge's PR here: Add Vision Transformer + ViTFeatureExtractor #10513, this seems to be the case.

Backwards compatibility:
The class `PreTrainedFeatureExtractor` was accessible via the top-level `transformers` import, but is now replaced by `PreTrainedSequenceFeatureExtractor`. However, since `PreTrainedFeatureExtractor` so far was only available on master, this change is OK IMO.
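For illustration, the save/load split into a shared mixin described above could be sketched roughly as follows. This is toy code with a hypothetical `ToySequenceFeatureExtractor`; it only shows a JSON config round trip, not the actual transformers implementation, and the `preprocessor_config.json` filename is an assumption:

```python
import json
import os


class FeatureExtractorSavingUtilsMixin:
    """Sketch of save/load behaviour shared by sequence and image
    feature extractors: serialize attributes to a JSON config file."""

    def save_pretrained(self, save_directory):
        # write all instance attributes to a JSON config in the directory
        os.makedirs(save_directory, exist_ok=True)
        path = os.path.join(save_directory, "preprocessor_config.json")
        with open(path, "w") as f:
            json.dump(vars(self), f)

    @classmethod
    def from_pretrained(cls, save_directory):
        # rebuild the extractor from the stored config
        path = os.path.join(save_directory, "preprocessor_config.json")
        with open(path) as f:
            config = json.load(f)
        return cls(**config)


class ToySequenceFeatureExtractor(FeatureExtractorSavingUtilsMixin):
    """Hypothetical concrete extractor: only holds its config values."""

    def __init__(self, sampling_rate=16000, padding_value=0.0):
        self.sampling_rate = sampling_rate
        self.padding_value = padding_value
```

Because the mixin only touches generic attributes, an image feature extractor could reuse the exact same `save_pretrained`/`from_pretrained` logic.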