I am fine-tuning the ModernBERT model for a classification task and now need to quantize and convert it to ONNX. I tried using the Hugging Face Optimum library, but it does not currently support ModernBERT.
I noticed that quantized ONNX models are available in ModernBERT's Hugging Face repository. Could you please share the code or steps used to quantize these models and convert them to ONNX?