Proposal
Update our Benchmarking and Verification systems to properly load Quantized models.
Motivation
At present, we skip all quantized models when doing verification or benchmarking, with the note "TransformerLens does not support quantized models at this time." This is not strictly true. The Benchmark/Verification systems don't support quantized model loading, but TransformerLens itself can run quantized models, as shown in the LLaMA Quantized demo notebook.
Pitch
We need main_benchmark and verify_models to correctly identify quantized models, check that their dependencies are installed, and load the models properly. If the required dependencies are missing, the tool should ideally exit and tell the user which dependencies to install before continuing. If the required dependencies are present, it should load the model the same way the LLaMA Quantized demo does and run every benchmark it can for that model.
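The identify-then-check flow described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the name markers used to spot a quantized model, and the package list (bitsandbytes, accelerate) are all hypothetical and not taken from main_benchmark, verify_models, or the demo notebook. The actual loading step is left out on purpose, since it should mirror whatever the LLaMA Quantized demo does.

```python
import importlib.util

# Assumption: packages a quantized load would need. The real list should be
# taken from the LLaMA Quantized demo notebook, not from this sketch.
QUANTIZATION_DEPENDENCIES = ("bitsandbytes", "accelerate")

# Assumption: substrings that mark a model name as quantized. A registry flag
# or the model's config would be a more reliable signal than the name.
QUANTIZED_NAME_MARKERS = ("4bit", "8bit", "gptq", "awq")


def is_quantized_model(model_name):
    """Heuristically decide whether a model name refers to a quantized model."""
    lowered = model_name.lower()
    return any(marker in lowered for marker in QUANTIZED_NAME_MARKERS)


def missing_dependencies(packages=QUANTIZATION_DEPENDENCIES):
    """Return the top-level packages from `packages` that cannot be imported."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def check_quantized_support(model_name, packages=QUANTIZATION_DEPENDENCIES):
    """Exit with an actionable message if a quantized model lacks dependencies.

    Does nothing for non-quantized models or when every package is installed.
    """
    if not is_quantized_model(model_name):
        return
    missing = missing_dependencies(packages)
    if missing:
        raise SystemExit(
            f"Cannot benchmark quantized model '{model_name}': "
            f"missing dependencies: {', '.join(missing)}. "
            f"Install them (e.g. pip install {' '.join(missing)}) and re-run."
        )
```

A benchmark or verification run would call check_quantized_support(model_name) before attempting to load each model, so a missing package stops the run with a clear message instead of silently skipping the model.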
Alternatives
As a stop-gap alternative, we should at a minimum update the note for quantized models to say "TransformerLens cannot benchmark quantized models at this time."
Checklist