[Proposal] Update Benchmarks & Verify Models to support Quantized models #1275

@jlarson4

Description

Proposal

Update our Benchmarking and Verification systems to properly load Quantized models.

Motivation

At present, we skip all quantized models when doing verification or benchmarking, with the note "TransformerLens does not support quantized models at this time." This is not strictly true. The benchmark/verification systems don't support quantized model loading, but TransformerLens itself can run them, as shown in the LLaMA Quantized demo notebook.

Pitch

We need main_benchmark & verify_models to properly identify quantized models, ensure their dependencies are installed, and load the models correctly. If the required dependencies are missing, the tools should ideally exit and tell the user which dependencies to install before continuing. If the dependencies are present, the tools should load quantized models the same way the LLaMA Quantized demo does and run whichever benchmarks apply to that model.
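The detection and dependency-check steps described above could be gated with a small helper along these lines. This is a minimal sketch, not existing TransformerLens code: the helper names, the name-based quantization markers, and the exact dependency list (`bitsandbytes`, `accelerate`, as used in the LLaMA Quantized demo) are all assumptions.

```python
import importlib.util

# Assumed naming convention for spotting quantized checkpoints;
# the real implementation might read model config metadata instead.
QUANTIZED_MARKERS = ("gptq", "awq", "4bit", "8bit")


def is_quantized(model_name: str) -> bool:
    """Guess from the model name whether a checkpoint is quantized (heuristic)."""
    lowered = model_name.lower()
    return any(marker in lowered for marker in QUANTIZED_MARKERS)


def missing_quantization_deps() -> list[str]:
    """Return the packages needed for quantized loading that are not installed."""
    # Dependency list is an assumption based on the LLaMA Quantized demo.
    required = ("bitsandbytes", "accelerate")
    return [pkg for pkg in required if importlib.util.find_spec(pkg) is None]


def check_or_exit(model_name: str) -> None:
    """Exit with an actionable message if quantized deps are missing."""
    if not is_quantized(model_name):
        return
    missing = missing_quantization_deps()
    if missing:
        raise SystemExit(
            f"{model_name} is quantized; install {', '.join(missing)} "
            "before benchmarking (see the LLaMA Quantized demo notebook)."
        )
```

If the check passes, the benchmark would then load the model following the LLaMA Quantized demo's approach rather than the default loading path.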

Alternatives

As an alternative stop-gap, we should at a minimum update the note for quantized models to say "TransformerLens cannot benchmark quantized models at this time."

Checklist

  • I have checked that there is no similar issue in the repo (required)

Metadata


Labels

  • complexity-moderate: Moderately complicated issues for people who have intermediate experience with the code
  • enhancement: New feature or request
  • help wanted: Extra attention is needed
  • minor: Release a minor version
