feat: Add Bitsandbytes quantization for transformer backend #1775

@fakezeta

Description

Is your feature request related to a problem? Please describe.

Quantization is not available for the transformer backend.
Describe the solution you'd like

Add bitsandbytes 4-bit quantization, triggered by the user with the `low_vram` flag in the model definition.
Additionally, I propose using the `f16` flag to change the `compute_dtype` to `bfloat16` for better performance on Nvidia cards.
Describe alternatives you've considered

Additional context

I've implemented this while fixing #1774.
This issue is opened for tracking.

Labels: enhancement (New feature or request)