Update Model Card for Mamba #37863
Conversation
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button.
stevhliu
left a comment
Thanks!
Maybe let's also add a quantization example with https://huggingface.co/state-spaces/mamba-2.8b-hf.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

# int4 weight-only quantization via torchao
quantization_config = TorchAoConfig("int4_weight_only", group_size=128)

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-2.8b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "state-spaces/mamba-2.8b-hf",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    device_map="auto",
)

input_ids = tokenizer("Plants create energy through a process known as", return_tensors="pt").to("cuda")
output = model.generate(**input_ids)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
I've added this example and updated it to use the new AOBaseConfig-based approach, since the string-based approach above is deprecated, as mentioned in the docs. Could you please review the changes? cc: @stevhliu
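For reference, a minimal sketch of what the updated example could look like under the AOBaseConfig-based approach. It assumes the Int4WeightOnlyConfig class from torchao.quantization and that TorchAoConfig accepts such a config object via its quant_type argument, per the torchao quantization docs; check the exact names against your installed torchao and transformers versions.

import torch
from torchao.quantization import Int4WeightOnlyConfig  # assumed import path, per torchao docs
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

# Build an AOBaseConfig object instead of passing a deprecated string shortcut
quant_config = Int4WeightOnlyConfig(group_size=128)
quantization_config = TorchAoConfig(quant_type=quant_config)

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-2.8b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "state-spaces/mamba-2.8b-hf",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    device_map="auto",
)

input_ids = tokenizer("Plants create energy through a process known as", return_tensors="pt").to("cuda")
output = model.generate(**input_ids)
print(tokenizer.decode(output[0], skip_special_tokens=True))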
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
As described in the issue, this PR updates the model card for Mamba. Please let me know if any modifications are required and I will make the necessary changes.
Refs #36979
Before submitting
Who can review?
@stevhliu