[Bug Report] "unsloth/llama-3.2-3b-instruct" with 2x 3060 throws RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cuda:1) #968

@jasonrichdarmawan

Description

If you are submitting a bug report, please fill in the following details and use the tag [bug].

Describe the bug
My requirement is to load unsloth/llama-3.2-3b-instruct (a drop-in replacement for meta-llama/llama-3.2-3b-instruct, since my access request was rejected) from a local folder, because I keep the model weights on an NFS share. When the model is split across two GPUs with n_devices=2, calling model.generate throws RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cuda:1).
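
For context, the weights were placed in that local folder ahead of time. Below is a minimal sketch of that one-off download step, assuming huggingface_hub was used (the actual download method is an assumption on my part; the target path matches the code example further down):

# Hypothetical one-off download step (not part of the bug itself):
# snapshot_download pulls the unsloth repo into the NFS folder used below.
from huggingface_hub import snapshot_download

snapshot_download(
  repo_id="unsloth/Llama-3.2-3B-Instruct",
  local_dir="/media/tao/disk4T/jason/transformers/unsloth/Llama-3.2-3B-Instruct",
)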

Code example

# %%

import os

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

model_name = "Llama-3.2-3B-Instruct"
print(f"Loading {model_name} model")

# The model weights live on a local NFS mount rather than the HF cache.
model_path = os.path.join(
  "/media/tao/disk4T/jason/transformers",
  f"unsloth/{model_name}",
)
hf_model = AutoModelForCausalLM.from_pretrained(
  model_path,
)
tokenizer = AutoTokenizer.from_pretrained(
  model_path,
)

# %%

# Wrap the locally loaded weights in TransformerLens under the official
# model name, splitting the layers across two GPUs.
model = HookedTransformer.from_pretrained(
  model_name=f"meta-llama/{model_name}",
  hf_model=hf_model,
  tokenizer=tokenizer,
  device="cuda",
  n_devices=2,
)

model.generate("hello world", max_new_tokens=200, do_sample=True, temperature=0.3)

System Info
Describe the characteristics of your environment:

  • Describe how transformer_lens was installed: pip
  • What OS are you using? Linux
  • Python version: 3.11
  • GPUs: 2x NVIDIA RTX 3060

Additional context
Add any other context about the problem here.

Checklist

  • I have checked that there is no similar issue in the repo (required)
