If you are submitting a bug report, please fill in the following details and use the tag [bug].
Describe the bug
When calling HookedTransformer.generate() on a model that was initialized without a tokenizer, with the default use_past_kv_cache=True, I get the error 'NoneType' object has no attribute 'pad_token_id'. I believe this is because, when use_past_kv_cache=True, the library needs to determine the pad tokens from the input, so it assumes model.tokenizer is set and reads the pad token id from it. However, in my use case, I am not using a tokenizer for my model. Maybe there could be a way to manually supply the pad_token_id in this case?
Code example
from transformer_lens import HookedTransformer, HookedTransformerConfig
import torch as t
cfg = HookedTransformerConfig(
    n_layers=1,
    n_heads=1,
    d_model=4,
    d_vocab=4,
    n_ctx=4,
    d_head=2,
    act_fn='relu',
)
model = HookedTransformer(cfg)
tokens = t.tensor([[0, 1, 2, 3], [0, 1, 2, 3]], dtype=t.long)
model.generate(tokens, eos_token_id=0)
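A minimal sketch of how the suggested override could look. This is a hypothetical helper (resolve_pad_token_id is not part of the TransformerLens API): generate() would accept an explicit pad_token_id argument and fall back to the tokenizer only when one isn't supplied.

```python
def resolve_pad_token_id(tokenizer, pad_token_id=None):
    """Return a usable pad token id for KV-cache padding.

    Hypothetical sketch: an explicit pad_token_id takes priority, the
    model's tokenizer is the fallback, and a clear error replaces the
    current AttributeError when neither is available.
    """
    if pad_token_id is not None:
        return pad_token_id
    if tokenizer is not None:
        return tokenizer.pad_token_id
    raise ValueError(
        "generate() with use_past_kv_cache=True needs a pad token id; "
        "pass pad_token_id= when the model has no tokenizer."
    )
```

With this shape, the example above could call model.generate(tokens, eos_token_id=0, pad_token_id=0) instead of failing.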
System Info
Describe the characteristics of your environment:
Linux
transformer_lens version 1.12.0
Additional context
Add any other context about the problem here.
Checklist