Question
I have trained a model with the GPT-2 architecture that is much smaller than the official gpt2 model. I would like to load it as a HookedTransformer and reuse the existing code that maps the weight matrices. At the moment I do this in a hacky way: inside get_pretrained_model_config, I overwrite the HookedTransformerConfig cfg with the HF config that I got from my model. Is there a better way to achieve this without modifying the library code? Or, if this is a common need, could it be supported? Thanks!
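To make the config mapping concrete, here is a minimal sketch of how the fields of a small GPT-2 style HF config might be translated into HookedTransformerConfig-style keyword arguments. It is an illustration under stated assumptions, not the library's actual code: the helper name gpt2_config_to_hooked_kwargs is hypothetical, a SimpleNamespace stands in for the real HF config object, and the field correspondence (n_embd to d_model, and so on) is assumed from the usual GPT-2 conventions.

```python
from types import SimpleNamespace


def gpt2_config_to_hooked_kwargs(hf_cfg):
    """Hypothetical helper: map GPT-2 style HF config fields to
    HookedTransformerConfig-style keyword arguments (assumed mapping)."""
    n_embd = hf_cfg.n_embd
    n_head = hf_cfg.n_head
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")

    # HF GPT-2 configs use n_inner=None to mean an MLP width of 4 * n_embd.
    n_inner = getattr(hf_cfg, "n_inner", None)

    return {
        "d_model": n_embd,
        "n_heads": n_head,
        "d_head": n_embd // n_head,
        "d_mlp": n_inner if n_inner is not None else 4 * n_embd,
        "n_layers": hf_cfg.n_layer,
        "n_ctx": hf_cfg.n_positions,
        "d_vocab": hf_cfg.vocab_size,
        "eps": hf_cfg.layer_norm_epsilon,
        "act_fn": hf_cfg.activation_function,
        "normalization_type": "LN",
    }


# Stand-in for the config of a small custom GPT-2 model (values are made up).
small_cfg = SimpleNamespace(
    n_embd=128,
    n_head=4,
    n_layer=2,
    n_positions=256,
    vocab_size=5000,
    layer_norm_epsilon=1e-5,
    activation_function="gelu_new",
    n_inner=None,
)

hooked_kwargs = gpt2_config_to_hooked_kwargs(small_cfg)
print(hooked_kwargs)
```

A dictionary like this could presumably be passed as HookedTransformerConfig(**hooked_kwargs). Whether the existing weight-mapping code can then be reused with such a config, without patching get_pretrained_model_config, is the open part of the question.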