[Question] Load model with supporting structure but different size #929

@xinranhe

Question

I have trained a model with the GPT-2 architecture that is much smaller than the official gpt2 model. I would like to load it as a HookedTransformer and reuse the existing code that maps the weight matrices. I am currently doing this in a hacky way: inside get_pretrained_model_config, I overwrite the HookedTransformerConfig cfg with one built from my model's HF config. Is there a better way to achieve this without changing library code? Or, if this is a common need, could it be supported? Thanks!
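For illustration, here is a minimal sketch of the kind of workflow I have in mind, not a confirmed library feature. It assumes that `convert_gpt2_weights` can be imported from `transformer_lens.loading_from_pretrained` (the path may differ between versions) and that `hf_model` is a Hugging Face `GPT2LMHeadModel`. The helper names `gpt2_config_to_tl_kwargs` and `load_custom_gpt2` are hypothetical, and the field mapping is my assumption about how a GPT-2 HF config corresponds to `HookedTransformerConfig`.

```python
def gpt2_config_to_tl_kwargs(hf_cfg: dict) -> dict:
    """Map a Hugging Face GPT-2 config dict to HookedTransformerConfig kwargs.

    Hypothetical helper: the field mapping is an assumption, not library code.
    """
    n_embd = hf_cfg["n_embd"]
    n_head = hf_cfg["n_head"]
    # HF leaves n_inner unset (None) when the MLP width is the default 4 * n_embd.
    d_mlp = hf_cfg.get("n_inner") or 4 * n_embd
    return {
        "d_model": n_embd,
        "d_head": n_embd // n_head,
        "n_heads": n_head,
        "d_mlp": d_mlp,
        "n_layers": hf_cfg["n_layer"],
        "n_ctx": hf_cfg["n_positions"],
        "eps": hf_cfg.get("layer_norm_epsilon", 1e-5),
        "d_vocab": hf_cfg["vocab_size"],
        "act_fn": hf_cfg.get("activation_function", "gelu_new"),
        "use_attn_scale": True,
        "scale_attn_by_inverse_layer_idx": hf_cfg.get(
            "scale_attn_by_inverse_layer_idx", False
        ),
        "normalization_type": "LN",
        "original_architecture": "GPT2LMHeadModel",
    }


def load_custom_gpt2(hf_model):
    """Build a HookedTransformer from a custom-sized GPT2LMHeadModel.

    Imports are local so the config helper above works without transformer_lens.
    """
    from transformer_lens import HookedTransformer, HookedTransformerConfig
    # Assumed import path; it may have moved in other transformer_lens versions.
    from transformer_lens.loading_from_pretrained import convert_gpt2_weights

    cfg = HookedTransformerConfig(**gpt2_config_to_tl_kwargs(hf_model.config.to_dict()))
    state_dict = convert_gpt2_weights(hf_model, cfg)
    model = HookedTransformer(cfg)
    # Keep the raw weights; enable the processing flags if you want them folded.
    model.load_and_process_state_dict(
        state_dict,
        fold_ln=False,
        center_writing_weights=False,
        center_unembed=False,
    )
    return model
```

Usage would then be `model = load_custom_gpt2(my_hf_model)`. The model is built without a tokenizer here, so one would need to be attached separately before passing in strings.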
