Lora target modules #6

@probablyabot

Description

FYI, the LoRA target modules in https://github.com/dllm-reasoning/d1/blob/main/diffu-grpo/diffu_grpo_train.py (which appear to be copied from Llama 3) are incorrect. LlaDA uses a different naming convention for its blocks, so the correct modules to target are ["q_proj", "k_proj", "v_proj", "attn_out", "ff_proj", "up_proj"]. (For example, see the LlaDA HF source code.)

peft doesn't throw an error because q_proj, k_proj, and v_proj are valid targets on their own, so it just silently ignores the names that don't match anything. Switching to the correct module list yields roughly 50% more trainable parameters.
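To illustrate the silent-skip behavior, here is a minimal stdlib-only sketch of how peft-style target matching works: a target name matches when it equals the final component of a module path, and names with no match are simply dropped. The module paths and the Llama 3-style target list below are illustrative assumptions, not taken from either repo.

```python
# Hypothetical subset of LlaDA block submodule paths (illustrative only).
llada_modules = [
    "model.transformer.blocks.0.q_proj",
    "model.transformer.blocks.0.k_proj",
    "model.transformer.blocks.0.v_proj",
    "model.transformer.blocks.0.attn_out",
    "model.transformer.blocks.0.ff_proj",
    "model.transformer.blocks.0.up_proj",
]

def matched_targets(target_modules, module_paths):
    """Return the target names that match the last component of some module path."""
    leaves = {path.rsplit(".", 1)[-1] for path in module_paths}
    return [t for t in target_modules if t in leaves]

# Llama 3-style list (assumed) vs. the LlaDA-correct list from this issue.
llama3_targets = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
llada_targets = ["q_proj", "k_proj", "v_proj", "attn_out", "ff_proj", "up_proj"]

print(matched_targets(llama3_targets, llada_modules))  # only 4 of 7 names match
print(matched_targets(llada_targets, llada_modules))   # all 6 match
```

Because at least one name matches, no error is raised for the unmatched ones, so the adapters on attn_out and ff_proj are never created with the Llama 3 list.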
