FYI, the LoRA target modules in https://github.com/dllm-reasoning/d1/blob/main/diffu-grpo/diffu_grpo_train.py (which seem to be taken from Llama 3) are incorrect. LLaDA uses a different naming convention for its blocks, so the correct modules to target are ["q_proj", "k_proj", "v_proj", "attn_out", "ff_proj", "up_proj"]. (For example, see the LLaDA HF source code.)
peft doesn't throw an error because q_proj, k_proj, and v_proj are valid targets, so it silently ignores the unmatched names. Switching to the correct module list yields roughly 50% more trainable parameters.
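For reference, a minimal sketch of what the corrected config could look like (the hyperparameters and the model repo name below are placeholders/assumptions, not the d1 repo's actual values):

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                # placeholder rank, not the repo's value
    lora_alpha=32,       # placeholder scaling, not the repo's value
    lora_dropout=0.05,   # placeholder dropout, not the repo's value
    bias="none",
    task_type="CAUSAL_LM",
    # LLaDA's module names. Llama-3-style names like "o_proj", "gate_proj",
    # or "down_proj" match nothing in LLaDA, so peft just never wraps them.
    target_modules=["q_proj", "k_proj", "v_proj", "attn_out", "ff_proj", "up_proj"],
)

# Assuming the LLaDA checkpoint on the Hub (name may differ):
# from transformers import AutoModel
# model = AutoModel.from_pretrained("GSAI-ML/LLaDA-8B-Instruct", trust_remote_code=True)
# model = get_peft_model(model, lora_config)
# model.print_trainable_parameters()  # compare against the Llama-3-style list to see the jump
```

Comparing `print_trainable_parameters()` before and after the change is an easy way to confirm which modules actually got LoRA adapters.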