Feature request: BatchL2Grad for LayerNorm #327

@f-dangel

Documenting this feature request from @mf-silva: supporting per-sample L2 gradient norms for LayerNorm would allow estimating importance scores for data points on LLM architectures, which often contain LayerNorm layers.
A good starting point to implement this is to take a look at the custom first-order extension example in the docs.
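
To make the requested quantity concrete, here is a minimal sketch in plain PyTorch of the per-sample squared L2 gradient norms for the affine parameters of a `torch.nn.LayerNorm`. It is not BackPACK's extension API: the helper name `layernorm_batch_l2_grad` is hypothetical, and the sketch assumes the loss is a sum of per-sample losses, the batch axis comes first, and the layer has `elementwise_affine=True` with a bias.

```python
import torch


def layernorm_batch_l2_grad(layer, x, grad_out):
    """Squared per-sample L2 norms of the weight and bias gradients.

    x:        layer input, shape [N, *, *normalized_shape]
    grad_out: gradient of the summed loss w.r.t. the layer output, same shape
    """
    # Normalized input, without the affine transformation applied.
    x_hat = torch.nn.functional.layer_norm(x, layer.normalized_shape, eps=layer.eps)
    n = x.shape[0]
    param_shape = tuple(layer.normalized_shape)
    # Per-sample gradients: sum over all additional axes (e.g. sequence positions).
    g_weight = (grad_out * x_hat).reshape(n, -1, *param_shape).sum(1)
    g_bias = grad_out.reshape(n, -1, *param_shape).sum(1)
    # Squared L2 norm per sample, one entry per data point.
    return g_weight.flatten(1).pow(2).sum(1), g_bias.flatten(1).pow(2).sum(1)


torch.manual_seed(0)
layer = torch.nn.LayerNorm(8)
x = torch.randn(4, 5, 8)  # [batch, sequence, features]
out = layer(x)
loss = out.pow(2).sum()  # toy loss: sum of per-sample losses
(grad_out,) = torch.autograd.grad(loss, out)
w_sq, b_sq = layernorm_batch_l2_grad(layer, x, grad_out)
```

The sum over the extra axes has to happen before squaring, because a LayerNorm applied to sequence inputs shares its parameters across positions. A BackPACK extension would presumably obtain `x` and `grad_out` from its module hooks rather than from an explicit `torch.autograd.grad` call.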
