feat: add OpenAI CircuitGPT core architecture and sparse linear layers #44184

Open
mariam851 wants to merge 2 commits into huggingface:main from mariam851:add-circuit-gpt

@mariam851
Contributor

This PR implements the initial architecture for CircuitGPT (based on OpenAI's research), as discussed in #44121.

Key implementations:

* `SparseLinear`: custom layer with top-K weight sparsity logic.
* CircuitGpt components: Attention, MLP, and CausalLM classes.
* Configuration: `CircuitGptConfig` for easy sparsity control.
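
For readers unfamiliar with top-K weight sparsity, here is a minimal sketch of the idea in PyTorch. It is not the code from this PR: the class name `TopKSparseLinear`, the parameter `k`, and the choice to select the top K weights globally over the whole matrix (rather than per row) are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKSparseLinear(nn.Module):
    """Hypothetical sketch: a linear layer that uses only its k largest-magnitude weights."""

    def __init__(self, in_features: int, out_features: int, k: int, bias: bool = True):
        super().__init__()
        if not 0 < k <= in_features * out_features:
            raise ValueError("k must be in [1, in_features * out_features]")
        self.k = k  # number of weights kept (assumed global, not per-row)
        self.linear = nn.Linear(in_features, out_features, bias=bias)

    def weight_mask(self) -> torch.Tensor:
        # Find the k largest |w| over the flattened weight matrix.
        flat = self.linear.weight.detach().abs().flatten()
        _, idx = torch.topk(flat, self.k)
        mask = torch.zeros_like(flat)
        mask[idx] = 1.0
        return mask.view_as(self.linear.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The mask is recomputed on every call; masked-out weights
        # contribute nothing to the output and receive zero gradient.
        sparse_weight = self.linear.weight * self.weight_mask()
        return F.linear(x, sparse_weight, self.linear.bias)


layer = TopKSparseLinear(8, 4, k=10)
out = layer(torch.randn(2, 8))
print(out.shape)                        # torch.Size([2, 4])
print(int(layer.weight_mask().sum()))   # 10
```

Recomputing the mask in `forward` keeps the sketch short. An implementation could instead cache the mask or re-select the kept weights after each optimizer step, and this PR may take a different approach.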

Verification:

* Verified the forward pass and sparsity masking via pytest (2/2 tests passed).

Fixes #44121

Development

Successfully merging this pull request may close these issues.

[Model Request] Add OpenAI Weight-Sparse Transformer (circuit-sparsity / circuitgpt)

1 participant