# Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
This repository contains the PyTorch source code for the EMNLP 2024 paper *Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization*.
Our implementation is based on EBFT, Wanda, SparseGPT, and LLM-QAT.
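All of the pruning runs below minimize some form of reconstruction error: the discrepancy between a dense layer's output and the pruned layer's output on calibration inputs. The following NumPy sketch illustrates the quantity being minimized; the function name and the magnitude-pruning mask are illustrative only, not this repository's API.

```python
import numpy as np

def reconstruction_error(W, W_pruned, X):
    """Frobenius reconstruction error ||W X - W_pruned X||_F^2
    between a dense layer and its pruned counterpart on inputs X."""
    return float(np.linalg.norm(W @ X - W_pruned @ X) ** 2)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))   # dense weight matrix
X = rng.standard_normal((16, 32))  # calibration activations

# Illustrative magnitude pruning at roughly 50% sparsity:
# zero out the smallest-magnitude half of the weights.
threshold = np.median(np.abs(W))
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

err = reconstruction_error(W, W_pruned, X)
assert err > 0.0                                # pruning perturbs the output
assert reconstruction_error(W, W, X) == 0.0     # dense vs. itself: no error
```

Layer-wise methods minimize this error one layer at a time, while block-wise variants measure it over larger blocks of the network.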
## Requirements

- python 3.9

```shell
pip3 install torch torchvision torchaudio
pip install -r requirements.txt
```

## Usage

### LR

```shell
python main.py --config=./configs/llama.py --config.epochs=0
```

### BR

```shell
python main.py --config=./configs/llama.py
```

### BR + GP

```shell
python main.py --config=./configs/llama.py --config.use_gp=True
```

### BR + GP + CR

```shell
python main.py --config=./configs/llama.py --config.use_gp=True --config.use_cr=True
```

### OPT

```shell
python main.py --config=./configs/opt.py
```

## Self-generated calibration data

First, generate the data as follows.

```shell
python generate_data.py --config=./configs/data.py
```

Then, set `config.self_nsamples` to a positive number.

```shell
python main.py --config=./configs/llama.py --config.self_nsamples=256
```

## Zero-shot evaluation

First, download the directory from the link provided in the Wanda repository. Next, rename the directory to `lm_eval`. Then, set `config.eval_zero_shot` to `True`.

```shell
python main.py --config=./configs/llama.py --config.eval_zero_shot=True
```