LOG-postech/rethinking-LLM-pruning
Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization

This repository contains the PyTorch source code for the EMNLP 2024 paper Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization.

Our implementation is based on EBFT, Wanda, SparseGPT, and LLM-QAT.

Environments

Python

  • Python 3.9

Dependencies

pip3 install torch torchvision torchaudio
pip install -r requirements.txt

Usage

Basic usage (for LLaMA)

LR

python main.py --config=./configs/llama.py --config.epochs=0

BR

python main.py --config=./configs/llama.py

BR + GP

python main.py --config=./configs/llama.py --config.use_gp=True

BR + GP + CR

python main.py --config=./configs/llama.py --config.use_gp=True --config.use_cr=True
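The --config.field=value flags above appear to follow the ml_collections-style dotted-override convention: each flag overwrites one field of the config loaded from the given file. As an illustration of that semantics (a hypothetical helper, not the repository's actual flag parser), a minimal sketch:

```python
def apply_override(config: dict, dotted_key: str, raw_value: str) -> None:
    """Apply one dotted override, e.g. ('use_gp', 'True'), to a nested config dict.

    Illustrative only: the repository likely relies on ml_collections'
    config_flags to do this; the field names below are taken from this README.
    """
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node[key]  # descend into nested sub-configs, if any
    # Interpret the raw string the way typical flag parsers do.
    if raw_value in ("True", "False"):
        value: object = raw_value == "True"
    else:
        try:
            value = int(raw_value)
        except ValueError:
            value = raw_value
    node[leaf] = value

# Example defaults mirroring the flags used above.
config = {"epochs": 5, "use_gp": False, "use_cr": False, "self_nsamples": 0}
apply_override(config, "use_gp", "True")   # --config.use_gp=True
apply_override(config, "epochs", "0")      # --config.epochs=0
```

So, for instance, --config.epochs=0 keeps every other default from configs/llama.py and only disables training epochs.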

OPT model

python main.py --config=./configs/opt.py

Self-generated data

First, generate the data as follows.

python generate_data.py --config=./configs/data.py

Then, set config.self_nsamples to a positive number.

python main.py --config=./configs/llama.py --config.self_nsamples=256
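Conceptually, self-generated calibration data is text sampled autoregressively from the dense model itself and then used in place of an external calibration set. A toy sketch of that sampling loop follows; next_token is a stand-in where the real generate_data.py would query the LLM's next-token distribution, and all names here are illustrative rather than the repository's API:

```python
import random

def next_token(context: list[int], vocab_size: int, rng: random.Random) -> int:
    """Stand-in for a forward pass through the dense model.

    The real script would sample from the model's predicted
    next-token distribution given `context`; here we draw uniformly.
    """
    return rng.randrange(vocab_size)

def generate_samples(nsamples: int, seqlen: int, vocab_size: int = 32000,
                     bos_id: int = 1, seed: int = 0) -> list[list[int]]:
    """Autoregressively sample `nsamples` token sequences of length `seqlen`."""
    rng = random.Random(seed)
    samples = []
    for _ in range(nsamples):
        tokens = [bos_id]  # start each sequence from a BOS token
        while len(tokens) < seqlen:
            tokens.append(next_token(tokens, vocab_size, rng))
        samples.append(tokens)
    return samples

data = generate_samples(nsamples=4, seqlen=16)
```

With config.self_nsamples=256, the pruning run would then draw its 256 calibration sequences from such self-generated data instead of an external corpus.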

Zero-shot performance evaluation

First, download the directory from the link provided in the Wanda repository. Next, rename the directory to lm_eval. Then, set config.eval_zero_shot to True.

python main.py --config=./configs/llama.py --config.eval_zero_shot=True
