Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?

📢 Updates

  • August 2025: Paper accepted to EMNLP Findings!

This repository contains the official implementation of the following paper:

Title: Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?
Link: https://arxiv.org/abs/2505.18215

Example figure

Abstract

The rapid adoption of LLMs has overshadowed the potential advantages of traditional BERT-like models in text classification. This study challenges the prevailing "LLM-centric" trend by systematically comparing three categories of methods, namely BERT-like model fine-tuning, LLM internal state utilization, and LLM zero-shot inference, across six challenging datasets. Our findings reveal that BERT-like models often outperform LLMs. We further categorize datasets into three types, perform PCA and probing experiments, and identify task-specific model strengths: BERT-like models excel in pattern-driven tasks, while LLMs dominate those requiring deep semantics or world knowledge. We then conduct experiments on a broader range of text classification tasks to demonstrate the generalizability of our findings, and investigate how the relative performance of different models varies under different levels of data availability. Finally, based on these findings, we propose TaMAS, a fine-grained task selection strategy, advocating for a nuanced, task-driven approach over a one-size-fits-all reliance on LLMs.

Installation

conda create -n <env_name> python=3.10
conda activate <env_name>
pip install -r requirements.txt

Dataset Preparation

Download the datasets used in the paper and place them in a directory named dataset/. We provide the pre-split datasets ToxiCloakCNBase, ToxiCloakCNEmoji, and ToxiCloakCNHomo, which can be used directly. Download the remaining datasets via the links provided in the paper and organize them into the same format.
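
For reference, an illustrative directory layout (the file names below are assumptions; match them to the pre-split files we provide):

dataset/
├── ToxiCloakCNBase/
│   ├── train.csv
│   └── test.csv
├── ToxiCloakCNEmoji/
│   ├── train.csv
│   └── test.csv
└── ToxiCloakCNHomo/
    ├── train.csv
    └── test.csv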

Compared Methods

Run the following commands to execute the BERT-like model fine-tuning approaches (a minimal sketch of the recipe follows the commands):

python3 -m pipeline.bert_finetune  
python3 -m pipeline.electra_finetune  
python3 -m pipeline.ernie_finetune
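
For orientation, the fine-tuning scripts follow the standard Hugging Face sequence-classification recipe. The sketch below is illustrative only: the checkpoint name, file paths, column names, and hyperparameters are assumptions, not the exact configuration in pipeline/bert_finetune.

# Illustrative fine-tuning sketch (assumed checkpoint, paths, and columns).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)

# Assumes CSV splits with "text" and "label" columns under dataset/.
data = load_dataset("csv", data_files={"train": "dataset/ToxiCloakCNBase/train.csv",
                                       "test": "dataset/ToxiCloakCNBase/test.csv"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()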

Run the following commands to execute the LLM internal state utilization approaches (a sketch of the underlying probing idea follows the commands):

python3 -m pipeline.run_all_saplma  
python3 -m pipeline.run_all_mm
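
Both approaches train a lightweight classifier on the LLM's hidden activations rather than on its generated text. A hypothetical sketch of the idea, probing the last-token hidden state of an intermediate layer with logistic regression (the model name, layer index, and data variables are assumptions; SAPLMA-style probes use a small MLP instead):

# Illustrative internal-state probing sketch (assumed model and layer).
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2-7B"  # placeholder; use the LLMs specified in the paper
tok = AutoTokenizer.from_pretrained(name)
llm = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

def hidden_feature(text, layer=-8):
    # Last-token hidden state from an intermediate transformer layer.
    ids = tok(text, return_tensors="pt").to(llm.device)
    with torch.no_grad():
        out = llm(**ids, output_hidden_states=True)
    return out.hidden_states[layer][0, -1].float().cpu().numpy()

# train_texts/train_labels and test_texts/test_labels assumed loaded from dataset/.
probe = LogisticRegression(max_iter=1000).fit([hidden_feature(t) for t in train_texts], train_labels)
print(probe.score([hidden_feature(t) for t in test_texts], test_labels))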

Run the following command to perform LLM zero-shot inference (an illustrative prompt sketch follows the command):

python3 -m pipeline.llm_ask
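
Zero-shot inference simply asks an instruction-tuned LLM for the label. The sketch below is illustrative only; the model name and prompt wording are assumptions, and pipeline/llm_ask contains the prompts actually used:

# Illustrative zero-shot classification sketch (assumed model and prompt).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2-7B-Instruct"  # placeholder; use the LLMs specified in the paper
tok = AutoTokenizer.from_pretrained(name)
llm = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

prompt = ("Classify the following text as toxic or non-toxic. "
          "Answer with a single word.\nText: {text}\nAnswer:")
msgs = [{"role": "user", "content": prompt.format(text="sample input")}]
ids = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(llm.device)
out = llm.generate(ids, max_new_tokens=5, do_sample=False)
print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))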

For the English datasets, replace the models with their English counterparts as specified in the paper.

Visualization

To reproduce the visualizations presented in the paper, collect internal representations during training and use the code in the visualization/ directory to generate the figures.
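
A minimal sketch of the PCA projection step (features and labels are assumed to be the representations and class ids saved during training; the scripts in visualization/ produce the paper's actual figures):

# Illustrative 2-D PCA projection of collected representations.
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

coords = PCA(n_components=2).fit_transform(features)  # (n_samples, hidden_dim) -> (n_samples, 2)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, s=5, cmap="coolwarm")
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.savefig("pca_representations.png", dpi=300)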

Cite

@article{zhang2025bert,
  title={Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?},
  author={Zhang, Junyan and Huang, Yiming and Liu, Shuliang and Gao, Yubo and Hu, Xuming},
  journal={arXiv preprint arXiv:2505.18215},
  year={2025}
}
