KANBERT is a model that integrates Kolmogorov-Arnold Networks (KANs) into the BERT framework. This repository provides the codebase to reproduce the experiments and results described in our work.
For a detailed overview of the experimentation process, refer to the kanbert.pdf file.
This project includes:
- main.py: A predefined script to execute all the experiments as defined in experiments.py.
- kan_bert.py: Contains the implementation of the KAN layers integrated into the BERT model using PyTorch.
- train.py: Defines the training class to set up and run training for a single experiment with default parameters.
- evaluation.py: Contains the evaluation class to run the same evaluation as in the experiments for a single experiment.
- experiments.py: Defines experiments for testing KANBERT and BERT-baseline models.
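For readers unfamiliar with KANs, the following is a minimal sketch of what a KAN-style layer might look like in PyTorch. It is not the code in kan_bert.py: the class name `RBFKANLinear`, its parameters, and the use of Gaussian radial basis functions (swapped in here for the B-splines of the original KAN formulation, purely for brevity) are all assumptions made for illustration. The idea it shows is that each input-output edge carries its own learnable univariate function, and an output is the sum of those functions over the inputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RBFKANLinear(nn.Module):
    """Hypothetical KAN-style layer (illustration only, not the repository's code).

    Each (input, output) edge has a learnable univariate function, expressed
    here as a weighted sum of Gaussian radial basis functions.
    """

    def __init__(self, in_features, out_features, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        # Fixed, evenly spaced basis centres shared by all edges.
        centers = torch.linspace(grid_range[0], grid_range[1], num_basis)
        self.register_buffer("centers", centers)
        self.width = (grid_range[1] - grid_range[0]) / (num_basis - 1)
        # One learnable coefficient per (output, input, basis function).
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_features, in_features, num_basis))
        # Assumed residual path: a linear map applied to SiLU(x).
        self.base = nn.Linear(in_features, out_features)

    def forward(self, x):
        # x: (..., in_features) -> basis: (..., in_features, num_basis)
        basis = torch.exp(-(((x.unsqueeze(-1) - self.centers) / self.width) ** 2))
        # For each output o, sum the edge functions over inputs i and basis k.
        kan_out = torch.einsum("...ik,oik->...o", basis, self.coeffs)
        return self.base(F.silu(x)) + kan_out


if __name__ == "__main__":
    hidden = torch.randn(2, 16, 32)  # (batch, sequence, hidden size)
    layer = RBFKANLinear(32, 64)
    print(layer(hidden).shape)  # torch.Size([2, 16, 64])
```

Because the layer acts on the last dimension only, a module of this shape could in principle stand in for an `nn.Linear` inside a Transformer block; how KANBERT actually wires its KAN layers into BERT is defined in kan_bert.py.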
Ensure you have Python 3.9 installed. You can install the required Python packages using the requirements.txt file:
    pip install -r requirements.txt

The models used in the predefined experiments were trained on systems equipped with A100 GPUs, each with up to 40GB of GPU RAM.
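Before launching the experiments, it can help to confirm that the local setup resembles the one described above. The sketch below is not part of the repository; the function name `check_environment` and the report keys are hypothetical. It compares the interpreter version against Python 3.9 and, only if PyTorch is importable and a CUDA device is visible, reports the GPU name and total memory.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 9)):
    """Report whether the interpreter and GPU roughly match the documented setup."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "python_matches": tuple(sys.version_info[:2]) == tuple(expected_python),
        "gpu_name": None,
        "gpu_memory_gb": None,
    }
    # Only query CUDA if PyTorch is importable; the check itself needs no GPU.
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            props = torch.cuda.get_device_properties(0)
            report["gpu_name"] = props.name
            report["gpu_memory_gb"] = round(props.total_memory / 1024**3, 1)
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A smaller GPU than the A100 does not necessarily prevent the code from running, but batch sizes or other settings may then need to be reduced.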
This project is licensed under the MIT License. See the LICENSE file for details.