A fine-tuned LLM (Llama-2-7b-chat) trained on a medical conversation dataset using the parameter-efficient fine-tuning (PEFT) technique QLoRA together with supervised fine-tuning (SFT).
Model Link: https://huggingface.co/yuktasarode/Llama-2-7b-chat-finetune/tree/main
Dataset Link: https://huggingface.co/datasets/lavita/ChatDoctor-HealthCareMagic-100k
Metrics:
- BLEU: 89.86
- METEOR: 0.94
- Perplexity: 2.69
- ROUGE-1: 0.95
- ROUGE-2: 0.92
- ROUGE-L: 0.95
In the provided code, the base model is loaded using:
```python
from transformers import AutoModelForCausalLM

# Load the base model with 4-bit quantization (bnb_config) spread across devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
model.config.use_cache = False   # KV caching conflicts with gradient checkpointing
model.config.pretraining_tp = 1  # disable Llama-2 tensor-parallel heuristics
```

This initializes the pre-trained model (e.g., Llama-2) in a quantized, frozen state; a typical bnb_config is sketched after the list below. The parameters of the base model are not updated during training, in order to:
- Preserve the pre-trained knowledge.
- Reduce computational overhead and memory requirements.
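The bnb_config referenced above is not defined in this excerpt. Here is a minimal sketch of a typical QLoRA quantization config, assuming the common 4-bit NF4 setup; the exact values used for this model may differ:

```python
import torch
from transformers import BitsAndBytesConfig

# Assumed 4-bit NF4 quantization, the standard QLoRA recipe
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for matmuls in forward/backward
    bnb_4bit_use_double_quant=False,       # optional second quantization of constants
)
```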
Lightweight LoRA layers are then added to specific parts of the model (e.g., the attention layers). These layers are parameterized by:
- Low-rank matrices, with rank controlled by lora_r.
- A scaling factor (lora_alpha) that balances the LoRA contribution against the frozen weights.

This is configured with:
```python
from peft import LoraConfig

# LoRA adapter configuration: rank, scaling, dropout, and task type
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",            # do not train bias terms
    task_type="CAUSAL_LM",
)
```

The LoraConfig defines how LoRA modifies the model's layers: trainable low-rank adapters are injected into the frozen model, adapting it to the new dataset without updating the base parameters.
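To see what this injection looks like, here is a minimal sketch using peft's get_peft_model. Note that SFTTrainer performs this step internally when given peft_config, so it does not appear in the training code below:

```python
from peft import get_peft_model

# Wrap the frozen base model with the LoRA adapters defined in peft_config
lora_model = get_peft_model(model, peft_config)

# Only the adapter weights are trainable, typically well under 1% of all parameters
lora_model.print_trainable_parameters()
```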
The dataset and tokenizer are loaded and configured for training:
```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the training split and the tokenizer matching the base model
dataset = load_dataset(dataset_name, split="train")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 defines no pad token by default
tokenizer.padding_side = "right"           # right padding avoids fp16 overflow issues
```

The dataset is tokenized with this pre-trained tokenizer, ensuring the inputs stay compatible with the base model.
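The SFTTrainer below reads each example's "text" field. If the raw dataset stores question/answer columns instead, a formatting step is needed first. A hypothetical sketch, assuming instruction-style fields named input and output (the actual column names may differ):

```python
# Hypothetical formatting step using the Llama-2 chat prompt template
def format_example(example):
    return {
        "text": f"<s>[INST] {example['input']} [/INST] {example['output']} </s>"
    }

# dataset = dataset.map(format_example)  # only needed if a "text" field is absent
```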
The training parameters are specified in TrainingArguments:
```python
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
)
```

Notable configurations include:
- Gradient accumulation, which simulates larger effective batch sizes.
- Gradient checkpointing to save memory.
- An optimizer and learning-rate schedule suited to QLoRA fine-tuning (typical values are sketched below).
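The hyperparameters above are passed in as variables. For reference, here is a hypothetical set of values commonly used when QLoRA-fine-tuning Llama-2-7b; the values actually used for this model may differ:

```python
# Assumed, typical QLoRA hyperparameters; not confirmed for this model
output_dir = "./results"
num_train_epochs = 1
per_device_train_batch_size = 4
gradient_accumulation_steps = 1
optim = "paged_adamw_32bit"  # paged optimizer avoids memory spikes with 4-bit models
save_steps = 0
logging_steps = 25
learning_rate = 2e-4
weight_decay = 0.001
fp16 = False
bf16 = False                 # set True on Ampere or newer GPUs
max_grad_norm = 0.3
max_steps = -1               # rely on num_train_epochs instead
warmup_ratio = 0.03
group_by_length = True       # batch sequences of similar length to save memory
lr_scheduler_type = "cosine"
```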
The SFTTrainer orchestrates the fine-tuning:
```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,    # LoRA adapters are injected here
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
trainer.train()
```

The peft_config is applied to inject the LoRA layers into the frozen model, so only the parameters of those layers are updated during training. The dataset and tokenizer are used to build the training batches.
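After training, only the small adapter needs to be saved, and the model can be tested with a generation pipeline. A minimal sketch, where new_model is an assumed name for the adapter directory:

```python
from transformers import pipeline

new_model = "Llama-2-7b-chat-finetune"    # assumed adapter output name
trainer.model.save_pretrained(new_model)  # saves only the LoRA adapter weights

# Quick smoke test using the Llama-2 chat prompt format
prompt = "What could be causing my persistent headaches?"
pipe = pipeline("text-generation", model=trainer.model, tokenizer=tokenizer, max_length=200)
print(pipe(f"<s>[INST] {prompt} [/INST]")[0]["generated_text"])
```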
In summary:
- The base model is frozen to retain its pre-trained knowledge.
- The LoRA layers added to the model are small and cheap to train.
- Training updates only the LoRA layers; the base model remains unchanged.
- Memory-efficient techniques (4-bit QLoRA quantization, gradient accumulation) keep computational costs low.