Kernel-ML/noisyllm

noisyllm

Toolkit for robust LLM fine-tuning on noisy training data.

Companion open-source implementation for the paper:

Fine-Tuning LLMs for Robust Classification in Noisy Data Environments. Journal of Information Systems Engineering and Management (JISEM), 2024, 9(1). e-ISSN: 2468-4376

Overview

Real-world training data is rarely clean. Mislabeled examples, inconsistent annotations, and sparse data are common in production environments such as e-commerce search, financial services, and customer support. This library provides the tooling to detect noise in classification datasets, clean and augment training data, evaluate model robustness across noise types and levels, and reproduce the benchmark experiments from the paper.

Modules

Module                Purpose
noisyllm.detect       Cross-validation confidence scoring to flag likely mislabeled examples
noisyllm.clean        Filter high-confidence noise with configurable thresholds
noisyllm.train        Pydantic configs for noise-robust training (label smoothing, curriculum learning)
noisyllm.eval         Robustness evaluator across noise types and levels
noisyllm.benchmark    Synthetic noisy benchmark datasets for reproducible comparison
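
To make the idea behind cross-validation confidence scoring concrete, here is a minimal standalone sketch. It does not use noisyllm and is not the library's implementation: the 1-D features, the nearest-centroid model, the function name `out_of_fold_confidence`, and the reading of the 0.3 threshold as "flag when the held-out probability of the given label is below it" are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def out_of_fold_confidence(features, labels, n_folds=3):
    """For each sample, return the probability a model trained WITHOUT that
    sample assigns to the sample's given label (toy nearest-centroid model)."""
    n = len(features)
    confidence = [0.0] * n
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]

        # "Fit": one centroid per class, computed on the training folds only.
        sums = defaultdict(lambda: [0.0, 0])
        for i in train:
            sums[labels[i]][0] += features[i]
            sums[labels[i]][1] += 1
        centroids = {c: s / k for c, (s, k) in sums.items()}

        # "Predict": normalized exp(-squared distance) over the class centroids.
        for i in held_out:
            scores = {c: math.exp(-(features[i] - m) ** 2) for c, m in centroids.items()}
            total = sum(scores.values())
            confidence[i] = scores.get(labels[i], 0.0) / total
    return confidence


# Toy data: class "a" near 0, class "b" near 5; the last sample is mislabeled.
features = [0.0, 0.2, -0.1, 0.1, 0.3, 5.0, 5.2, 4.9, 5.1, 4.8, 5.05]
labels = ["a"] * 5 + ["b"] * 5 + ["a"]

confidence = out_of_fold_confidence(features, labels)
flagged = [i for i, c in enumerate(confidence) if c < 0.3]  # assumed threshold semantics
print(flagged)  # prints [10]
```

Scoring each sample with a model that never saw it is what keeps a mislabeled example from vouching for its own label; a real text classifier would replace the toy centroid model.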

Installation

pip install noisyllm

Or with uv:

uv add noisyllm

Quick Start

from noisyllm.detect import NoiseProfiler
from noisyllm.clean import DataCleaner
from noisyllm.eval import RobustnessEvaluator

# Detect noise in a labeled dataset
# (`dataset` is assumed to be loaded already, with "text" and "label" columns)
profiler = NoiseProfiler(text_col="text", label_col="label", confidence_threshold=0.3)
report = profiler.analyze(dataset)
print(report.summary())
# Dataset: 5000 samples
# Estimated noise rate: 8.1%
# Flagged samples: 407

# Clean the dataset by removing high-confidence mislabels
cleaner = DataCleaner(noise_report=report, filter_threshold=0.7)
result = cleaner.clean(dataset)
print(result.summary())
# Original: 5000 | Filtered: 312 | Final: 4688

# Evaluate robustness of a trained classifier
# (`model` and `test_dataset` are assumed to be defined by the caller)
evaluator = RobustnessEvaluator(predict_fn=model.predict)
eval_report = evaluator.evaluate(
    clean_test=test_dataset,
    noise_levels=[0.05, 0.10, 0.15, 0.20],
    noise_types=["label_flip", "text_corruption"],
)
print(eval_report.summary())
# Base accuracy (clean): 94.2%
# Robustness index: 0.91
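
As a rough illustration of what evaluating across noise levels involves, the following standalone sketch corrupts text at several rates and tracks accuracy. It does not call noisyllm: the helper names, the character-replacement corruption, the keyword classifier, and in particular the definition of the robustness index (mean noisy accuracy divided by clean accuracy) are assumptions, since this README does not define them.

```python
import random


def corrupt_text(text, rate, rng):
    """Replace each alphabetic character with a random lowercase letter
    with probability `rate` (an assumed form of text corruption)."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    return "".join(
        rng.choice(letters) if ch.isalpha() and rng.random() < rate else ch
        for ch in text
    )


def accuracy(predict_fn, texts, labels):
    """Fraction of texts whose prediction matches the reference label."""
    predictions = [predict_fn(t) for t in texts]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def evaluate_robustness(predict_fn, texts, labels, noise_levels, seed=0):
    """Return clean accuracy, accuracy per noise level, and an assumed index."""
    rng = random.Random(seed)  # seeded so repeated runs corrupt the same way
    clean_acc = accuracy(predict_fn, texts, labels)
    per_level = {}
    for level in noise_levels:
        noisy_texts = [corrupt_text(t, level, rng) for t in texts]
        per_level[level] = accuracy(predict_fn, noisy_texts, labels)
    # Assumed definition: mean noisy accuracy relative to clean accuracy.
    mean_noisy = sum(per_level.values()) / len(per_level)
    index = mean_noisy / clean_acc if clean_acc else 0.0
    return clean_acc, per_level, index


def toy_predict(text):
    """Hypothetical keyword classifier standing in for `model.predict`."""
    return "billing" if "refund" in text else "shipping"


texts = ["please refund my order", "refund request", "where is my parcel", "track my parcel"]
labels = ["billing", "billing", "shipping", "shipping"]
levels = [0.05, 0.10, 0.15, 0.20]

clean_acc, per_level, index = evaluate_robustness(toy_predict, texts, labels, levels)
print(clean_acc, per_level, index)
```

Under this assumed definition, an index near 1.0 means accuracy barely moves as noise increases, while a low index means the classifier depends on exact surface forms.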

Benchmarks

from noisyllm.benchmark import load_benchmark

dataset = load_benchmark("intent_classification", noise_level=0.10)
# Returns: BenchmarkDataset with train (noisy), test (clean), label_set

Available benchmarks: intent_classification, sentiment, document
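
For intuition about what a `noise_level` of 0.10 could mean for a noisy training split, here is a minimal sketch of symmetric label-flip injection. It is independent of noisyllm; the function name `flip_labels`, the toy label set, and the choice to flip exactly `round(noise_level * n)` labels are assumptions, and the library may generate its benchmark noise differently.

```python
import random


def flip_labels(labels, label_set, noise_level, seed=0):
    """Return a noisy copy of `labels` with a fixed fraction flipped to a
    different class, plus the sorted indices that were flipped."""
    rng = random.Random(seed)  # seeded for reproducible noise
    n_flip = round(noise_level * len(labels))
    flip_idx = rng.sample(range(len(labels)), n_flip)
    noisy = list(labels)
    for i in flip_idx:
        # Always pick a class other than the original, so every flip is a real error.
        alternatives = [c for c in label_set if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy, sorted(flip_idx)


label_set = ["book", "cancel", "greet"]  # hypothetical intent labels
clean_labels = ["greet", "book", "cancel"] * 10  # 30 clean training labels

noisy_labels, flipped = flip_labels(clean_labels, label_set, noise_level=0.10)
n_changed = sum(a != b for a, b in zip(clean_labels, noisy_labels))
print(n_changed)  # prints 3
```

Keeping the flipped indices alongside the noisy labels gives a ground truth against which a noise detector's flagged samples can be scored.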

Development

uv sync --all-extras
uv run pytest tests/ -v --cov=src
uv run isort src/ tests/ && uv run black src/ tests/

Citation

If you use this library in your research, please cite the paper:

Fine-Tuning LLMs for Robust Classification in Noisy Data Environments.
Journal of Information Systems Engineering and Management (JISEM), 2024, 9(1).
e-ISSN: 2468-4376

License

Apache 2.0
