Transformer Classifier

1. Overview

This repository contains two NLP classification models built with the Hugging Face Transformers library. Each model targets a distinct domain:

Adversarial Prompt Security (binary classification)
Scientific Text Classification (multiclass classification)

Both models share a common pipeline of Transformer-based classification and data augmentation.
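
As a rough illustration of how a binary and a multiclass setup could be expressed with the Transformers library, the sketch below builds label mappings and loads a sequence-classification head. The label names and the label_maps and build_classifier helpers are hypothetical, and the checkpoint is left as a parameter because this README does not name the backbones.

```python
def label_maps(labels):
    """Build the id2label / label2id dictionaries used by a classification head."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def build_classifier(checkpoint, labels):
    """Load a tokenizer and a sequence-classification model for the given labels.

    `checkpoint` is whichever backbone you choose; it is an assumption here,
    not something this repository specifies.
    """
    # Imported lazily so label_maps stays usable without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


# Hypothetical label sets, for illustration only.
BINARY_LABELS = ["benign", "adversarial"]
MULTICLASS_LABELS = ["physics", "biology", "computer_science", "general"]
```

The only structural difference between the two tasks in this sketch is the number of labels passed to the classification head.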

Table of Contents

1. Overview
2. Quick Start Guide
2.1. Local
2.2. Demo
2.3. Tests and CI
2.4. Documentation
2.5. Project Structure
2.6. Conventions
3. Project Description
3.1. Models
4. Project Extension and Future Work

2. Quick Start Guide

2.1. Local

To run the project locally, first make sure just and uv are installed. If they are not, run the commands for your operating system:

macOS:

brew install just uv

Linux (Debian/Ubuntu):

sudo apt-get update
sudo apt-get install -y just
curl -LsSf https://astral.sh/uv/install.sh | sh
# then restart your shell so uv is on PATH

Windows:

# uv (official installer)
irm https://astral.sh/uv/install.ps1 | iex

# just: install with one of the package managers below
# winget (preferred if available)
winget install casey.just -e  # if this ID does not resolve, use scoop or chocolatey instead
# scoop
scoop install just
# chocolatey
choco install just

Then install the dependencies and activate the virtual environment:

just install
source .venv/bin/activate

On Windows, activate the environment with .venv\Scripts\activate instead.

2.2. Demo

TBA

2.3. Tests and CI

To run the tests, make sure you have the virtual environment activated and run:

python -m pytest

To check coverage, run:

python -m pytest --cov=src --cov-fail-under=90 --cov-report=term-missing
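
The 90% gate can also be checked programmatically. The sketch below assumes a Cobertura-style XML report, such as the coverage.xml written when pytest-cov is run with --cov-report=xml, and reads the line-rate attribute on its root element. The function names and the sample report are illustrative and not part of this repository.

```python
import xml.etree.ElementTree as ET


def line_rate(report_xml: str) -> float:
    """Return the overall line coverage (0.0 to 1.0) from a Cobertura-style report."""
    root = ET.fromstring(report_xml)
    return float(root.attrib["line-rate"])


def meets_threshold(report_xml: str, minimum_percent: float = 90.0) -> bool:
    """Return True when line coverage is at or above the required percentage."""
    return line_rate(report_xml) * 100 >= minimum_percent


# Minimal stand-in for a real report; only the root attribute matters here.
SAMPLE_REPORT = '<coverage line-rate="0.95" branch-rate="0.0"></coverage>'
print(meets_threshold(SAMPLE_REPORT))
```

This only looks at line coverage; if branch coverage is enabled, the total that --cov-fail-under evaluates may differ from the line-rate value.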

CI is configured in .github/workflows/ci.yml and is intentionally PR-focused. It runs when a pull request is opened, reopened, synchronised (updated with new commits), or marked as ready for review. Draft pull requests are ignored until they are marked as ready. CI installs dependencies with uv sync --group dev --frozen, which installs from uv.lock without updating it, to enforce lockfile reproducibility.

CI pipeline stages (in execution order):

Check PR Commit Policy
- Fails unless the PR contains exactly one commit.
- Fails if commit messages start with fixup! or squash!.
- Keeps PR history clean before merge.
Pre-commit Checks
- Runs all hooks from .pre-commit-config.yaml.
- Enforces formatting, linting, and lightweight safety checks.
Type Check (Pyright)
- Runs static type checks with pyright.
- Catches interface/typing issues before runtime tests.
Smoke Tests
- Runs the smoke marker subset (pytest -m smoke).
- Provides a fast runtime sanity check before full tests.
Pytest (Python 3.11)
- Runs the full test suite and enforces a minimum coverage of 90%.
- Uploads coverage.xml as a workflow artifact for inspection.
Docs Build
- Runs mkdocs build --strict.
- Fails the PR if documentation pages, links, or API autodoc references are invalid.
Dependency Vulnerability Audit (Non-blocking)
- Runs pip-audit against installed dependencies.
- Reports known vulnerabilities in CI logs.
- Is intentionally non-blocking while security posture is being established.
- Runs with if: always() so findings are still emitted when test stages fail.
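
The two rules of the Check PR Commit Policy stage can be sketched as a small function. This is only an illustration of the logic described above, not the script CI actually runs; the function name and the message wording are assumptions.

```python
def check_commit_policy(subjects):
    """Return a list of policy violations for a PR's commit subject lines.

    An empty list means the PR satisfies the policy.
    """
    violations = []

    # Rule 1: the PR must contain exactly one commit.
    if len(subjects) != 1:
        violations.append(f"expected exactly 1 commit, found {len(subjects)}")

    # Rule 2: autosquash commits (fixup!/squash!) must be squashed before review.
    for subject in subjects:
        if subject.startswith(("fixup!", "squash!")):
            violations.append(f"autosquash commit not allowed: {subject!r}")

    return violations


if __name__ == "__main__":
    # Hypothetical commit subjects, for illustration only.
    for problem in check_commit_policy(["Add tokenizer tests", "fixup! Add tokenizer tests"]):
        print(problem)
```

Returning every violation at once, rather than stopping at the first, lets a contributor fix all problems in a single pass.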

Security and dependency maintenance is configured with Dependabot in .github/dependabot.yml:

Weekly Python dependency update PRs (from pyproject.toml).
Weekly GitHub Actions version update PRs.

The workflow also uses concurrency cancellation:

When new commits are pushed to the same PR, in-progress older runs are cancelled.
This avoids stale CI feedback and reduces consumed GitHub Actions minutes.

Branch protection/ruleset alignment:

Require a pull request before merging.
Required approvals: 0 (solo workflow), while keeping code-owner and conversation rules.
Require review from Code Owners.
Require conversation resolution before merging.
Require status checks to pass (must be enabled), with required checks:
- Check PR Commit Policy
- Pre-commit Checks
- Type Check (Pyright)
- Smoke Tests
- Pytest (Python 3.11)
- Docs Build
Keep Dependency Vulnerability Audit (Non-blocking) as informational for now, rather than as a required blocking status check.
Block force pushes.
Require linear history.
Allow squash merging (and optional rebase merging), with merge commits disabled.

2.4. Documentation

Project documentation is built with MkDocs Material and published to GitHub Pages. The site combines hand-written guides from docs/ and API reference pages generated from in-code Google-style docstrings.
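
For reference, a Google-style docstring has the shape shown below. The truncate_text function is a hypothetical example written for this illustration and is not taken from the repository.

```python
def truncate_text(text: str, max_words: int = 128) -> str:
    """Truncate text to a maximum number of whitespace-separated words.

    Args:
        text: The input string to shorten.
        max_words: The maximum number of words to keep. Must be positive.

    Returns:
        The first ``max_words`` words of ``text``, joined by single spaces.

    Raises:
        ValueError: If ``max_words`` is less than 1.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    # str.split() with no arguments collapses runs of whitespace.
    return " ".join(text.split()[:max_words])
```

The Args, Returns, and Raises sections are the parts an API reference generator typically renders as structured fields.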

Local docs commands:

just docs-build
just docs-serve

Documentation workflows:

PRs run a strict build (uv run mkdocs build --strict) as a blocking CI gate.
Pushes to main trigger .github/workflows/docs-publish.yml to deploy to GitHub Pages.

GitHub repository settings needed once:

Settings -> Pages -> Build and deployment -> Source: GitHub Actions.
Branch protection/ruleset -> required status checks: include Docs Build.

2.5. Project Structure

The table below lists the project's folders and files, with the role of each:

Folder	File	Description
...	...	...

2.6. Conventions

TBA

3. Project Description

TBA

3.1. Models

3.1.1. Adversarial Prompt Classifier

This model is designed to detect prompt injections and jailbreak attempts (e.g., "ignore previous instructions", "DAN", roleplay).

Backbone: ...
Focus: ...
Techniques: ...

3.1.2. Scientific Text Classifier

This model classifies text into scientific/technical categories, as distinct from general content.

Backbone: ...
Focus: ...
Techniques: ...

4. Project Extension and Future Work

TBA

AndreiRoibu/Transformer-Classifiers

Folders and files

Latest commit

History

Repository files navigation

Transformer Classifier

1. Overview

2. Quick Start Guide

2.1. Local

2.2. Demo

2.3. Tests and CI

2.4. Documentation

2.5. Project Structure

2.6. Conventions

3. Project Description

3.1. Models

3.1.1. Adversarial Prompt Classifier

3.1.2. Scientific Text Classifier

4. Project Extension and Future Work

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages