FEAT: Threshold and ratio configuration and testing file for optimal threshold and ratio configuration #21

Open
rafainn wants to merge 21 commits into Roblox:main from rafainn:Threshold-and-ratio-configuration

rafainn commented Aug 18, 2025

Built upon pull request #7

This pull request introduces significant improvements to the Sentinel library, focusing on aggregation flexibility, explainability, and performance optimizations. The README is updated to document new aggregation strategies and explainability features, and the codebase now exposes multiple aggregation functions for scoring, adds per-text explanations, and improves model caching and negative sample ratio handling.

Aggregation and Explainability Enhancements:

  • Added multiple aggregation strategies (skewness, top_k_mean, percentile_score, softmax_weighted_mean, max_score) for combining observation scores, with documentation and usage examples in README.md. [1] [2] [3] [4]
  • Introduced per-text explainability in results, including top-K positive/negative similarities, contrastive components, and neighbor snippets, as shown in the updated RareClassAffinityResult dataclass and README usage examples. [1] [2]
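To make the aggregation strategies listed above concrete, here is a minimal pure-Python sketch of how each one could combine a list of observation scores. It is not Sentinel's implementation: the signatures, the defaults (`k`, `q`, `tau`), the return value of `0.0` for empty input, and the choice of the population (Fisher-Pearson) skewness formula are all assumptions made for illustration.

```python
import math
from statistics import fmean


def max_score(scores):
    """Largest observation score; 0.0 for an empty input (assumed convention)."""
    return max(scores, default=0.0)


def top_k_mean(scores, k=3):
    """Mean of the k largest scores, so a few strong hits dominate."""
    if not scores:
        return 0.0
    return fmean(sorted(scores, reverse=True)[:k])


def percentile_score(scores, q=90.0):
    """Linearly interpolated q-th percentile, with q in [0, 100]."""
    if not scores:
        return 0.0
    ordered = sorted(scores)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def softmax_weighted_mean(scores, tau=1.0):
    """Mean weighted by softmax(score / tau); tau is a hypothetical parameter."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    if not scores:
        return 0.0
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - peak) / tau) for s in scores]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def skewness(scores):
    """Population skewness m3 / m2**1.5; 0.0 when it is undefined."""
    if len(scores) < 2:
        return 0.0
    mean = fmean(scores)
    m2 = fmean((s - mean) ** 2 for s in scores)
    if m2 == 0:
        return 0.0
    m3 = fmean((s - mean) ** 3 for s in scores)
    return m3 / m2 ** 1.5
```

Each function takes the same list of per-observation scores, so in a sketch like this the strategies are interchangeable and the choice comes down to how much a single high-scoring observation should influence the overall result.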

Performance and Robustness Improvements:

  • Implemented global caching for SentenceTransformer models in src/sentinel/embeddings/sbert.py to avoid redundant loading, with cache management utilities. Global caching appears to have reduced the load time for ~300 conversations from 12.3s to 3.5s.
  • This includes a Cache_Model input to calculate_rare_class_affinity in src/sentinel/sentinel_local_index.py, so caching can be enabled or disabled easily depending on space constraints and model requirements.
  • Improved handling of negative-to-positive ratio when loading indices, including error handling and preserving original ratios when needed. [1] [2] [3]
  • Created a detailed testing file, test_thresholds_and_ratios in examples/Example_Threshold_Script.py, that shows how these changes affect performance and how different ratios and temperatures affect detection. The results point to temperatures of 0.00 and 0.01 and ratios of 2-4:1 for optimal accuracy and minimal false positives.
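The global caching described above can be pictured as a module-level dictionary keyed by model name. The sketch below is an illustration under stated assumptions, not the code in sbert.py: `get_model`, `clear_model_cache`, `cached_model_names`, and the `use_cache` flag are hypothetical names, and the loader is passed in so the example runs without downloading a model.

```python
from typing import Callable, Dict, List

# Process-wide cache: model name -> loaded model object.
_MODEL_CACHE: Dict[str, object] = {}


def get_model(name: str, loader: Callable[[str], object],
              use_cache: bool = True) -> object:
    """Return a model for `name`, loading it at most once when caching is on.

    `loader` is any callable that builds a model from its name. In real use
    it could be the SentenceTransformer constructor (an assumption here).
    """
    if not use_cache:
        # Caller opted out, e.g. because memory is tight: always load fresh.
        return loader(name)
    if name not in _MODEL_CACHE:
        _MODEL_CACHE[name] = loader(name)
    return _MODEL_CACHE[name]


def clear_model_cache() -> None:
    """Drop every cached model so its memory can be reclaimed."""
    _MODEL_CACHE.clear()


def cached_model_names() -> List[str]:
    """Names of the models currently held in the cache."""
    return sorted(_MODEL_CACHE)
```

With a layout like this, the first call pays the full load cost and later calls with the same name return the same object, which is consistent with the kind of speed-up reported above. Passing `use_cache=False` skips the cache entirely and leaves its contents untouched.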

API and Documentation Updates:

  • Updated __init__.py to expose new aggregation functions in the public API.
  • Enhanced documentation and comments for scoring functions and result types, clarifying their purpose and usage. [1] [2]

These changes collectively make Sentinel more configurable, interpretable, and efficient for diverse deployment scenarios.
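The threshold-and-ratio analysis mentioned in the performance section can be thought of as a grid sweep: for each negative-to-positive ratio, score a labelled set, then measure accuracy and false-positive rate at each candidate threshold. The sketch below is hypothetical and is not the contents of Example_Threshold_Script.py. The function names, the `score > threshold` flagging rule, and the input layout (a mapping from ratio to `(score, is_positive)` pairs) are all assumptions.

```python
from typing import Dict, Iterable, List, Tuple

Scored = List[Tuple[float, bool]]  # (score, is_positive) pairs


def evaluate_threshold(scored: Scored, threshold: float) -> Dict[str, float]:
    """Accuracy and false-positive rate when flagging score > threshold."""
    tp = fp = tn = fn = 0
    for score, is_positive in scored:
        flagged = score > threshold
        if flagged and is_positive:
            tp += 1
        elif flagged:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


def sweep(results_by_ratio: Dict[float, Scored],
          thresholds: Iterable[float]) -> List[Tuple[float, float, float, float]]:
    """Return (ratio, threshold, accuracy, fpr) rows, best configuration first."""
    thresholds = list(thresholds)
    rows = []
    for ratio, scored in results_by_ratio.items():
        for threshold in thresholds:
            metrics = evaluate_threshold(scored, threshold)
            rows.append((ratio, threshold, metrics["accuracy"],
                         metrics["false_positive_rate"]))
    # Highest accuracy first; ties are broken by the lower false-positive rate.
    rows.sort(key=lambda row: (-row[2], row[3]))
    return rows
```

Sorting by accuracy and then by false-positive rate mirrors the stated goal of optimal accuracy with minimal false positives. A real script would also need to rebuild or reload the index for each ratio before scoring.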

ch1kim0n1 and others added 4 commits August 15, 2025 12:44
…SentinelLocalIndex.

FEAT: created a testing tool for best threshold and ratio analysis
…e with optional flags.

DOCS: Updated relevant documentation with these fixes

rafainn commented Aug 18, 2025

Note: All 20 tests passed, with two warnings about the pytest configuration (shown below). I am unsure whether this is due to an outdated version of the pytest library or whether these config keys have been deprecated.
".venv\Lib\site-packages\_pytest\config\__init__.py:1441
D:\PROGRAMMING HHD\Sentinel\Sentinel\.venv\Lib\site-packages\_pytest\config\__init__.py:1441: PytestConfigWarning: Unknown config option: showlocals

self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

.venv\Lib\site-packages\_pytest\config\__init__.py:1441
D:\PROGRAMMING HHD\Sentinel\Sentinel\.venv\Lib\site-packages\_pytest\config\__init__.py:1441: PytestConfigWarning: Unknown config option: verbose

self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html"

rafainn added 14 commits August 18, 2025 20:14
… load time of model exponentially after the first caching.

TESTS: Updated embedding tests to include caching and its management functions
FEAT: Updated the example script for testing purposes to include caching mechanics
- Apply PEP 8 formatting to Example_Threshold_Script.py
- Update embeddings.safetensors
- Update sentinel_against_hate.ipynb
- Fixed line length violations (max 79 characters)
- Corrected indentation and spacing
- Enhanced readability while maintaining functionality
…mpty score arrays, fixes edge case, NaN returns.
…_affinity`, update example file to use path/to/index rather than local path
…omponents in score_formulae and SentinelLocalIndex
…onality, removed redundant exports, added no-cache flag to the testing script
Comment thread src/sentinel/sentinel_local_index.py Outdated

rafainn commented Feb 19, 2026

@vcai4071 All requested changes have been made.

@rafainn rafainn requested a review from vcai4071 February 19, 2026 19:45

@vcai4071 vcai4071 left a comment

LGTM, thanks for the great additions!

leoRblx commented May 6, 2026

@rafainn Can you take a look at the failing test? I will merge once the tests are passing.
Thx

… have support for PEP 517 builds, hence swapped to ^2.0.0, which is compatible - may require further testing, however it didn't impact functionality of code

rafainn commented May 7, 2026

@leoRblx Would you be able to run the tests again? This should fix the build issue it was displaying earlier; however, I am unsure whether there will be any further conflicts.
