
AI-Safety-Institute


Repositories

  • sandbagging_auditing_games Public

    This repository accompanies the research paper "Sandbagging Auditing Games" on detecting sandbagging in frontier AI systems. We provide access to the model organisms used in the paper and tools for interacting with them, enabling AI safety researchers to reproduce our results and develop novel sandbagging detection techniques.

    Python · 5 stars · MIT license · 1 fork · 0 open issues · 0 open pull requests · Updated Dec 15, 2025
