An agent to track and analyze mechanistic interpretability research on LLMs (mainly transformer-circuits and distill)

ChopperLin/mi_agent

MI Agent 🧠

Mechanistic Interpretability Research Tracker

Automatically track and analyze the latest progress in Mechanistic Interpretability of Large Language Models.

What is this?

MI Agent aggregates research papers, code repositories, and blog posts about mechanistic interpretability, saving you hours of manual searching and helping you spot trends in the field.

Features

Papers

  • ✅ Automated arXiv paper collection using MI keywords
  • ✅ Search and filter papers by topic
  • ✅ Daily/weekly digest of new papers

GitHub Repositories

  • ✅ Track MI implementation repos (TransformerLens, SAELens, etc.)
  • ✅ Monitor stars, forks, and activity
  • ✅ Search repos by name or description
  • ✅ See which techniques are actually implemented
  • ✅ Discover new repos via contributor networks

Blog Posts

  • ✅ Track MI research blogs (Anthropic, OpenAI, Alignment Forum, LessWrong)
  • ✅ RSS/Atom feed parsing
  • ✅ Relevance scoring for MI content
  • ✅ Search and filter blog posts

Web Dashboard

  • ✅ Clean web interface for browsing research
  • ✅ Dashboard with stats and recent items
  • ✅ Search and filter all content types
  • ✅ Responsive design
  • ✅ No authentication needed (local use)

Automation & Database

  • ✅ SQLite database for organized storage
  • ✅ Simple CLI with 15+ commands
  • ✅ Beautiful colored output
  • ✅ Automated daily/weekly updates via cron
  • ✅ Digest generation with all sources
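To make the storage idea concrete, here is a minimal sketch of how papers might be kept in SQLite. The table layout and the function names (`open_db`, `upsert_paper`, `search_papers`) are assumptions for illustration only; the actual schema used by MI Agent may differ.

```python
import sqlite3

# Hypothetical schema: one row per paper, keyed by its arXiv identifier.
SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id  TEXT PRIMARY KEY,
    title     TEXT NOT NULL,
    published TEXT NOT NULL  -- ISO 8601 date, so string order equals date order
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def upsert_paper(conn, arxiv_id, title, published):
    """Insert a paper, replacing any existing row with the same arxiv_id."""
    conn.execute(
        "INSERT OR REPLACE INTO papers (arxiv_id, title, published) VALUES (?, ?, ?)",
        (arxiv_id, title, published),
    )
    conn.commit()


def search_papers(conn, term):
    """Return (arxiv_id, title) rows whose title contains the term, newest first."""
    rows = conn.execute(
        "SELECT arxiv_id, title FROM papers WHERE title LIKE ? ORDER BY published DESC",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

Keying on the identifier means a repeated fetch updates existing rows instead of creating duplicates.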

Installation

pip install -r requirements.txt

Platform Support:

  • ✅ Linux/macOS: Fully supported
  • ✅ Windows: Fully supported (see WINDOWS.md for setup)
  • ✅ WSL: Fully supported

Optional: GitHub Token (Recommended)

Unauthenticated GitHub API requests are limited to 60 per hour. Adding a personal access token raises the limit to 5000 per hour.

Easy Setup (Recommended):

# Linux/macOS
./setup_github_token.sh

# Windows
setup_github_token.bat

The script will:

  • Guide you through creating a GitHub token
  • Verify the token works
  • Automatically save it to your environment
  • Show you the rate limit increase (60/hour → 5000/hour)

Manual Setup:

# 1. Create token at: https://github.com/settings/tokens
#    - Select: "public_repo" scope only

# 2. Set environment variable
export GITHUB_TOKEN=ghp_your_token_here

# 3. Add to ~/.bashrc for permanent use (use ~/.zshrc instead if your shell is zsh)
echo 'export GITHUB_TOKEN=ghp_your_token_here' >> ~/.bashrc

Rate Limits:

  • Without token: 60 requests/hour (larger fetches may fail once the limit is reached)
  • With token: 5000 requests/hour ✓
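If you want to confirm which limit applies to you, the sketch below queries GitHub's REST `rate_limit` endpoint with and without a token. The function names `github_headers` and `fetch_core_rate_limit` and the `User-Agent` value are hypothetical and not part of MI Agent; only the standard library is used.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def github_headers(token=None):
    """Build request headers, adding Authorization only when a token is given."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "mi-agent-example",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def fetch_core_rate_limit(token=None):
    """Return (limit, remaining) for the core REST API. Performs a network call."""
    request = urllib.request.Request(RATE_LIMIT_URL, headers=github_headers(token))
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    core = payload["resources"]["core"]
    return core["limit"], core["remaining"]
```

Calling `fetch_core_rate_limit(os.environ.get("GITHUB_TOKEN"))` after `import os` should report the higher limit once the environment variable is set, and the unauthenticated limit otherwise.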

Usage

Paper Commands

# Fetch latest papers from arXiv
python -m mi_agent.cli fetch --days 30

# View recent papers
python -m mi_agent.cli list --days 7

# Search for specific topics
python -m mi_agent.cli search "sparse autoencoders"

# Generate weekly digest
python -m mi_agent.cli digest --days 7

# Show statistics
python -m mi_agent.cli stats
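For readers curious what a keyword-based fetch could look like, here is a rough sketch against the public arXiv API. The keyword list and the function names `build_query_url` and `parse_recent` are assumptions for illustration; they are not taken from MI Agent's code, which may query arXiv differently.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"
KEYWORDS = ["mechanistic interpretability", "sparse autoencoder"]  # hypothetical list


def build_query_url(keywords, max_results=50):
    """Build an arXiv API URL matching any keyword in the abstract, newest first."""
    search = " OR ".join(f'abs:"{kw}"' for kw in keywords)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_recent(feed_xml, days, now=None):
    """Parse an Atom feed string and keep entries published in the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    papers = []
    for entry in ET.fromstring(feed_xml).iter(f"{ATOM}entry"):
        published = datetime.strptime(
            entry.findtext(f"{ATOM}published"), "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if published >= cutoff:
            # Titles in the feed can wrap across lines; collapse the whitespace.
            title = " ".join(entry.findtext(f"{ATOM}title").split())
            papers.append(
                {"id": entry.findtext(f"{ATOM}id"), "title": title, "published": published}
            )
    return papers
```

The URL from `build_query_url(KEYWORDS)` can be downloaded with `urllib.request.urlopen`, and the response body passed to `parse_recent(body, days=30)` to mirror the `--days 30` window.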

GitHub Repository Commands

# Fetch known MI repositories
python -m mi_agent.cli fetch-repos --known-only

# List repositories by stars
python -m mi_agent.cli list-repos --sort stars

# Search repositories
python -m mi_agent.cli search-repos "TransformerLens"

# Show repo statistics
python -m mi_agent.cli repo-stats

# Discover new repos through contributor networks
python -m mi_agent.cli discover-repos --method contributors
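One plausible reading of contributor-based discovery is: collect the people who contribute to the repos you already track, then rank other repos by how many of those people also work on them. The sketch below implements only that ranking step on in-memory data. The function name `rank_candidate_repos` and the input shapes are assumptions; MI Agent's actual method may differ.

```python
from collections import Counter


def rank_candidate_repos(known_repos, contributors_of, repos_of):
    """Rank untracked repos by the number of shared contributors.

    known_repos:     iterable of tracked repo names
    contributors_of: dict mapping repo name -> list of contributor logins
    repos_of:        dict mapping contributor login -> list of repo names
    Returns a list of (repo, shared_contributor_count), highest count first.
    """
    known = set(known_repos)

    # Everyone who contributes to at least one tracked repo.
    people = set()
    for repo in known:
        people.update(contributors_of.get(repo, ()))

    # Count each person once per untracked repo they work on.
    counts = Counter()
    for person in people:
        for repo in set(repos_of.get(person, ())) - known:
            counts[repo] += 1

    # Sort by count descending, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

In practice the two dictionaries would be filled from the GitHub API, which is where the rate limits described in the installation section start to matter.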

Blog Post Commands

# Fetch blog posts from last 30 days
python -m mi_agent.cli fetch-blogs --days 30

# List recent blog posts
python -m mi_agent.cli list-blogs --limit 10

# Search blog posts
python -m mi_agent.cli search-blogs "sparse autoencoder"

# Show blog statistics
python -m mi_agent.cli blog-stats
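Relevance scoring for blog posts can be as simple as weighted keyword matching. The sketch below is one such approach, not necessarily the one MI Agent uses: the keyword weights, the title boost, and the function name `relevance_score` are all hypothetical.

```python
import re

# Hypothetical keyword weights; higher means a stronger MI signal.
MI_KEYWORDS = {
    "mechanistic interpretability": 3.0,
    "sparse autoencoder": 2.0,
    "superposition": 1.5,
    "circuit": 1.0,
    "feature": 0.5,
}
TITLE_BOOST = 2.0  # a match in the title counts double


def relevance_score(title, summary, keywords=MI_KEYWORDS):
    """Return a score in [0, 1]; 0 means no keyword matched anywhere."""
    title_l, summary_l = title.lower(), summary.lower()
    raw = 0.0
    for phrase, weight in keywords.items():
        # Anchor at a word start only, so plurals such as "circuits" still match.
        pattern = r"\b" + re.escape(phrase)
        if re.search(pattern, title_l):
            raw += weight * TITLE_BOOST
        elif re.search(pattern, summary_l):
            raw += weight
    # Normalize by the best possible score (every keyword found in the title).
    max_raw = TITLE_BOOST * sum(keywords.values())
    return raw / max_raw
```

Posts could then be filtered against a threshold, or sorted by score before they appear in `list-blogs` output or the digest.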

Web Dashboard

Browse and search your MI research database with a simple web interface:

# Linux/macOS
./start_web.sh

# Windows
start_web.bat

# Or manually (all platforms)
cd web && python app.py

Then open http://localhost:5006 in your browser.

Features:

  • Dashboard with overview stats and recent items
  • Search papers, repositories, and blog posts
  • Filter by date, source, stars, etc.
  • Clean, responsive interface

Automation

Run all updates automatically:

# Test automation manually
python automate.py

# Or set up cron (see AUTOMATION.md for details)
crontab -e
# Add (runs every Monday at 09:00; adjust the path to your checkout):
# 0 9 * * 1 cd /home/user/mi_agent && python automate.py >> logs/cron.log 2>&1

See AUTOMATION.md for complete automation setup guide.
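As a rough illustration of what an update script can do, the sketch below chains the CLI commands documented in the Usage section and reports which ones failed. It is not the contents of `automate.py`. The function names `build_commands` and `run_all` and the choice of commands are assumptions.

```python
import subprocess
import sys


def build_commands(days=7):
    """Return the CLI invocations for one update cycle, in order."""
    base = [sys.executable, "-m", "mi_agent.cli"]
    return [
        base + ["fetch", "--days", str(days)],
        base + ["fetch-repos", "--known-only"],
        base + ["fetch-blogs", "--days", str(days)],
        base + ["digest", "--days", str(days)],
    ]


def run_all(days=7, runner=subprocess.run):
    """Run every command and return the ones that exited with a non-zero code.

    `runner` is injectable so the flow can be exercised without the real CLI.
    """
    failures = []
    for command in build_commands(days):
        result = runner(command)
        if result.returncode != 0:
            failures.append(command)
    return failures
```

Continuing after a failed step, rather than aborting, means a temporary outage in one source does not block updates from the others.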

Roadmap

  • Phase 1: arXiv paper tracking
  • Phase 2: GitHub repository monitoring
    • Known repos tracking
    • Repository discovery via contributor networks
    • Repository search & stats
  • Phase 3: Blog aggregation
    • RSS/Atom feed parsing
    • Multi-source tracking (Anthropic, OpenAI, Alignment Forum, LessWrong)
    • Relevance scoring
  • Phase 4: Automation
    • Automated update script
    • Digest generation
    • Cron setup guide
  • Phase 5: Web Dashboard
    • Flask web interface
    • Dashboard with stats and recent items
    • Papers/repos/blogs pages with search
    • Filter by date, source, popularity
  • Phase 6: Advanced Features
    • Topic clustering and trend analysis
    • Citation tracking
    • Author networks
    • Email notifications

License

MIT
