A heuristic-based tool that tries to figure out whether a chunk of code was written by AI or an actual human. It's not perfect, nothing is, but it picks up on the sort of tells that make you go "hmm, that's a bit sus" when reviewing a PR.
- **Comment style**: flags overly verbose comments that just restate the code, hedging language like "you may want to consider", and suspiciously long explanations of obvious things
- **Docstring saturation**: every tiny helper function having a perfectly formatted docstring with Args/Returns/params? Yeah, humans don't do that
- **Naming**: AI loves its `userAuthenticationToken` and `databaseConnectionString`. Real devs use `auth_tok` and `db_conn` because life's too short
- **Error handling**: catches the classic overdefensive pattern where every possible edge case is guarded against, plus empty `except: pass  # handle this appropriately` blocks
- **Uniformity**: if every function in a file looks structurally identical, that's a red flag. Humans have moods, AI doesn't
- **Dead giveaways**: `# TODO: implement this` sitting right above a fully working implementation, and other classics
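To give a flavour of how a checker like this might work, here's a rough sketch of the "TODO above a working implementation" tell. This is illustrative only: `todo_above_code_score` is a hypothetical name, not the tool's actual internals.

```python
import re


def todo_above_code_score(code: str) -> float:
    """Hypothetical checker: flag '# TODO: implement this' style comments
    sitting directly above a body that's clearly already implemented."""
    lines = code.splitlines()
    todos = 0
    hits = 0
    for i, line in enumerate(lines):
        if re.search(r"#\s*TODO:?\s*implement", line, re.IGNORECASE):
            todos += 1
            # Peek at the next few lines: if there's real code there
            # (not just `pass` or `...`), the TODO is probably a leftover tell
            following = [l.strip() for l in lines[i + 1 : i + 6] if l.strip()]
            if any(l not in ("pass", "...") and not l.startswith("#") for l in following):
                hits += 1
    return hits / todos if todos else 0.0
```

A snippet where every TODO sits above working code scores 1.0; a TODO above a bare `pass` scores 0.0, because that's a human being honest about unfinished work.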
- Python (most thorough)
- JavaScript
- Ruby
- Go
- Java
Python and JS have the best coverage across all checks. Go and Java are a bit thinner on the error handling side, contributions welcome innit.
You'll need Python 3.10+ (probably works on 3.8 but haven't tested).
```shell
git clone <this repo>
cd ai-code-detector
python -m venv venv
source venv/bin/activate
pip install pytest
```

Dead simple, just pass it a string of code and tell it what language:
```python
from src.detector import scan

code = open("some_file.py").read()
report = scan(code, lang="python")

print(report.score)      # 0.0 to 1.0, higher = more likely AI
print(report.breakdown)  # per-category scores
```

The `scan` function returns an `AiReport` with:

- `score`: overall likelihood (0 to 1) that the code's AI generated
- `breakdown`: dict of individual category scores so you can see what tripped it
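For a mental model, you can picture `AiReport` as a simple dataclass along these lines (a sketch, not the actual class from `src.detector`):

```python
from dataclasses import dataclass, field


@dataclass
class AiReport:
    # Overall likelihood (0.0 to 1.0) that the code is AI generated
    score: float
    # Per-category scores, e.g. {"comments": 0.8, "naming": 0.3, ...}
    breakdown: dict[str, float] = field(default_factory=dict)
```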
```shell
source venv/bin/activate
python -m pytest tests/ -v
```

It's all regex and heuristics: no ML, no API calls, runs entirely offline. Each checker looks at a different aspect of the code and returns a score between 0 and 1. These get combined with a weighted average, plus a boost when multiple signals fire at once (because if the comments AND the naming AND the docstrings all look AI generated, that's way more damning than just one of them).
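The weighted-average-plus-boost idea might look roughly like this. The weights and thresholds here are made up for illustration; the real tool tunes its own:

```python
def combine_scores(breakdown: dict[str, float]) -> float:
    """Sketch of weighted-average scoring with a multi-signal boost.
    Weights are illustrative, not the tool's actual numbers."""
    weights = {"comments": 0.25, "docstrings": 0.2, "naming": 0.2,
               "error_handling": 0.2, "uniformity": 0.15}
    base = sum(weights.get(k, 0.0) * v for k, v in breakdown.items())
    # Boost when several signals fire at once: three or more categories
    # over 0.6 is far more damning than one high score on its own
    strong = sum(1 for v in breakdown.values() if v > 0.6)
    if strong >= 3:
        base = min(1.0, base * 1.25)
    return round(base, 3)
```

With these particular weights, one suspicious category on its own stays well below 0.5, while everything firing at once saturates to 1.0.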
The thresholds have been tuned against a bunch of test cases but they're definitely not gospel. False positives will happen, especially with devs who write very clean code. Don't go accusing your colleagues based solely on this tool, yeah?
- It's heuristics, not magic. A careful human editing AI output will fool it
- Short snippets don't give it much to work with
- Language support varies, Python gets the most love
- It doesn't know about your project's conventions (yet), so it can't catch the "uses camelCase when the codebase uses snake_case" tell
- Someone who genuinely writes pristine code will get false positives. Sorry about that
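If project-convention awareness does land, a simple version could compare a snippet's dominant naming style against the rest of the codebase. A sketch of that idea (`naming_style_mismatch` is hypothetical, not part of the tool):

```python
import re


def naming_style_mismatch(snippet: str, codebase_sample: str) -> bool:
    """Hypothetical check: does the snippet's dominant identifier style
    clash with the rest of the codebase (camelCase vs snake_case)?"""
    def style_counts(code: str) -> tuple[int, int]:
        idents = re.findall(r"\b[a-z][A-Za-z0-9_]*\b", code)
        camel = sum(1 for i in idents if re.search(r"[a-z][A-Z]", i))
        snake = sum(1 for i in idents if "_" in i)
        return camel, snake

    s_camel, s_snake = style_counts(snippet)
    c_camel, c_snake = style_counts(codebase_sample)
    # Mismatch if the snippet leans one way and the codebase leans the other
    return (s_camel > s_snake and c_snake > c_camel) or \
           (s_snake > s_camel and c_camel > c_snake)
```

So a camelCase PR landing in a snake_case codebase would trip it, while a snippet that matches house style sails through.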
Got a few ideas for where this is headed over the next couple of weeks:
- A proper web based version so you don't have to clone the repo and faff about in a terminal
- GitHub integration so you can point it at a repo and let it chew through the code
- Zip upload, drag a zipped project onto the page and get back a report
- Looking at git history as part of the assessment, because a sudden change in commit style (a dev going from scrappy commits to pristine conventional ones) is itself a tell
- Proper reporting, maybe per-file breakdowns so you can see which bits are sus
- More languages: TypeScript, Rust, PHP, Kotlin, whatever people actually write these days
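The commit-style signal could start as crudely as comparing the ratio of conventional-commit messages before and after some cutoff. A back-of-the-envelope sketch, assuming the commit messages are already fetched (`commit_style_shift` is a made-up name for illustration):

```python
import re

# Conventional-commit prefixes like "feat(auth): ..." or "fix: ..."
CONV = re.compile(r"^(feat|fix|chore|docs|refactor|test)(\(.+\))?:")


def commit_style_shift(messages_before: list[str], messages_after: list[str]) -> bool:
    """Hypothetical signal: did the dev suddenly switch from scrappy
    commit messages to pristine conventional ones?"""
    def conv_ratio(msgs: list[str]) -> float:
        return sum(1 for m in msgs if CONV.match(m)) / len(msgs) if msgs else 0.0

    # A big jump from mostly scrappy to mostly conventional is itself a tell
    return conv_ratio(messages_after) - conv_ratio(messages_before) > 0.5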
No promises on timelines, this is a side project and I've got a day job, but the rough plan is to get the web version going first and then layer the fancier stuff on top.
Look, this tool is not the final word on anything. It's a bunch of regex and heuristics and it will absolutely get things wrong, both ways. It would not hold up in court, it would not hold up in an academic misconduct hearing, and it probably shouldn't be the only thing you lean on when accusing someone of passing off AI code as their own. Use it as a nudge, a "maybe have a closer look at this PR", not as proof of anything.
That said, who knows where this little project takes us. Starts as a weekend hack, next thing you know it's a SaaS. Stranger things have happened.
Do whatever you want with it mate.