"You clicked 'I agree.' We actually read it."
PrivyScan is an AI-powered privacy policy analyzer. It summarizes lengthy privacy policies, labels their important sections, and rates each policy from A to E for privacy friendliness, so users understand what they are agreeing to before they click accept.
PrivyScan simplifies complex privacy policies using AI and machine learning. Users enter a website URL, and the system automatically fetches that site's privacy policy, then summarizes, categorizes, and rates it. The goal is to make online privacy information transparent, accessible, and easy to understand.
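
The flow described above (fetch, summarize, categorize, rate) can be pictured as a chain of interchangeable stages. The following is a minimal sketch, not PrivyScan's actual code: `PolicyReport`, `analyze_policy`, and every stage function are hypothetical names, and the lambdas are stand-ins so the example runs without any model or network access. It also assumes the rating stage sees all chunks at once, which the project may do differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PolicyReport:
    """Result returned for one analyzed policy (hypothetical shape)."""
    url: str
    summaries: List[str]  # one simplified summary per chunk
    labels: List[str]     # one policy category per chunk
    rating: str           # overall grade, "A" through "E"


def analyze_policy(
    url: str,
    fetch: Callable[[str], str],
    chunk: Callable[[str], List[str]],
    summarize: Callable[[str], str],
    classify: Callable[[str], str],
    rate: Callable[[List[str]], str],
) -> PolicyReport:
    """Run the fetch -> chunk -> summarize/classify -> rate flow for one URL."""
    text = fetch(url)
    chunks = chunk(text)
    return PolicyReport(
        url=url,
        summaries=[summarize(c) for c in chunks],
        labels=[classify(c) for c in chunks],
        rating=rate(chunks),
    )


# Stand-in stages: real ones would call the scraper and the three models.
report = analyze_policy(
    "https://example.com/privacy",
    fetch=lambda u: "We collect your email. We share data with partners.",
    chunk=lambda t: [s.strip() + "." for s in t.split(".") if s.strip()],
    summarize=lambda c: c.lower(),
    classify=lambda c: "Third Party Sharing" if "share" in c else "Data Collection",
    rate=lambda cs: "C",
)
print(report.rating, report.labels)
```

Passing the stages in as functions keeps each model swappable and lets the flow be tested without loading any weights.
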
| Task | Model Used |
|---|---|
| Summarization | BART |
| Policy Labelling / Classification | TF-IDF + Logistic Regression |
| Privacy Rating | LegalBERT |
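
One plausible way to turn a rating model's output into a letter grade is sketched below. It assumes the rating task is framed as five-class classification with one raw score (logit) per grade, in the order A to E; that framing, the `GRADES` list, and the function names are illustrative assumptions, and the scores are made up rather than produced by LegalBERT.

```python
import math
from typing import List, Tuple

GRADES = ["A", "B", "C", "D", "E"]  # assumed label order, one class per grade


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_grade(logits: List[float]) -> Tuple[str, float]:
    """Return the most probable grade and its probability."""
    if len(logits) != len(GRADES):
        raise ValueError(f"expected {len(GRADES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return GRADES[best], probs[best]


# Made-up scores for illustration; real values would come from the model.
grade, confidence = logits_to_grade([0.1, 0.3, 2.2, 0.9, -0.5])
print(grade, round(confidence, 2))
```

Returning the probability alongside the grade would let a frontend flag low-confidence ratings instead of presenting every grade as equally certain.
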
| Dataset | Description |
|---|---|
| OPP-115 | Annotated dataset of website privacy policies categorized by privacy practices and data usage. |
| ToS;DR | Community-driven dataset that reviews and rates terms of service and privacy policies. |
- Privacy policy extraction
- Text cleaning
- Chunking large policies into manageable sections
- BART generates simplified summaries for each chunk
- TF-IDF + Logistic Regression classifies chunks into policy categories
- LegalBERT assigns privacy ratings from A–E based on privacy practices
- All models are merged into a single end-to-end processing pipeline
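
The cleaning and chunking steps above can be sketched with the standard library alone. This is an illustrative approach, not the project's implementation: `clean_text`, `chunk_text`, and the `max_words` budget are hypothetical, the regex-based tag stripping is far cruder than a real HTML parser, and a word count is only a rough proxy for the token limit of a model such as BART.

```python
import re
from typing import List


def clean_text(raw: str) -> str:
    """Strip leftover HTML tags and collapse runs of whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Group whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        words = len(sentence.split())
        if words == 0:
            continue
        # Close the current chunk before it would exceed the budget.
        if current and count + words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks


raw = "<p>We collect your email.</p>  <p>We share data with partners. You may opt out.</p>"
for piece in chunk_text(clean_text(raw), max_words=8):
    print(piece)
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each chunk readable on its own, which matters when every chunk is summarized and classified independently.
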
| Component | Platform |
|---|---|
| Frontend | Vercel |
| Backend API | Render |
| ML Inference | Hugging Face Spaces |
| Version Control | GitHub |
| Containerization | Docker |
~ by Team Spaghetti 🍝