FileFolio helps privacy-conscious professionals keep large PDF collections searchable and organized using local AI. No cloud, no telemetry, all on your machine.
Status: Actively maintained, used on my own 1,000+ PDF collection. Expect breaking changes before v1.0, but I'm responsive to issues and feedback.
- You have hundreds of PDF bills, reports, or research papers scattered in folders.
- You care about privacy and do not want to upload them to cloud AI services.
- You still want smart search, auto-tagging, and reasonable file names.
FileFolio watches a folder, uses a local LLM via Ollama to analyze each PDF, and keeps everything searchable in one interface.
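To make the folder-watching idea concrete, here is a minimal polling sketch using only the standard library. It is not FileFolio's actual sync code; the function name `find_new_pdfs` and the in-memory `seen` set are assumptions made for illustration.

```python
from pathlib import Path


def find_new_pdfs(folder: Path, seen: set[str]) -> list[Path]:
    """Return PDFs in `folder` that are not yet in `seen`, and record them.

    Hypothetical helper: a real sync service would likely persist `seen`
    (for example in the database) instead of keeping it in memory.
    """
    new_files = []
    # Note: "*.pdf" is case-sensitive on most Linux filesystems.
    for path in sorted(folder.glob("*.pdf")):
        key = str(path.resolve())
        if key not in seen:
            seen.add(key)
            new_files.append(path)
    return new_files
```

Calling this function on a timer (or from a filesystem-event callback) yields only the files that have appeared since the previous call.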
- Automatic organization – watches a folder and imports new PDFs, extracting text (with OCR), then generating categories and tags
- Privacy-first – all processing happens locally with Ollama, no cloud services, no telemetry or analytics
- Fast retrieval – full-text search across content and metadata, plus thumbnail previews
- Disaster-proof – back up and restore your entire library via ZIP
- Multi-language support – UI available in multiple languages
- Dark mode – toggle between light and dark themes
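As an illustration of the ZIP backup and restore feature, the sketch below archives a library directory and unpacks it again with Python's `zipfile` module. The function names and the single-directory layout are assumptions; FileFolio's own backup format may differ.

```python
import zipfile
from pathlib import Path


def backup_library(library_dir: Path, archive_path: Path) -> int:
    """Write every file under `library_dir` into a ZIP; return the file count.

    Assumes `archive_path` lives outside `library_dir`, so the archive
    does not try to include itself.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(library_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the library root.
                zf.write(path, path.relative_to(library_dir).as_posix())
                count += 1
    return count


def restore_library(archive_path: Path, target_dir: Path) -> None:
    """Unpack a backup archive into `target_dir`, creating it if needed."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
```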
- Python 3.10+
- Ollama installed locally
- Poppler (for PDF processing)
  - macOS: brew install poppler
  - Ubuntu/Debian: apt-get install poppler-utils
  - Windows: Download from poppler releases
- Tesseract (for OCR on scanned documents)
  - macOS: brew install tesseract
  - Ubuntu/Debian: apt-get install tesseract-ocr
  - Windows: Download from Tesseract releases
- Clone the repository
    git clone https://github.com/imkrishsub/filefolio.git
    cd filefolio
- Create and activate virtual environment
    python3 -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
    pip install -r requirements.txt
- Start Ollama (in a separate terminal)
    ollama serve
- Run the application
    python backend/main.py
- Open your browser
    Navigate to: http://127.0.0.1:8000
Set a custom port using the PORT environment variable:
    PORT=8080 python backend/main.py
Run the test suite:
    pytest
Full API and functionality coverage including unit tests, integration tests, and frontend tests.
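For readers curious how a PORT override is typically handled, here is a small sketch. The helper `resolve_port` is hypothetical, and the default of 8000 is taken from the URL given in the setup steps; how `backend/main.py` actually reads the variable is not shown in this README.

```python
import os


def resolve_port(default: int = 8000) -> int:
    """Return the port from the PORT environment variable, or `default`.

    Hypothetical helper, not FileFolio's actual code.
    """
    raw = os.environ.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

A FastAPI application would usually pass the result to its server at startup, for example as the `port` argument of `uvicorn.run`.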
filefolio/
├── backend/
│   ├── main.py              # FastAPI server
│   └── sync_service.py      # Folder sync service
├── frontend/
│   ├── static/
│   │   ├── app.js           # Frontend JavaScript
│   │   ├── style.css        # Styles
│   │   └── i18n.json        # Translations
│   └── templates/
│       └── index.html       # Main interface
├── tests/                   # Test suite
├── uploads/                 # PDF storage (created on first run)
├── thumbnails/              # Document thumbnails (created on first run)
├── data/                    # Database (created on first run)
├── setup.cfg                # Linting and tool configuration
├── pytest.ini               # Test configuration
└── requirements.txt
- Upload - Drag and drop a PDF file into the web interface, or sync a local folder to automatically import new files
- Extract - Text is extracted from the PDF (with OCR fallback for scanned documents)
- Analyze - A local LLM analyzes the content to determine a category and tags, and to suggest a filename
- Organize - The document is saved with metadata in a local SQLite database
- Search - Find documents by content, category, tags, or filename
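The Analyze step above can be sketched against Ollama's local HTTP API (`POST /api/generate` on the default port 11434). This is an illustrative sketch, not FileFolio's implementation: the prompt wording, the model name `llama3`, the 4,000-character truncation, and the fallback values are all assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(text: str, model: str = "llama3") -> dict:
    """Build a non-streaming, JSON-mode request body (model name is an assumption)."""
    prompt = (
        "Classify this document. Reply with JSON containing the keys "
        '"category" (string), "tags" (list of strings), and "filename" (string).\n\n'
        + text[:4000]  # truncate long documents; the limit is arbitrary
    )
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def parse_analysis(response_body: str) -> dict:
    """Extract category, tags, and filename from an /api/generate reply."""
    outer = json.loads(response_body)
    inner = json.loads(outer["response"])  # the model's text is itself JSON
    return {
        "category": str(inner.get("category", "Uncategorized")),
        "tags": [str(t) for t in (inner.get("tags") or [])],
        "filename": str(inner.get("filename", "untitled.pdf")),
    }


def analyze(text: str, model: str = "llama3") -> dict:
    """Send extracted text to a locally running Ollama and parse the result."""
    data = json.dumps(build_request(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return parse_analysis(resp.read().decode("utf-8"))
```

Keeping `parse_analysis` separate from the network call makes the parsing logic easy to test without a running Ollama instance.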
- Backend: FastAPI (Python)
- Frontend: Vanilla JavaScript
- Database: SQLite
- AI/LLM: Ollama
- PDF Processing: PyPDF, pdf2image, pytesseract
- Styling: Custom CSS
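To show how the SQLite layer in this stack could support metadata storage and search, here is a minimal sketch with the standard `sqlite3` module. The table name, columns, and `LIKE`-based matching are assumptions for illustration; FileFolio's real schema and search implementation are not documented here.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create a hypothetical `documents` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, filename TEXT, category TEXT, "
        "tags TEXT, content TEXT)"
    )
    return conn


def add_document(conn: sqlite3.Connection, filename: str, category: str,
                 tags: list[str], content: str) -> int:
    """Insert one document; tags are stored as a comma-separated string."""
    cur = conn.execute(
        "INSERT INTO documents (filename, category, tags, content) "
        "VALUES (?, ?, ?, ?)",
        (filename, category, ",".join(tags), content),
    )
    conn.commit()
    return cur.lastrowid


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return filenames whose content or metadata contains `query`."""
    pattern = f"%{query}%"
    rows = conn.execute(
        "SELECT filename FROM documents "
        "WHERE content LIKE ? OR filename LIKE ? "
        "OR category LIKE ? OR tags LIKE ? ORDER BY filename",
        (pattern, pattern, pattern, pattern),
    )
    return [row[0] for row in rows]
```

For large collections, SQLite's FTS5 extension would likely be a better fit than `LIKE`, which scans every row.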
Contributions are welcome! Please feel free to submit a pull request or open an issue.
MIT License - see LICENSE file for details.
