An AI-powered learning resource discovery system that maps homework questions to curated educational resources through an interactive visual network.
Built for HackHoya 2026 🏆
This app does NOT solve homework — it helps students find the right resources to learn from.
- ✅ Recommends external learning resources (Khan Academy, MIT OCW, 3Blue1Brown, etc.)
- ✅ Discovers prerequisite learning paths using vector geometry
- ✅ Automatically clusters related questions semantically
- ✅ Supports photo upload with OCR (handwriting supported!)
- ✅ Visualizes question-resource relationships as an interactive graph
- ✅ Saves search history for logged-in users
- ❌ Does NOT generate answers or step-by-step solutions
- ❌ Does NOT act as a tutor or chatbot
Uses vector geometry on concept embeddings to automatically infer prerequisite relationships. No manual knowledge engineering required - the system discovers that "algebra → functions → derivatives" by analyzing the geometric structure of embedding space.
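As a purely illustrative sketch of the idea (this is a simplified heuristic invented for this example, not the project's actual algorithm), a prerequisite chain can be built by hopping from a target concept to its nearest unvisited neighbor in embedding space:

```typescript
// Simplified sketch: infer a prerequisite chain by repeatedly hopping to the
// nearest unvisited concept in embedding space. Hypothetical helper only.
type ConceptEmbedding = { name: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function prerequisiteChain(
  target: string,
  corpus: ConceptEmbedding[],
  hops: number
): string[] {
  const chain = [target];
  const visited = new Set([target]);
  // Assumes the target concept exists in the corpus.
  let current = corpus.find((c) => c.name === target)!;
  for (let i = 0; i < hops; i++) {
    const candidates = corpus.filter((c) => !visited.has(c.name));
    if (candidates.length === 0) break;
    // Hop to the most similar unvisited concept and prepend it.
    const next = candidates.reduce((best, c) =>
      cosineSimilarity(c.vector, current.vector) >
      cosineSimilarity(best.vector, current.vector) ? c : best
    );
    chain.unshift(next.name);
    visited.add(next.name);
    current = next;
  }
  return chain; // ordered from earliest prerequisite to target
}
```

With toy embeddings where "derivatives" sits near "functions", which in turn sits near "algebra", the chain comes out in prerequisite order.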
Upload a photo of handwritten or typed homework and watch as Gemini Vision extracts the text automatically. Works with equations, diagrams, and messy handwriting.
Automatically groups related questions into semantic clusters with AI-generated names and color-coded visual regions in the graph.
Unlike static resource databases, this searches the entire web in real-time. Works for ANY topic - from quantum physics to React hooks. The system finds resources, ranks them intelligently, then applies the smart organization features above.
See WEB_SEARCH_ARCHITECTURE.md for technical details.
- Ingestion - Parse and split homework into individual questions
- Concept Extraction - Use LLM to identify key concepts (not answers!)
- Question Embedding - Generate vector representations
- Learning Path Discovery - Find prerequisites using embedding geometry (NEW!)
- Resource Retrieval - Find candidate resources via vector similarity
- Ranking - Deterministic scoring based on:
- Embedding similarity (40%)
- Concept overlap (30%)
- Level appropriateness (20%)
- Format diversity (10%)
- Explanation - LLM generates 2-3 sentences on why each resource fits
- Clustering - Group related questions semantically (NEW!)
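The deterministic scorer in step 6 can be pictured as one pure function combining the four weighted signals; the field names below are illustrative assumptions, and the real implementation lives in `lib/pipeline/ranking.ts`:

```typescript
// Deterministic scoring sketch using the documented weights.
// Field names are illustrative; see lib/pipeline/ranking.ts for the real code.
interface ScoreInputs {
  similarityScore: number;      // cosine similarity, 0..1
  conceptOverlap: number;       // fraction of question concepts covered, 0..1
  levelMatch: number;           // 1 if the level fits, partial credit otherwise
  formatDiversityBonus: number; // 0..1, higher when the format is underrepresented
}

function scoreResource(s: ScoreInputs): number {
  return (
    s.similarityScore * 0.4 +      // embedding similarity (40%)
    s.conceptOverlap * 0.3 +       // concept overlap (30%)
    s.levelMatch * 0.2 +           // level appropriateness (20%)
    s.formatDiversityBonus * 0.1   // format diversity (10%)
  );
}
```

Because there is no LLM call in this step, the same inputs always produce the same ranking.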
- Framework: Next.js 14 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- UI: Custom SpiderWeb graph visualization with cluster regions
- Vector Search: In-memory cosine similarity
- Embeddings: Google `text-embedding-004`
- LLM: Google Gemini `gemini-1.5-flash` (text + vision)
- Auth & Database: Supabase
- Knowledge Graph: Auto-constructed from resource corpus
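Since vector search is plain in-memory cosine similarity over a small corpus, retrieval reduces to a brute-force top-k scan. A minimal self-contained sketch (types and names are assumptions, not the actual code in `lib/pipeline/retrieval.ts`):

```typescript
// Brute-force top-k retrieval over in-memory embeddings (illustrative sketch).
type Embedded = { id: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], corpus: Embedded[], k: number) {
  return corpus
    .map((r) => ({ id: r.id, score: cosine(query, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

For a corpus of a few dozen resources, an O(n) scan per query is fast enough that a dedicated vector database would be overkill.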
- Node.js 18+
- Google Gemini API key (Get one here)
- Supabase project (optional, for auth & history)
1. Clone and navigate to the project:

```bash
cd HackHoya
```

2. Install dependencies:

```bash
npm install
```

3. Create `.env.local` from the template:

```bash
cp env.template .env.local
```

4. Fill in your API keys in `.env.local`:

```bash
GEMINI_API_KEY=your-gemini-api-key
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co   # Optional
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key                 # Optional
```

5. Run the development server:

```bash
npm run dev
```

See TESTING_GUIDE.md for detailed testing instructions.
On the first API call, the system will:
- Load 27 seed resources (Khan Academy, MIT OCW, 3Blue1Brown, etc.)
- Generate embeddings for all resources (~30 seconds)
- Cache everything in memory
Subsequent requests will be much faster.
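That warm-up behavior amounts to a module-level cache; a simplified sketch (names and shapes are assumptions, not the project's actual code):

```typescript
// Module-level cache so resource embeddings are generated once per process.
type Resource = { id: string; text: string };
type EmbeddedResource = { id: string; embedding: number[] };

let cache: EmbeddedResource[] | null = null;

async function getEmbeddedCorpus(
  resources: Resource[],
  embed: (text: string) => Promise<number[]>
): Promise<EmbeddedResource[]> {
  if (cache) return cache; // warm path: reuse previously generated embeddings
  cache = await Promise.all(
    resources.map(async (r) => ({ id: r.id, embedding: await embed(r.text) }))
  );
  return cache;
}
```

The first call pays the embedding cost; every later call returns the cached array immediately.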
```
homework-router/
├── app/
│   ├── api/process/route.ts      # Main API endpoint
│   ├── layout.tsx                # Root layout
│   ├── page.tsx                  # Home page with upload + results
│   └── globals.css               # Global styles
├── lib/
│   ├── types.ts                  # TypeScript types
│   ├── corpus.ts                 # Resource corpus manager
│   ├── llm/
│   │   └── client.ts             # Abstracted LLM client
│   ├── embeddings/
│   │   └── utils.ts              # Vector operations
│   ├── data/
│   │   └── resources.ts          # Seed resource corpus (27 items)
│   └── pipeline/
│       ├── index.ts              # Main orchestrator
│       ├── ingest.ts             # Question splitting
│       ├── concepts.ts           # LLM concept extraction
│       ├── retrieval.ts          # Vector search
│       ├── ranking.ts            # Deterministic scoring
│       └── explain.ts            # LLM explanation generation
├── public/
│   └── example-homework.txt      # Demo homework
└── package.json
```
Try pasting this example:
1. Find the derivative of f(x) = 3x² + 2x - 5
2. Explain the difference between kinetic and potential energy
3. What is the time complexity of binary search?
Expected output:
- 3 questions detected
- Concepts extracted for each (e.g., "derivatives", "power rule", "kinetic energy")
- Top 5 resources per question with relevance scores
- Explanations focus on what the resource teaches, NOT how to solve
The current UI is minimal and functional. For your spiderweb navigation concept:
- Question results are currently linear cards
- Consider: radial/graph layout with questions as nodes
- Shared concepts could link questions visually
- Resources could branch out from each question node
Suggested libraries for spiderweb UI:
- `react-force-graph` for force-directed graphs
- `vis-network` for interactive network visualization
- `d3.js` for custom graph rendering
Edit `lib/data/resources.ts` and add entries:

```typescript
{
  id: "unique-id",
  title: "Resource Title",
  url: "https://...",
  format: "video" | "text" | "textbook",
  level: "intro" | "intermediate" | "advanced",
  concepts: ["concept1", "concept2"],
  description: "Brief description",
}
```

Restart the server to regenerate embeddings.
Edit `lib/pipeline/ranking.ts`:

```typescript
// Current weights
score += candidate.similarityScore * 0.4; // Embedding similarity
score += overlap * 0.3;                   // Concept overlap
score += levelScore * 0.2;                // Level match
// + 0.1 format diversity bonus
```

The `LLMClient` class in `lib/llm/client.ts` is abstracted and currently uses Gemini. To add OpenAI:

1. Install the `openai` package
2. Implement `openaiChat()` and `openaiEmbed()` methods
3. Update the constructor to handle `provider: "openai"`
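A rough sketch of what that constructor/dispatch change could look like. The synchronous stubs and their return values are invented purely to show the shape; only the `openaiChat()` method name comes from the steps above:

```typescript
// Illustrative sketch of provider dispatch in an abstracted LLM client.
// Stubs return labeled strings so the dispatch is visible; real methods
// would call the Gemini / OpenAI SDKs asynchronously.
type Provider = "gemini" | "openai";

class LLMClient {
  constructor(private provider: Provider = "gemini") {}

  chat(prompt: string): string {
    switch (this.provider) {
      case "gemini":
        return this.geminiChat(prompt);
      case "openai":
        return this.openaiChat(prompt);
    }
  }

  private geminiChat(prompt: string): string {
    return "gemini:" + prompt; // stand-in for the existing Gemini call
  }

  private openaiChat(prompt: string): string {
    return "openai:" + prompt; // stand-in for a new OpenAI call
  }
}
```

Keeping the dispatch in one place means the pipeline code never needs to know which provider is active.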
Request:

```json
{
  "content": "1. What is...?\n2. Explain...",
  "contentType": "text"
}
```

Response:

```json
{
  "questions": [
    {
      "questionText": "What is...?",
      "detectedConcepts": ["concept1", "concept2"],
      "questionType": "conceptual",
      "resources": [
        {
          "resource": { "id": "...", "title": "...", ... },
          "score": 0.92,
          "reason": "This resource covers..."
        }
      ]
    }
  ]
}
```

| Constraint | How Enforced |
|---|---|
| No homework solutions | LLM system prompts explicitly forbid solutions |
| Ranking before explanation | Pipeline order is hardcoded: retrieve → rank → explain |
| Deterministic ranking | Pure function with explicit weights (no LLM) |
| Resource-focused output | UI shows only resources, no answer input fields |
MIT License - feel free to use for your hackathon!
This is a hackathon-scoped project. For production use, consider:
- Adding a vector database (Pinecone, pgvector) for larger corpora
- Caching LLM responses to reduce costs
- Rate limiting and error handling
- User feedback on resource quality
- Mobile-responsive improvements
- Share/export functionality