SnapMenu allows you to snap a photo of a restaurant menu, intelligently extract the dish names, and then see what those dishes look like! It uses a Flutter frontend and a Python FastAPI backend powered by Google Cloud Vision API, Google Custom Search, and Gemini for an enhanced user experience.
- Menu OCR: Upload a menu image to extract all visible text.
- AI-Powered Dish Identification: Leverages a Gemini model to intelligently filter OCR'd text and identify actual dish names, excluding headers, descriptions, and other non-dish text.
- Smarter Dish Image Search: Uses Gemini to generate optimized search queries for Google Custom Search to find more relevant images of prepared dishes.
- CORS-Friendly Image Proxy: Backend includes an image proxy to securely fetch and display images from various external sources in the Flutter web app, avoiding common CORS issues.
- Interactive Flutter UI: A user-friendly interface built with Flutter for seamless menu uploads and dish image Browse.
- Frontend: Flutter
- Backend: Python, FastAPI
- AI/ML Services:
- Google Cloud Vision API (for OCR)
- Google Custom Search Engine API (for image lookups)
- Google Gemini (via
google-generativeaiSDK for text processing and query enhancement)
- Backend Development Tools:
python-dotenv,requests,httpx,uvicorn
The backend handles image processing, OCR, AI-based text filtering, and image searching.
-
Navigate to the backend directory:
cd path/to/your/project/adk_backend -
Create and activate a Python virtual environment:
python -m venv .venv # Windows .\.venv\Scripts\activate # macOS/Linux source .venv/bin/activate
-
Install Dependencies: Make sure you have a
requirements.txtfile in youradk_backenddirectory. It should include:fastapi uvicorn[standard] python-dotenv requests google-generativeai # google-cloud-aiplatform (if needed by your ADK version, otherwise google-generativeai is primary for direct Gemini calls) httpx # any other specific google.adk.agents dependencies
Then install:
pip install -r requirements.txt
-
Environment Variables (
.envfile) 🔑: This project requires API keys for Google Cloud services. A.envfile is included in theadk_backenddirectory with placeholder values. You MUST replace these placeholders with your actual API keys.Create or ensure your
.envfile in theadk_backenddirectory looks like this:VISION_API_KEY=YOUR_GOOGLE_VISION_API_KEY_HERE CSE_API_KEY=YOUR_GOOGLE_CUSTOM_SEARCH_API_KEY_HERE CSE_CX=YOUR_CUSTOM_SEARCH_ENGINE_ID_HERE GEMINI_API_KEY=YOUR_GOOGLE_GEMINI_API_KEY_HERE
VISION_API_KEY: For Google Cloud Vision API (OCR).CSE_API_KEY: For Google Custom Search Engine API.CSE_CX: Your Custom Search Engine ID (configure this in the Google CSE control panel to search images).GEMINI_API_KEY: For the Gemini API (used for intelligent filtering and query generation). Ensure this key has the "Generative Language API" or "Vertex AI API" enabled as appropriate for the SDK usage.
-
Run the FastAPI Server:
uvicorn agent:app --reload --host 0.0.0.0 --port 8000
The backend should now be running, typically accessible at
http://127.0.0.1:8000from the host machine.
The Flutter app provides the user interface.
-
Navigate to your Flutter project directory.
-
Environment Variables (
.envfile for Flutter): The Flutter app needs to know the backend's base URL. Create a.envfile in the root of your Flutter project with the following (adjust the URL as needed):API_BASE_URL=[http://127.0.0.1:8000](http://127.0.0.1:8000)
- For Android Emulator (if backend is on host localhost): Use
API_BASE_URL=http://10.0.2.2:8000 - For iOS Simulator (if backend is on host localhost):
API_BASE_URL=http://localhost:8000works. - For physical device testing: Use your computer's LAN IP address (e.g.,
http://192.168.1.X:8000) and ensure your backend Uvicorn server is running on0.0.0.0.
- For Android Emulator (if backend is on host localhost): Use
-
Get Flutter Packages:
flutter pub get
-
Run the App:
flutter run
Select your desired device (mobile emulator/simulator, physical device, or web browser).
- API Keys: The application will not work without valid API keys for Google Cloud services correctly set in the backend's
.envfile. - CORS Configuration: The backend (
agent.py) currently usesallow_origins=["*"]for CORS. This is permissive and suitable for development. For a production environment, you MUST restrictallow_originsto the specific domain(s) of your deployed Flutter application to enhance security. - Custom Search Engine (CSE): Ensure your Google Custom Search Engine (identified by
CSE_CX) is configured to search the entire web for images or specific sites that are good sources for dish photos.
- Launch the Flutter Application: Ensure your backend server is running, then start the Flutter app on your chosen device/emulator or web browser.
- Upload a Menu:
- On the main screen, you'll find options to "Take Photo" of a menu or select an image from your "Gallery".
- After selecting an image, it will be uploaded to the backend for processing.
- View Extracted Dishes:
- The backend will perform OCR and then use Gemini to intelligently filter and identify dish names from the menu.
- The list of identified dish names will be displayed in the app.
- See Dish Images:
- Tap on any dish name from the list.
- The app will request an image for that dish from the backend. The backend uses an LLM-enhanced Google Custom Search query to find a relevant photo.
- The image (or a "preview not available" message if an image can't be found/loaded) will be displayed.
- Navigate: Use the "Back to Menu" button to return from the dish image view to the list of extracted dishes. You can clear the current menu image to upload a new one.
The FastAPI backend exposes the following main endpoints (running on http://127.0.0.1:8000 by default):
-
POST /upload_menu/:- Request:
multipart/form-datawith afilefield containing the menu image. - Response: JSON object with
statusanditems(a list of identified dish names) or an error message.{ "status": "success", // or "success_ocr_only" or "error" "items": ["Spaghetti Carbonara", "Caesar Salad", ...], "message": "Optional message" // e.g., if LLM filtering was skipped }
- Request:
-
POST /get_dish_image/:- Request: JSON body with the dish name.
{ "dish": "Spaghetti Carbonara" } - Response: JSON object with
statusandurl(the URL of the dish image found by Google Custom Search) or an error message.{ "status": "success", // or "error" "url": "[http://example.com/path/to/image.jpg](http://example.com/path/to/image.jpg)" }
- Request: JSON body with the dish name.
-
GET /proxy_image/:- Request: Query parameter
image_urlcontaining the URL-encoded original image URL. Example:/proxy_image/?image_url=https%3A%2F%2Fexternal.com%2Fimage.jpg - Response: Streams the image content directly with the appropriate
Content-Type, or an HTTP error if the image cannot be fetched or is not a valid image type.
- Request: Query parameter
-
RuntimeError: ... API_KEY not set in environment:- Ensure the respective API key (
VISION_API_KEY,CSE_API_KEY,CSE_CX,GEMINI_API_KEY) is correctly set in the.envfile within youradk_backenddirectory. - Make sure
load_dotenv()is called at the very beginning of youragent.py. - Restart your Uvicorn server after modifying the
.envfile.
- Ensure the respective API key (
-
Flutter app can't connect to the backend /
API_BASE_URLissues:- If using an Android Emulator, ensure
API_BASE_URLin the Flutter app's.envfile is set tohttp://10.0.2.2:8000(if your backend is onlocalhost:8000). - If using a physical device, ensure your backend Uvicorn server is running on
0.0.0.0(e.g.,uvicorn agent:app --host 0.0.0.0 --port 8000) and your Flutter app uses your computer's LAN IP address forAPI_BASE_URL. - Check for firewall issues that might be blocking connections.
- If using an Android Emulator, ensure
-
CORS errors in Flutter Web:
- Ensure
CORSMiddlewareis correctly configured inagent.pyin your FastAPI backend. For development,allow_origins=["*"]is common, but for production, specify your Flutter web app's actual domain(s). - The image proxy (
/proxy_image/) is designed to solve CORS for images from external sites. If you get CORS errors for your own backend endpoints, it's theCORSMiddlewareon your FastAPI app that needs checking.
- Ensure
-
Irrelevant dish images / No image found:
- The quality of image search depends on the Google Custom Search Engine and the queries generated by Gemini.
- You can experiment with the
query_generation_promptwithin thesearch_imagefunction inagent.pyto try and improve query effectiveness. - Ensure your
CSE_CX(Custom Search Engine ID) is configured correctly and is set to search the web for images. - Some dishes may genuinely have few good, publicly available images.
-
415 Unsupported Media Typefor some images (from/proxy_image/):- This is expected behavior if the URL returned by Google Custom Search points to an HTML page (like some Instagram links) instead of a direct image file. The proxy correctly identifies this and stops, preventing an error in your Flutter app. The Flutter app's
errorWidgetforCachedNetworkImageshould handle this gracefully.
- This is expected behavior if the URL returned by Google Custom Search points to an HTML page (like some Instagram links) instead of a direct image file. The proxy correctly identifies this and stops, preventing an error in your Flutter app. The Flutter app's
- Improved Dish Image Selection: Fetch multiple image results from CSE and use image analysis or a multimodal LLM to select the best, most representative "prepared dish" photo.
- User Feedback for Images: Allow users to report if an image is incorrect.
- Caching: Implement caching for OCR results or image search results in the backend to reduce API calls and latency for frequently accessed menus/dishes.
- More Detailed Dish Information: Extend to fetch dish descriptions, ingredients, or even nutritional information.
- UI/UX Polish: Add animations, improved loading states, and more refined UI elements in the Flutter app.
- Localization/Multiple Languages: Support for menus in different languages.
- Price Extraction: Attempt to extract and display prices alongside dish names.
- User Accounts & Saved Menus: Allow users to save processed menus.
- Email: You can reach me at
[me@desouki.com]. - GitHub Profile: Desouki27 (You can also contact me via my GitHub profile).
- LinkedIn: Mohamed Desouki