A sophisticated AI agent featuring multiple LLM provider support (Google Gemini, OpenAI, Groq-DeepSeek), optimized memory management, and real-time streaming capabilities through a Flask API. The agent specializes in providing city information including weather, time, and facts.
- Multi-LLM Support: Support for Google Gemini, OpenAI, and Groq-DeepSeek models
- LLM Factory Pattern: Easy switching between different LLM providers
- City Information Tools: Weather, time, facts, and city visit planning tools
- Real-time Streaming: Token-by-token streaming responses via Server-Sent Events (SSE)
- Session Management: Multiple conversation sessions with isolated memory
- Memory Optimization: Automatic memory cleanup to prevent memory bloat
- Dynamic Provider Switching: Switch between LLM providers at runtime
- Web Interface: Modern React-based frontend for interaction
- API Endpoints: RESTful API for provider management and chat
- LLM Factory: Factory pattern for creating different LLM providers
- TripAgent: Main agent class with memory management
- Multi-Provider Support: Google Gemini, OpenAI, and Groq-DeepSeek integration
- Flask API: RESTful API with streaming endpoints
- Session Management: Thread-safe session handling
- City Information Tools: Specialized tools for city-related queries
- Session-based Memory: Each conversation session has isolated memory
- Automatic Optimization: Keeps only recent messages (configurable limit)
- Thread-safe Operations: Concurrent session handling with locks
- Memory Status Monitoring: Real-time memory usage tracking
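The LLM factory pattern mentioned above can be sketched in plain Python. This is an illustrative stand-in, not the project's actual code: the `LLMClient` and `LLMFactory` names are assumptions, and only the provider/model defaults mirror the README's configuration.

```python
# Illustrative sketch of the LLM factory pattern described above; class and
# method names here are assumptions, not the project's actual code.
from typing import Optional

class LLMClient:
    """Stand-in for a provider-specific chat model client."""
    def __init__(self, provider: str, model: str):
        self.provider = provider
        self.model = model

class LLMFactory:
    """Maps a provider name to a configured client, mirroring the
    README's provider/model defaults."""
    DEFAULT_MODELS = {
        "google_gemini": "gemini-1.5-flash",
        "openai": "gpt-3.5-turbo",
        "groq": "deepseek-r1-distill-llama-70b",
    }

    @classmethod
    def create(cls, provider: str, model: Optional[str] = None) -> LLMClient:
        if provider not in cls.DEFAULT_MODELS:
            raise ValueError(f"Unknown provider: {provider}")
        return LLMClient(provider, model or cls.DEFAULT_MODELS[provider])

# Runtime provider switching is then just another create() call:
llm = LLMFactory.create("groq")
print(llm.model)  # deepseek-r1-distill-llama-70b
```

Centralizing construction this way is what makes the `/llm/switch` endpoint cheap: the server only has to call the factory with a new provider name.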
- Python 3.8+
- At least one LLM provider API key:
- Google API Key (for Gemini access)
- OpenAI API Key (for GPT models)
- Groq API Key (for DeepSeek models)
- Modern web browser (for the client interface)
1. Clone or navigate to the project directory:

   ```bash
   cd /Users/davidbong/Documents/agentic_projects/memory_agentic
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables:

   ```bash
   cp .env.example .env
   ```

   Edit `.env` and add your API keys (at least one is required):

   ```bash
   # LLM Provider Configuration
   GOOGLE_API_KEY=your_google_api_key_here
   OPENAI_API_KEY=your_openai_api_key_here
   GROQ_API_KEY=your_groq_api_key_here

   # Default provider (google_gemini, openai, or groq)
   DEFAULT_LLM_PROVIDER=google_gemini

   # Model configurations
   GOOGLE_MODEL=gemini-1.5-flash
   OPENAI_MODEL=gpt-3.5-turbo
   GROQ_MODEL=deepseek-r1-distill-llama-70b
   ```
**Google Gemini**
- Go to Google AI Studio
- Create a new API key
- Copy the key and add it to your `.env` file

**OpenAI**
- Go to OpenAI Platform
- Create a new API key
- Copy the key and add it to your `.env` file

**Groq**
- Go to Groq Console
- Create a new API key
- Copy the key and add it to your `.env` file
Run the Flask server:

```bash
python app.py
```

The server will start on http://localhost:5000.

- Open `client.html` in your web browser
- The interface will automatically connect to the server
- Start chatting with the AI agent
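The client receives each response token-by-token over Server-Sent Events. The sketch below shows how such a stream is parsed; the exact payload shape (`data: {"token": ...}`) is an assumption for illustration, not the server's documented wire format.

```python
# Minimal parser for an SSE token stream like the one the /chat endpoint
# returns. The 'data: {"token": ...}' payload shape is an assumption.
import json

sample_stream = (
    'data: {"token": "The"}\n\n'
    'data: {"token": " weather"}\n\n'
    'data: {"token": " in Paris..."}\n\n'
)

def parse_sse(raw: str):
    """Yield the decoded JSON payload of each 'data:' event."""
    for block in raw.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                yield json.loads(line[len("data:"):].strip())

tokens = [event["token"] for event in parse_sse(sample_stream)]
print("".join(tokens))  # The weather in Paris...
```

In the browser, the bundled client would do the equivalent with the `EventSource` API or a streaming `fetch` reader.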
Send a chat message (streaming response):

```
POST /chat
Content-Type: application/json

{
  "message": "What's the weather like in Paris?",
  "session_id": "user123"
}
```

List the available providers:

```
GET /llm/providers
```

Switch the active provider at runtime:

```
POST /llm/switch
Content-Type: application/json

{
  "provider": "openai"
}
```

Inspect or clear a session's memory, or check server health:

```
GET /memory/status/{session_id}
DELETE /memory/clear/{session_id}
GET /health
```

You can adjust memory settings in `app.py`:
```python
# Maximum messages to keep in memory per session
self._optimize_memory(session_id, max_messages=20)
```

The underlying LLM is configured where the agent is created:

```python
self.llm = ChatGoogleGenerativeAI(
    model="gemini-pro",
    temperature=0.7,  # Adjust creativity
    streaming=True
)
```

The agent comes with built-in tools:
- Search Tool: Simulated web search functionality
- Calculator Tool: Mathematical calculations
- Memory Info Tool: Conversation memory statistics
Extend the `_create_tools()` method in `MemoryOptimizedAgent`:

```python
def custom_tool(input_text: str) -> str:
    """Your custom tool implementation"""
    return f"Custom result for: {input_text}"

tools.append(Tool(
    name="custom_tool",
    description="Description of what your tool does",
    func=custom_tool
))
```

- Each session has its own `ChatMessageHistory`
- Sessions are identified by unique session IDs
- Memory is isolated between different users/sessions
- Prevents memory bloat by limiting message history
- Configurable message limits per session
- Maintains conversation context while managing resources
- Uses threading locks for concurrent access
- Safe for multiple simultaneous users
- Session data integrity guaranteed
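The memory model described above (isolated per-session histories, a configurable message cap, lock-guarded access) can be sketched in plain Python. This is a simplified stand-in; per the README, the real agent stores messages in LangChain's `ChatMessageHistory` rather than plain lists.

```python
# Simplified sketch of session-isolated, size-limited, lock-guarded memory.
# A stand-in for the agent's actual implementation, which uses a
# ChatMessageHistory per session rather than plain lists.
import threading

class SessionMemory:
    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages
        self._sessions = {}            # session_id -> list of messages
        self._lock = threading.Lock()  # guards concurrent access

    def add_message(self, session_id: str, message: str) -> None:
        with self._lock:
            history = self._sessions.setdefault(session_id, [])
            history.append(message)
            # Optimization step: keep only the most recent messages
            if len(history) > self.max_messages:
                del history[:-self.max_messages]

    def status(self, session_id: str) -> dict:
        with self._lock:
            history = self._sessions.get(session_id, [])
            return {"messages": len(history), "limit": self.max_messages}

mem = SessionMemory(max_messages=3)
for i in range(5):
    mem.add_message("user123", f"msg {i}")
print(mem.status("user123"))  # {'messages': 3, 'limit': 3}
```

Trimming inside the same lock that guards writes is what keeps the history consistent when multiple requests hit the same session concurrently.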
Monitor memory usage through the API:

```bash
curl http://localhost:5000/memory/status/your_session_id
```

The Flask app runs in debug mode and provides detailed logs:
- Agent reasoning steps
- Tool executions
- Memory operations
- API requests
- "Google API Key not found"
  - Ensure your `.env` file contains the correct API key
  - Verify the key is valid and has Gemini API access
- "Connection refused"
  - Make sure the Flask server is running
  - Check if port 5000 is available
- "Streaming not working"
  - Ensure your browser supports Server-Sent Events
  - Check browser console for JavaScript errors
- "Memory not persisting"
  - Verify session IDs are consistent
  - Check server logs for memory optimization triggers

Performance tips:
- Adjust `max_messages` based on your memory requirements
- Use shorter session IDs for better performance
- Monitor memory usage in production environments
- Consider implementing persistent storage for long-term memory
- Persistent memory storage (Redis/Database)
- Advanced memory summarization
- Custom tool marketplace
- Multi-modal capabilities
- Advanced session analytics
- WebSocket support for real-time updates
This project is open source and available under the MIT License.
Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.
Built with ❤️ using LangChain, Google Gemini, and Flask

---

# Trip Advisor - AI Agent
An intelligent AI agent that helps users gather factual information about cities worldwide. This assistant demonstrates advanced agentic capabilities including tool orchestration, function calling, contextual dialogue handling, streaming API interface, and transparent reasoning.
- Weather Information: Get current weather conditions for any city
- Local Time: Check the current time in different cities worldwide
- City Facts: Learn about city demographics, location, and interesting facts
- Visit Planning: Comprehensive city visit planning using multiple tools
- Tool Orchestration: Seamless integration of multiple specialized tools
- Function Calling: Structured output with reasoning transparency
- Multi-turn Dialogue: Context-aware conversations with memory
- Streaming API: Real-time response streaming
- Transparent Reasoning: See the agent's thinking process
- LangChain React Agent: Advanced reasoning and tool orchestration
- Google Gemini Integration: Powered by Gemini-1.5-flash model
- Memory Management: Persistent conversation history
- RESTful API: Clean endpoints for frontend integration
- Modern UI: Beautiful, responsive interface
- Real-time Chat: Streaming responses with typing indicators
- Session Management: Multiple conversation sessions
- Memory Visualization: View conversation history and memory status
- Python 3.8+
- Node.js 18+
- Google API Key (for Gemini)
1. Clone and navigate to the project:

   ```bash
   git clone https://github.com/bitlabsdevteam/memory_agent.git
   cd memory_agent
   ```

2. Set up environment:

   ```bash
   cp .env.example .env
   # Edit .env and add your GOOGLE_API_KEY
   ```

3. Install dependencies and start:

   ```bash
   chmod +x start.sh
   ./start.sh
   ```

The backend will be available at http://localhost:5001.
1. Navigate to the frontend directory:

   ```bash
   cd frontend
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Start the development server:

   ```bash
   npm run dev
   ```

The frontend will be available at http://localhost:3000.
The assistant implements the following specialized tools:
| Tool | Purpose | Implementation |
|---|---|---|
| WeatherTool | Get current weather for a city | Mock data (production: OpenWeatherMap API) |
| TimeTool | Get current time in a city | Timezone calculations with UTC offsets |
| CityFactsTool | Get basic facts about a city | Mock data (production: GeoDB Cities API) |
| PlanMyCityVisitTool | Composite tool for visit planning | Orchestrates multiple tools with reasoning |
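The table's tools can be sketched as plain functions. Everything below is illustrative: the function bodies and sample data are invented for the sketch, and only the tool names and the mock-data approach come from the README.

```python
# Hedged sketch of the mock tools in the table above. Function bodies and
# sample data are invented; only the tool names come from the README.

MOCK_WEATHER = {"Paris": "23°C, clear skies, humidity 65%"}
MOCK_FACTS = {"Paris": "Capital of France, population ~2.1 million."}
MOCK_UTC_OFFSET = {"Paris": 1}  # hours from UTC (ignoring DST)

def weather_tool(city: str) -> str:
    return f"Current weather in {city}: {MOCK_WEATHER.get(city, 'no data')}"

def city_facts_tool(city: str) -> str:
    return MOCK_FACTS.get(city, "No facts available.")

def time_tool(city: str) -> str:
    offset = MOCK_UTC_OFFSET.get(city, "unknown")
    return f"UTC offset for {city}: {offset}h"

def plan_my_city_visit_tool(city: str) -> str:
    """Composite tool: orchestrates the other tools into one summary."""
    return "\n".join([
        f"Visit plan for {city}:",
        city_facts_tool(city),
        weather_tool(city),
        time_tool(city),
    ])

print(plan_my_city_visit_tool("Paris"))
```

The composite tool illustrates the orchestration pattern: `PlanMyCityVisitTool` does no data lookup of its own, it just sequences the single-purpose tools and merges their answers.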
User: "What's the weather like in Paris?"
Assistant: Uses WeatherTool → "Current weather in Paris: 23°C, clear skies, humidity 65%"
User: "Plan my visit to Tokyo"
Assistant: Uses PlanMyCityVisitTool → Orchestrates multiple tools:
1. Gets city facts about Tokyo
2. Fetches current weather
3. Checks local time
4. Provides comprehensive visit summary
User: "What about the weather there?"
Assistant: Uses conversation context → Provides weather for the previously mentioned city
```
memory_agent/
├── app.py               # Main Flask application
├── config.py            # Configuration management
├── requirements.txt     # Python dependencies
├── start.sh             # Startup script
├── .env.example         # Environment template
├── frontend/            # React/Next.js frontend
│   ├── src/app/
│   │   ├── components/  # React components
│   │   ├── page.tsx     # Main page
│   │   └── layout.tsx   # App layout
│   └── package.json     # Node dependencies
└── README.md            # This file
```
Key configuration options in `config.py`:
- `GOOGLE_MODEL`: AI model (default: gemini-1.5-flash)
- `AGENT_TEMPERATURE`: Response creativity (0.0-1.0)
- `MEMORY_MAX_MESSAGES`: Conversation memory limit
- `STREAMING_ENABLED`: Real-time response streaming
- `FLASK_PORT`: Backend server port
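A `config.py` along these lines would read those options from environment variables. This is a sketch of one plausible shape; the default values below mirror the README where it states them (model, port) and are otherwise assumptions.

```python
# Sketch of how config.py might expose the options above via environment
# variables. Defaults mirror the README where stated (model, port);
# the other defaults are assumptions.
import os

GOOGLE_MODEL = os.getenv("GOOGLE_MODEL", "gemini-1.5-flash")
AGENT_TEMPERATURE = float(os.getenv("AGENT_TEMPERATURE", "0.7"))
MEMORY_MAX_MESSAGES = int(os.getenv("MEMORY_MAX_MESSAGES", "20"))
STREAMING_ENABLED = os.getenv("STREAMING_ENABLED", "true").lower() == "true"
FLASK_PORT = int(os.getenv("FLASK_PORT", "5001"))
```

Reading everything through `os.getenv` with defaults keeps the app runnable out of the box while letting deployments override any option without code changes.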
- `GET /health` - Health check
- `POST /chat` - Main chat endpoint (streaming)
- `GET /memory/status/<session_id>` - Memory status
- `DELETE /memory/clear/<session_id>` - Clear memory
Test the assistant with various queries:
- Weather queries: "What's the weather in London?"
- Time queries: "What time is it in Sydney?"
- City facts: "Tell me about New York"
- Planning: "Plan my visit to Berlin"
- Follow-ups: "What about the weather there?"
For production use:

1. Replace mock data with real APIs:
   - OpenWeatherMap for weather
   - World Time API for time zones
   - GeoDB Cities API for city facts

2. Environment setup:
   - Use a production WSGI server (gunicorn)
   - Set up proper environment variables
   - Configure CORS for your domain

3. Frontend deployment:
   - Build an optimized bundle: `npm run build`
   - Deploy to Vercel, Netlify, or similar
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source and available under the MIT License.