A blazing-fast, context-aware chatbot built with Python, Streamlit, and Groq-hosted LLMs. Designed for rapid, multi-model experimentation and professional chat UX.
- Multi-model support:
- meta-llama/llama-4-maverick-17b-128e-instruct
- meta-llama/llama-4-scout-17b-16e-instruct
- gemma2-9b-it
- llama-3.1-8b-instant
- Streaming responses for real-time, token-by-token output
- Per-model chat history (switch models, keep your context)
- Temperature adjuster (slider, 0.0–1.5) for creative vs. focused responses
- Download chat as JSON (user + assistant turns)
- Average response time tracker
- Modern, dark-themed UI with avatars, chat bubbles, and sidebar controls
- Clear all chat histories button
- Persistent context (history is saved and restored per model)
- Error handling for API, rate limits, and timeouts
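The per-model history feature above (save, restore, and switch without losing context) can be sketched with the standard library. The helper names here are hypothetical, not the app's actual API; only the `chat_history_<model>.json` naming comes from this README.

```python
import json
from pathlib import Path

# Hypothetical sketch of per-model history persistence. Each model's
# turns live in their own chat_history_<model>.json file, so switching
# models restores that model's previous conversation.

def history_path(model: str, root: Path = Path(".")) -> Path:
    # Slashes in model IDs (e.g. "meta-llama/...") are invalid in
    # filenames, so replace them before building the path.
    safe = model.replace("/", "_")
    return root / f"chat_history_{safe}.json"

def save_history(model: str, messages: list, root: Path = Path(".")) -> None:
    # messages is a list of {"role": ..., "content": ...} dicts
    history_path(model, root).write_text(json.dumps(messages, indent=2))

def load_history(model: str, root: Path = Path(".")) -> list:
    path = history_path(model, root)
    if path.exists():
        return json.loads(path.read_text())
    return []  # no saved file yet: start a fresh conversation
```

Clearing all chat histories then amounts to deleting every `chat_history_*.json` file and resetting the in-memory state.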
- Clone the repo:
git clone https://github.com/0xnomy/groq_chatbot
cd groq_chatbot
- Install dependencies:
pip install -r requirements.txt
- Add your Groq API key:
- Create a .env file in the project root:
GROQ_API_KEY=your-groq-api-key-here
- Run the app:
streamlit run app.py
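For reference, the `.env` format above is just `KEY=value` lines. A minimal stdlib parser is sketched below; the app itself may use a library such as python-dotenv instead, so treat this as illustrative only.

```python
from pathlib import Path

# Minimal sketch of reading GROQ_API_KEY from a .env file using only
# the standard library. Assumes simple KEY=value lines; real dotenv
# parsers handle quoting and export prefixes as well.

def load_env(path: str = ".env") -> dict:
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# api_key = load_env().get("GROQ_API_KEY")
```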
- Select a model from the sidebar (each model has its own chat history)
- Adjust temperature (creativity) with the slider
- Type your message and press Enter
- See responses stream in real time
- Download your conversation as JSON with the download button
- Clear all chat histories with the sidebar button
- Switch models to compare answers or continue previous chats
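The average response time shown in the UI can be tracked with a small accumulator like the sketch below. The class name and method names are hypothetical; the README only states that an average is displayed.

```python
import time

class ResponseTimer:
    """Hypothetical sketch of the average-response-time tracker."""

    def __init__(self):
        self.durations = []  # seconds per completed response

    def record(self, start: float, end: float) -> None:
        self.durations.append(end - start)

    def average(self) -> float:
        if not self.durations:
            return 0.0  # nothing recorded yet
        return sum(self.durations) / len(self.durations)

# Usage: start = time.perf_counter() before the request,
# timer.record(start, time.perf_counter()) once streaming finishes.
```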
- meta-llama/llama-4-maverick-17b-128e-instruct – Balanced, strong generalization (Meta)
- meta-llama/llama-4-scout-17b-16e-instruct – Fast, low-latency (Meta)
- gemma2-9b-it – Safe, helpful, instruction-tuned (Google DeepMind)
- llama-3.1-8b-instant – Compact, high-speed (Meta)
- All chat context is managed by the app (not the model)
- Each model has its own chat history (saved as chat_history_<model>.json)
- When you send a message, the full history is sent to the model for context-aware replies
- Switching models loads the last conversation for that model
- Temperature:
- Lower = more focused, deterministic
- Higher = more creative, varied
- Export:
- Download your current conversation as a JSON file (groqchatv1_history.json)
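Putting the pieces above together: the app sends the full saved history plus the new user turn on every request. The payload construction can be sketched as below; the commented Groq SDK call is an assumption based on its OpenAI-style interface, not verified against this project's code.

```python
# Sketch of a context-aware request: the whole conversation so far is
# prepended to the new user message, so the model sees full context.

def build_messages(history: list, user_input: str) -> list:
    return history + [{"role": "user", "content": user_input}]

# Assumed Groq SDK usage (OpenAI-compatible chat completions):
# from groq import Groq
# client = Groq(api_key=api_key)
# stream = client.chat.completions.create(
#     model="llama-3.1-8b-instant",
#     messages=build_messages(history, prompt),
#     temperature=temperature,  # 0.0 = focused ... 1.5 = creative
#     stream=True,              # yields tokens as they arrive
# )
# for chunk in stream:
#     token = chunk.choices[0].delta.content or ""
```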
- [Groq API](https://console.groq.com/docs/)