🎤 A one-PC Python agent that helps you handle AI-based interviews.
It listens to the interviewer, transcribes with Whisper, generates concise English answers with GPT, and shows them in a floating always-on-top window.
✨ Features:
- Whisper STT + GPT answers in real time
- Always-on-top floating GUI
- VU meter (text + graphical) to confirm audio input
- Translation (transcript / answer / both) with hotkeys
- Chat timeline pane with save/clear options
- Optional TTS to hear answers in your ear
- Pro hotkeys: STAR mode, Concise vs Elaborate, copy, transparency, etc.

Requirements:
- Python 3.10+
- An OpenAI API key set as an environment variable:
  - Windows (PowerShell), then restart the terminal:
    setx OPENAI_API_KEY "sk-..."
  - macOS/Linux (bash):
    export OPENAI_API_KEY="sk-..."

Install the dependencies:

pip install -r requirements.txt

Set up audio routing:
- Windows: Install VB-CABLE or use Voicemeeter Banana. Send your browser/app output (the AI’s voice) to CABLE Input, and select CABLE Output as the agent input.
- macOS: Install BlackHole (2ch). Route browser output into BlackHole and set the agent to record from it.
💡 Tip: Use headphones to avoid feedback.

Run the agent:

python live_interview_agent_v5.py --device-index <index_of_CABLE_Output>

- Keep the floating window visible.
- Read/paraphrase the on-screen answer naturally.
- The interviewer hears only your mic; the agent listens through the virtual cable.
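
Finding the value for --device-index can be fiddly. The sketch below is one way to look it up. It assumes the `sounddevice` package, which this README does not confirm the agent uses, so the indices it prints may not match the agent's own numbering. The helper name `find_input_device` is hypothetical.

```python
from typing import Iterable, Mapping, Optional


def find_input_device(devices: Iterable[Mapping], name_fragment: str) -> Optional[int]:
    """Return the index of the first input-capable device whose name contains name_fragment."""
    needle = name_fragment.lower()
    for index, device in enumerate(devices):
        # Skip playback-only devices: they cannot be recorded from.
        has_input = device.get("max_input_channels", 0) > 0
        if has_input and needle in str(device.get("name", "")).lower():
            return index
    return None


if __name__ == "__main__":
    try:
        # Assumption: sounddevice is installed; the agent's audio backend may differ.
        import sounddevice as sd
    except (ImportError, OSError):
        print("sounddevice is unavailable; try: pip install sounddevice")
    else:
        devices = list(sd.query_devices())
        for index, device in enumerate(devices):
            print(index, device["name"], "inputs:", device["max_input_channels"])
        print("Suggested --device-index:", find_input_device(devices, "CABLE Output"))
```

On macOS, pass "BlackHole" instead of "CABLE Output". If the function returns None, the virtual cable is probably not installed or exposes no input channels.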

Hotkeys:
- F1: Toggle TTS (on/off)
- F2: Clear live fields
- F3: Pause/resume listening
- F4: Copy last answer to clipboard
- F5: Toggle STAR mode (Situation → Task → Action → Result)
- F6: Toggle Concise vs Elaborate answers
- F7/F8: Adjust window transparency
- F9: Cycle translation mode (off / transcript / answer / both)
- F10: Change target language code
- F11: Save timeline to file (txt)
- F12: Clear timeline pane
- ESC: Quit
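
To illustrate how toggles like F5, F6, and F9 could fit together, here is a minimal sketch of the state they might change. It is not taken from the agent's source: `AgentState`, `cycle_translate`, `system_prompt`, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass

# Order matches the F9 description: off / transcript / answer / both.
TRANSLATE_MODES = ("off", "transcript", "answer", "both")


@dataclass
class AgentState:
    """Hypothetical container for the hotkey toggles."""
    star_mode: bool = False      # F5
    concise: bool = True         # F6
    translate: str = "off"       # F9

    def cycle_translate(self) -> str:
        """Advance to the next translation mode, wrapping back to 'off'."""
        position = TRANSLATE_MODES.index(self.translate)
        self.translate = TRANSLATE_MODES[(position + 1) % len(TRANSLATE_MODES)]
        return self.translate

    def system_prompt(self) -> str:
        """Build illustrative instructions for the answer model from the toggles."""
        parts = ["You help a candidate answer interview questions in English."]
        if self.concise:
            parts.append("Keep the answer to two or three sentences.")
        else:
            parts.append("Give a fuller answer with one concrete example.")
        if self.star_mode:
            parts.append("Structure the answer using the STAR format.")
        return " ".join(parts)
```

Keeping the toggles in one object means each hotkey handler only flips a field, and the next answer request picks up the change.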

Environment variables:
- AGENT_CHUNK_SECONDS (default 2.5): audio slice length
- AGENT_SAMPLE_RATE (default 16000): capture rate
- AGENT_SILENCE_THRESHOLD (default 0.003): RMS gate for silence
- AGENT_STT_MODEL (default whisper-1)
- AGENT_GPT_MODEL (default gpt-4o-mini)
- AGENT_SHOW_TRANSCRIPT (1 to display the recognized question)
- AGENT_TTS_DEFAULT (1 to start with TTS enabled)
- AGENT_TRANSLATE (off | transcript | answer | both)
- AGENT_TARGET_LANG (default es)

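
For readers who want to see how such settings are typically read, the following is a rough sketch of loading them in Python. The variable names and the stated defaults come from the list above. The class and function names are hypothetical, and the fallbacks for AGENT_SHOW_TRANSCRIPT, AGENT_TTS_DEFAULT, and AGENT_TRANSLATE are assumptions, because this README does not state them.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_TRANSLATE = {"off", "transcript", "answer", "both"}


@dataclass(frozen=True)
class AgentConfig:
    chunk_seconds: float
    sample_rate: int
    silence_threshold: float
    stt_model: str
    gpt_model: str
    show_transcript: bool
    tts_default: bool
    translate: str
    target_lang: str


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Read AGENT_* settings from env, falling back to the documented defaults."""
    translate = env.get("AGENT_TRANSLATE", "off").lower()  # assumed default: off
    if translate not in VALID_TRANSLATE:
        translate = "off"  # ignore unknown values rather than crash
    return AgentConfig(
        chunk_seconds=float(env.get("AGENT_CHUNK_SECONDS", "2.5")),
        sample_rate=int(env.get("AGENT_SAMPLE_RATE", "16000")),
        silence_threshold=float(env.get("AGENT_SILENCE_THRESHOLD", "0.003")),
        stt_model=env.get("AGENT_STT_MODEL", "whisper-1"),
        gpt_model=env.get("AGENT_GPT_MODEL", "gpt-4o-mini"),
        show_transcript=env.get("AGENT_SHOW_TRANSCRIPT", "0") == "1",  # assumed default: 0
        tts_default=env.get("AGENT_TTS_DEFAULT", "0") == "1",          # assumed default: 0
        translate=translate,
        target_lang=env.get("AGENT_TARGET_LANG", "es"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try different settings without touching the real environment.
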
Example (Windows PowerShell):
setx AGENT_SHOW_TRANSCRIPT "1"
setx AGENT_TRANSLATE "both"
setx AGENT_TARGET_LANG "fr"

Troubleshooting:
- No VU activity → check that browser/app output is routed to CABLE Input.
- VU moves but no transcript → raise volume or lower AGENT_SILENCE_THRESHOLD.
- ImportError → run pip install -r requirements.txt
- TTS issues → disable with F1 or set AGENT_TTS_DEFAULT=0.
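
The "VU moves but no transcript" case is easier to reason about with the silence gate written out. The sketch below shows a generic RMS gate and a text VU bar. It assumes audio samples are floats normalized to the range -1.0 to 1.0. The function names and the `full_scale` value are hypothetical, and the agent's actual gating logic may differ.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio chunk (0.0 for an empty chunk)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples: Sequence[float], threshold: float = 0.003) -> bool:
    """Treat a chunk as silence when its RMS falls below the threshold."""
    return rms(samples) < threshold


def vu_bar(level: float, width: int = 20, full_scale: float = 0.1) -> str:
    """Render a level as a fixed-width text meter; full_scale is an assumed ceiling."""
    filled = min(width, int(round(width * level / full_scale)))
    return "#" * filled + "-" * (width - filled)
```

A chunk with an RMS of 0.002 is dropped at the default threshold of 0.003 but passes at 0.001, which is why lowering AGENT_SILENCE_THRESHOLD can help with quiet input.
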
This tool is meant to support comprehension and confidence. Use responsibly and check the rules of your interview platform.