Incoming Call
↓
CallScreeningService (Android)
↓ (15 sec timer)
If user doesn't answer
↓
Auto-answer call
↓
Audio Stream (Mic + Speaker)
↓
Speech-to-Text (Local)
↓
Local LLM (Response generation)
↓
Text-to-Speech (Local)
↓
Caller hears AI voice
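The flow above can be sketched as one Python function per conversational turn. The `stt`, `llm`, and `tts` arguments are stand-ins for the local speech-to-text, language-model, and text-to-speech components; the function and parameter names are illustrative, not the project's actual API.

```python
from typing import Callable

def handle_turn(
    audio_chunk: bytes,
    stt: Callable[[bytes], str],   # Speech-to-Text (local)
    llm: Callable[[str], str],     # Local LLM response generation
    tts: Callable[[str], bytes],   # Text-to-Speech (local)
) -> bytes:
    """One conversational turn: caller audio in, AI voice audio out."""
    transcript = stt(audio_chunk)  # caller's words as text
    reply_text = llm(transcript)   # generated response
    return tts(reply_text)         # synthesized audio the caller hears
```

In a test, the three components can be replaced with simple lambdas, which keeps the pipeline logic verifiable without any models on disk.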
call-assistant/
│
├── assistant_core.py # Entry point (Android will call this)
├── conversation.py # Call flow + state machine
├── llm.py # Local LLM (TinyLlama / Phi)
├── stt_whisper_stream.py # Streaming Whisper STT (optimized)
├── memory.py # Call transcript storage
│
├── models/
│ ├── tinyllama.gguf # LLM model — not checked in; download it locally (the file is too large for the repo)
│ └── whisper/ # Whisper models — download with `git clone https://huggingface.co/Systran/faster-whisper-base`
│
└── requirements.txt
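A plausible `requirements.txt` for the modules above might look like the following. The exact package set is an assumption inferred from the model formats mentioned (faster-whisper for the Whisper models, llama-cpp-python for the `.gguf` LLM), not the project's pinned dependency list.

```text
faster-whisper    # streaming Whisper STT (stt_whisper_stream.py)
llama-cpp-python  # loads tinyllama.gguf (llm.py)
```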
When an incoming call arrives, it is first intercepted by the CallScreeningService on Android. A 15‑second timer starts, giving the user a chance to answer normally.
If the user does not pick up within that window, the system auto‑answers the call. At that point, the phone begins streaming audio from both the microphone and speaker.
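The screening decision itself is simple enough to express as a pure function. On Android this logic would live in a `CallScreeningService` subclass in Kotlin or Java; the Python below is only an illustrative sketch of the rule, with assumed names.

```python
ANSWER_WINDOW_SECONDS = 15  # grace period from the flow above

def should_auto_answer(call_start: float, user_answered: bool, now: float) -> bool:
    """Auto-answer only if the user has not picked up within the window."""
    if user_answered:
        return False  # user took the call normally
    return now - call_start >= ANSWER_WINDOW_SECONDS
```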
The audio is processed through a local speech‑to‑text engine, which converts the caller’s spoken words into text. That text is then passed to a local language model (LLM), which generates an appropriate response.
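To give the LLM context beyond the latest utterance, the running transcript has to be threaded into its prompt. A minimal sketch of that piece (roughly the role of `memory.py`) is below; the class and method names are assumptions, not the project's actual API.

```python
class CallMemory:
    """Stores the running transcript of one call and renders it as an LLM prompt."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []  # (speaker, text) pairs

    def add(self, speaker: str, text: str) -> None:
        """Append one utterance, e.g. from the STT output."""
        self.turns.append((speaker, text))

    def as_prompt(self) -> str:
        """Flatten the transcript, ending with a cue for the model to reply."""
        lines = [f"{speaker}: {text}" for speaker, text in self.turns]
        lines.append("assistant:")
        return "\n".join(lines)
```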
The response is converted back into audio using local text‑to‑speech, and the caller hears the AI's synthesized voice speaking on behalf of the user.

Note: the architecture is changing. The application will be packaged as an Android app using Kivy, the LLM will be swapped for a Qwen 0.5‑billion‑parameter model, and the overall architecture will be updated accordingly.