[Feature]: optional TTS replies for voice/audio prompts #63

### Problem

The bot already supports voice/audio input via STT, which is great for mobile use.
A natural follow-up for that workflow is optional TTS output: when a user ~~sends a Telegram voice/audio message~~ toggles TTS on with `/tts`, the bot could return the normal text reply plus an audio rendering of that same final assistant response.
This would improve hands-free/mobile usability without changing the normal text-first workflow. (Idea from OpenClaw.)

### Proposal

- Keep normal text prompts exactly as they are today: text reply only
- For Telegram `voice` / `audio` input:
  - transcribe with the existing STT flow
  - send the normal final text response
  - optionally send a TTS audio file of that exact final assistant text
- ~~Make it opt-in via env config~~ make a `/tts` toggle, disabled by default
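
To make the toggle behavior concrete, here is a minimal sketch of a per-chat `/tts` switch. It is not the code from the fork: `TtsToggle`, `should_speak`, and the `origin` strings are hypothetical names, and it assumes the toggle is stored per chat and that only `voice` / `audio` prompts ever qualify.

```python
from dataclasses import dataclass, field


@dataclass
class TtsToggle:
    """Hypothetical per-chat /tts switch; every chat starts disabled."""

    enabled_chats: set[int] = field(default_factory=set)

    def toggle(self, chat_id: int) -> bool:
        """Flip the setting for one chat and return the new state."""
        if chat_id in self.enabled_chats:
            self.enabled_chats.discard(chat_id)
            return False
        self.enabled_chats.add(chat_id)
        return True

    def should_speak(self, chat_id: int, origin: str) -> bool:
        """Allow TTS only for voice/audio-origin prompts in opted-in chats."""
        return origin in {"voice", "audio"} and chat_id in self.enabled_chats
```

With this shape, a text prompt never produces audio even when the chat has `/tts` on, which matches the first bullet above.
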
## Proposed config
Something like:
- ~~`TTS_ENABLED=false`~~ use the `/tts` toggle instead
- `TTS_API_URL=` (fallback to `STT_API_URL` if unset)
- `TTS_API_KEY=` (fallback to `STT_API_KEY` if unset)
- `TTS_MODEL=gpt-4o-mini-tts`
- `TTS_VOICE=alloy`
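
The fallback rules could be resolved roughly like this. This is a sketch under assumptions: `TtsConfig` and `load_tts_config` are hypothetical names, and an empty value (as in `TTS_API_URL=`) is treated the same as an unset one.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TtsConfig:
    api_url: Optional[str]
    api_key: Optional[str]
    model: str
    voice: str


def load_tts_config(env: Mapping[str, str] = os.environ) -> TtsConfig:
    """Read TTS settings, falling back to the STT values when unset or empty."""

    def first_set(*names: str) -> Optional[str]:
        # Return the first non-empty value among the given variable names.
        for name in names:
            value = env.get(name, "").strip()
            if value:
                return value
        return None

    return TtsConfig(
        api_url=first_set("TTS_API_URL", "STT_API_URL"),
        api_key=first_set("TTS_API_KEY", "STT_API_KEY"),
        model=first_set("TTS_MODEL") or "gpt-4o-mini-tts",
        voice=first_set("TTS_VOICE") or "alloy",
    )
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.
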
## Scope / guardrails
To keep this small and low-risk:
- no change for text-origin prompts
- no streaming spoken output
- just one final audio file after the normal text reply
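
The ordering guardrail (text first, then at most one audio file) can be sketched as a small delivery function. All names here are hypothetical, and the send/synthesize steps are passed in as callables so the sketch makes no claim about the bot's actual Telegram or TTS code.

```python
from typing import Callable


def deliver_final_reply(
    text: str,
    speak: bool,
    send_text: Callable[[str], None],
    send_audio: Callable[[bytes], None],
    synthesize: Callable[[str], bytes],
) -> None:
    """Send the text reply first, then at most one audio file of the same text."""
    send_text(text)
    if not speak:
        # Text-origin prompts, or chats with TTS off, stop here.
        return
    # The audio is rendered from exactly the text that was just sent.
    send_audio(synthesize(text))
```

Because the text reply goes out before synthesis starts, the audio step cannot delay or replace the normal response.
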
## Why this seems aligned
This stays within the current single-chat / predictable interaction model in `CONCEPT.md`:
- it does not add parallelism or group-specific behavior
- it only extends the existing voice-input path
- it remains optional and disabled by default
## Implementation notes
I already prototyped this locally in my fork, and it was pretty contained:
- small TTS client modeled after the existing STT client
- lightweight tracking so only audio-origin prompts trigger TTS
- hook into the final assistant completion path after the normal text reply
- docs + tests included
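
For illustration only, a small TTS client could start from a request builder like the one below. It is not the prototype from the fork. It assumes `TTS_API_URL` is a base URL for an OpenAI-compatible API (so the speech route is `/audio/speech` with `model`, `voice`, and `input` fields); `build_speech_request` is a hypothetical name, and the function only builds the request without sending it.

```python
import json
import urllib.request


def build_speech_request(
    api_url: str, api_key: str, model: str, voice: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a POST for an OpenAI-style speech endpoint."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        # Assumption: api_url is a base URL such as ".../v1".
        url=api_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be a matter of `urllib.request.urlopen(request).read()` to get the audio bytes, ideally with a timeout and error handling so a TTS failure never affects the text reply.
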

### Done criteria (optional)

- When a voice memo/file is sent ~~and .env has it enabled~~ and TTS is enabled via `/tts`, the bot responds with the text output and then an audio file of that text
- ~~When sending a text input, bot always replies as before with text regardless of configuration.~~
