server : add Speech Recognition & Synthesis to UI#8679

Merged
ngxson merged 2 commits into ggml-org:master from ElYaiko:master
Jul 25, 2024
Contributor

@ElYaiko ElYaiko commented Jul 25, 2024

This PR adds speech recognition and speech synthesis to the UI (a simple voice mode).

Screenshot 2024-07-24 at 21-23-02 llama cpp - chat

Features added:
Talk button: Initiates speech-to-text.
Send after talk option: Sends the message after STT.
Voice option: Text-to-speech voice used for the bot.
Play/pause message: Play/pause message with selected TTS voice.
Play message after completion
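
The play/pause behaviour listed above can be sketched as a small toggle around the browser's Web Speech API. This is a minimal sketch under stated assumptions, not the code merged in this PR: `togglePlayback`, `SynthLike`, and `PlaybackAction` are hypothetical names, and the synthesizer is passed in as a parameter so the logic can be exercised outside a browser.

```typescript
// Hypothetical structural type: only the parts of the browser's
// window.speechSynthesis object that this sketch touches.
interface SynthLike<U> {
  readonly speaking: boolean;
  readonly paused: boolean;
  speak(utterance: U): void;
  pause(): void;
  resume(): void;
}

type PlaybackAction = "started" | "paused" | "resumed";

// Toggle playback of one message.
// `paused` is checked first because a paused synthesizer still
// reports `speaking === true` in browsers.
function togglePlayback<U>(
  synth: SynthLike<U>,
  makeUtterance: (text: string) => U,
  text: string
): PlaybackAction {
  if (synth.paused) {
    synth.resume();
    return "resumed";
  }
  if (synth.speaking) {
    synth.pause();
    return "paused";
  }
  synth.speak(makeUtterance(text));
  return "started";
}
```

In a page, the first argument would presumably be `window.speechSynthesis` and the factory something like `(t) => new SpeechSynthesisUtterance(t)`. The selected voice could be applied there by assigning one of the entries returned by `speechSynthesis.getVoices()` to the utterance's `voice` property.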

Tested browsers:

  • Chrome
  • Firefox
  • Safari

Tested OS:

  • Windows
  • macOS
  • Linux (Requires additional packages for TTS: Guide)
  • Android
  • iOS

Member

@ggerganov ggerganov left a comment

Nice! Just need to fix the trailing whitespaces (see the CI)

Contributor

@ngxson ngxson left a comment

LGTM. Small detail: I'd prefer to add a short message saying "TTS and speech recognition are not provided by llama.cpp", to make it clear to users that the quality depends on their browser, not on llama.cpp or the model itself.
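
Because the speech features come from the browser, a UI can check whether they exist before showing the voice controls. The following is a minimal sketch of such a check, assuming the standard Web Speech API globals. `detectSpeechSupport` and `SpeechSupport` are hypothetical names, and the global scope is passed in as a plain object so the function can also run outside a browser.

```typescript
// Hypothetical result type for the capability check.
interface SpeechSupport {
  recognition: boolean; // speech-to-text constructor available
  synthesis: boolean;   // text-to-speech object available
}

// Inspect a global-like object (e.g. `window`) for the Web Speech API.
// Some browsers expose recognition only under the `webkit` prefix.
function detectSpeechSupport(scope: Record<string, unknown>): SpeechSupport {
  const recognition =
    typeof scope["SpeechRecognition"] === "function" ||
    typeof scope["webkitSpeechRecognition"] === "function";
  const synth = scope["speechSynthesis"];
  const synthesis = typeof synth === "object" && synth !== null;
  return { recognition, synthesis };
}
```

In a page this could be called as `detectSpeechSupport(window as unknown as Record<string, unknown>)`, with the result used to hide the Talk button or the voice selector when the corresponding capability is missing.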

Contributor Author

ElYaiko commented Jul 25, 2024

@ngxson What do you think?

2024-07-25-164752_1366x768_scrot

@ElYaiko ElYaiko requested a review from ngxson July 25, 2024 20:54
Contributor

ngxson commented Jul 25, 2024

Yes, LGTM. We can merge once the CI passes.

@ngxson ngxson merged commit 01aec4a into ggml-org:master Jul 25, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
* server : add Speech Recognition & Synthesis to UI

* server : add Speech Recognition & Synthesis to UI (fixes)
Contributor

jboero commented Aug 15, 2024

Wow, I just saw this update. Kudos on merging Whisper and TTS, this is brilliant. Well done.

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* server : add Speech Recognition & Synthesis to UI

* server : add Speech Recognition & Synthesis to UI (fixes)