feat: adding ten-vad wasm to project, configuration options on audio #3822
Open
emmkay440 wants to merge 1 commit into
Adding TEN-VAD voice activity detection gate for local microphone
What and why
This PR adds an optional voice activity detection (VAD) gate applied to the local microphone track before it is published to the SFU. When enabled, the gate silences audio during non-speech segments, so less unwanted noise is transmitted.
The gate is implemented using TEN-VAD and runs entirely inside an AudioWorklet on the audio thread. There is no IPC round-trip to the main thread, which gives approximately 16 ms detection latency (customizable).
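To illustrate what such a gate does per audio block, here is a minimal sketch of an asymmetric gain ramp (fast open, slow close), as mentioned under Architecture. It assumes the detector yields a per-hop speech probability; the names `GateParams`, `VadGate`, `threshold`, `attackPerSample` and `releasePerSample` are hypothetical and not taken from the PR.

```typescript
// Hypothetical sketch of a VAD-driven gate; not the PR's actual worklet code.
interface GateParams {
  threshold: number;        // speech probability at or above which the gate opens
  attackPerSample: number;  // gain added per sample while opening (fast)
  releasePerSample: number; // gain removed per sample while closing (slow)
}

class VadGate {
  private params: GateParams;
  private gain = 0;   // current gain, 0 = silent, 1 = pass-through
  private target = 0; // gain the ramp is moving towards

  constructor(params: GateParams) {
    this.params = params;
  }

  // Called once per VAD hop with the detector's speech probability.
  update(speechProbability: number): void {
    this.target = speechProbability >= this.params.threshold ? 1 : 0;
  }

  // Scales the block in place, ramping the gain sample by sample
  // so that opening and closing do not produce clicks.
  apply(block: Float32Array): void {
    for (let i = 0; i < block.length; i++) {
      if (this.gain < this.target) {
        this.gain = Math.min(this.target, this.gain + this.params.attackPerSample);
      } else if (this.gain > this.target) {
        this.gain = Math.max(this.target, this.gain - this.params.releasePerSample);
      }
      block[i] *= this.gain;
    }
  }
}
```

Ramping up faster than ramping down is a common choice for gates: a fast attack avoids clipping the start of a word, while a slower release avoids chopping off trailing syllables.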
Before this change there was no client-side VAD gate.
After this change users can opt into a VAD gate from Audio Settings.
The setting is off by default and persisted in localStorage.
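As a rough illustration of an opt-in flag that defaults to off and is persisted, the sketch below uses a small storage interface so it also runs outside a browser. `KeyValueStore`, `MemoryStore`, `PersistedFlag` and the key `"vad-enabled"` are hypothetical; the PR itself uses the project's own `Setting` entries, whose API is not shown here.

```typescript
// Hypothetical sketch; the project's real Setting class may work differently.
// In a browser, window.localStorage satisfies this interface.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage, used for illustration and testing.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

class PersistedFlag {
  private store: KeyValueStore;
  private key: string;
  private defaultValue: boolean;

  constructor(store: KeyValueStore, key: string, defaultValue: boolean) {
    this.store = store;
    this.key = key;
    this.defaultValue = defaultValue;
  }

  // Returns the stored value, or the default if nothing valid is stored.
  get(): boolean {
    const raw = this.store.getItem(this.key);
    if (raw === null) return this.defaultValue;
    try {
      return JSON.parse(raw) === true;
    } catch {
      return this.defaultValue;
    }
  }

  set(value: boolean): void {
    this.store.setItem(this.key, JSON.stringify(value));
  }
}
```

A caller would construct `new PersistedFlag(store, "vad-enabled", false)`, so a user who never touches the setting keeps the gate disabled.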
Architecture
public/vad/ten_vad.wasm + ten_vad.js: TEN-VAD Emscripten build.
src/livekit/TenVadProcessor.worklet.ts: AudioWorklet; wraps the WASM synchronously, decimates 48 kHz -> 16 kHz at 3:1, runs VAD every hop (10-16 ms), applies asymmetric gain ramp.
src/livekit/TenVadTransformer.ts: LiveKit TrackProcessor adapter; fetches and compiles the WASM once (module is cached and reused across restarts), wires the Web Audio graph.
src/settings/settings.ts: seven new Setting entries for VAD state and parameters.
src/settings/SettingsModal.tsx: VAD section added to the Audio tab with Disabled / Simple / Advanced radio buttons and the corresponding controls.
src/state/CallViewModel/localMember/Publisher.ts: applyTenVad() subscribes to the local mic track and the vadEnabled setting, attaches or detaches TenVadTransformer reactively, and pushes parameter changes to the live worklet.
Testing
node /tmp/yarn4.js backend
Notes
Claude helped me with the implementation, but going forward I can make any requested changes to this PR myself.
An enhancement to this PR would be adding some sort of voice feedback loop so users can test for themselves.
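For reviewers unfamiliar with the 48 kHz -> 16 kHz step listed under Architecture, here is a minimal sketch of 3:1 decimation followed by hop buffering. It assumes a simple three-sample average as the low-pass step; the PR's worklet may use a different filter. `Decimator3` and `HopBuffer` are hypothetical names. Since an AudioWorklet delivers 128-frame blocks and 128 is not divisible by 3, the decimator carries its partial group across calls. At 16 kHz, a 16 ms hop is 256 samples and a 10 ms hop is 160 samples.

```typescript
// Hypothetical sketch; not the PR's actual worklet code.
const DECIMATION = 3; // 48 kHz / 16 kHz

class Decimator3 {
  private acc = 0;   // running sum of the current group of input samples
  private count = 0; // number of samples accumulated in the current group

  // Consumes one 48 kHz block and returns the 16 kHz samples it completes.
  process(input: Float32Array): Float32Array {
    const out = new Float32Array(Math.floor((this.count + input.length) / DECIMATION));
    let o = 0;
    for (let i = 0; i < input.length; i++) {
      this.acc += input[i];
      this.count++;
      if (this.count === DECIMATION) {
        out[o++] = this.acc / DECIMATION; // average acts as a crude low-pass
        this.acc = 0;
        this.count = 0;
      }
    }
    return out;
  }
}

class HopBuffer {
  private buf: Float32Array;
  private filled = 0;

  constructor(hopSize: number) {
    this.buf = new Float32Array(hopSize);
  }

  // Appends samples and calls onHop each time a full hop is available.
  // The hop array is reused, so onHop must not keep a reference to it.
  push(samples: Float32Array, onHop: (hop: Float32Array) => void): void {
    for (let i = 0; i < samples.length; i++) {
      this.buf[this.filled++] = samples[i];
      if (this.filled === this.buf.length) {
        onHop(this.buf);
        this.filled = 0;
      }
    }
  }
}
```

In a worklet, each 128-frame input block would pass through `process`, and the result would be pushed into a `HopBuffer` whose callback runs the VAD on the completed hop.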