Claude/slack update stt assembly to inworld jx5y b#56

Open
cherylafichter wants to merge 4 commits into main from claude/slack-update-stt-assembly-to-inworld-jx5yB

@cherylafichter
Contributor

No description provided.

claude added 4 commits March 11, 2026 21:58
- Create InworldSTTNode using Inworld's REST API (POST /stt/v1/transcribe)
  with energy-based VAD for end-of-turn detection
- Remove assembly-ai-stt-ws-node.ts and its WebSocket-based streaming logic
- Update ConversationGraphWrapper to hold inworldSTTNode reference
- Update ConversationGraphConfig to accept inworldApiKey (was assemblyAIApiKey)
- Replace ASSEMBLY_AI_API_KEY env var with INWORLD_API_KEY in graph-service,
  .env.example, and render.yaml (single key for all Inworld services)
- Replace AssemblyAI turn-detection presets in server config with equivalent
  Inworld STT VAD presets (silenceThresholdMs / minSpeechMs / energyThreshold)
- Rename ASSEMBLY_AI_EAGERNESS env var to INWORLD_STT_EAGERNESS
- Update comments in connection-manager, transcript-extractor-node, server, and
  audio-processor.js to reflect the new STT provider

https://claude.ai/code/session_01EDqcCeQHNj2f2TVeFb5Dxh
- Fix prettier line-length formatting for Buffer.from() call in inworld-stt-node.ts
- Remove unused samplesPerMs variable in inworld-stt-node.ts
- Fix prettier line-length formatting for description string in server.ts

https://claude.ai/code/session_01EDqcCeQHNj2f2TVeFb5Dxh
- Add 30s timeout protection to callInworldSTT to prevent network stalls
- Fix session lifecycle: remove shared STT node destruction from
  ConnectionManager.destroy() since the node is shared across sessions

https://claude.ai/code/session_01TJxsn7u4AVgj7UHWPffSha
@cherylafichter cherylafichter requested a review from a team as a code owner March 12, 2026 17:47
Copilot AI review requested due to automatic review settings March 12, 2026 17:47
CLAassistant commented Mar 12, 2026

CLA assistant check
All committers have signed the CLA.

Copilot AI left a comment

Pull request overview

This PR migrates the backend speech-to-text path from an AssemblyAI streaming WebSocket node to a new Inworld STT node that uses energy-based VAD and an Inworld STT REST call, and updates configuration/env wiring accordingly.

Changes:

  • Replaced AssemblyAISTTWebSocketNode with a new InworldSTTNode implementation and rewired the conversation graph to use it.
  • Removed AssemblyAI environment/config plumbing and updated logging/comments to reflect Inworld STT usage.
  • Updated deployment/env examples to drop ASSEMBLY_AI_API_KEY and introduce INWORLD_STT_EAGERNESS-based VAD presets.
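
For readers unfamiliar with energy-based VAD, the following is a minimal sketch of the general technique, not the code in inworld-stt-node.ts. The parameter names silenceThresholdMs, minSpeechMs, and energyThreshold come from the commit message; the `EnergyVad` class, the `rmsEnergy` helper, and the choice to discard speech shorter than minSpeechMs are assumptions made for illustration.

```typescript
// Hypothetical settings shape; field names follow the PR's commit message.
interface VadSettings {
  silenceThresholdMs: number; // trailing silence that ends a turn
  minSpeechMs: number; // minimum speech before a turn can end
  energyThreshold: number; // RMS level treated as speech
}

// Root-mean-square energy of one chunk of float samples in [-1, 1].
function rmsEnergy(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Illustrative end-of-turn detector (an assumption, not the PR's class).
class EnergyVad {
  private speechMs = 0;
  private silenceMs = 0;
  private readonly settings: VadSettings;

  constructor(settings: VadSettings) {
    this.settings = settings;
  }

  // Feed one chunk; returns true when an end of turn is detected.
  push(samples: Float32Array, sampleRate: number): boolean {
    const chunkMs = (samples.length * 1000) / sampleRate;

    if (rmsEnergy(samples) >= this.settings.energyThreshold) {
      // Speech: accumulate speech time and reset the silence counter.
      this.speechMs += chunkMs;
      this.silenceMs = 0;
      return false;
    }

    if (this.speechMs < this.settings.minSpeechMs) {
      // Too little speech so far: treat it as noise and start over.
      this.speechMs = 0;
      return false;
    }

    // Silence after enough speech: count toward the end-of-turn threshold.
    this.silenceMs += chunkMs;
    if (this.silenceMs >= this.settings.silenceThresholdMs) {
      this.speechMs = 0;
      this.silenceMs = 0;
      return true;
    }
    return false;
  }
}
```

In a design like this, a more "eager" preset would presumably use a shorter silenceThresholdMs so the turn ends sooner after the speaker pauses.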

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| render.yaml | Removes ASSEMBLY_AI_API_KEY from Render env configuration. |
| frontend/public/audio-processor.js | Updates comments to reflect Inworld STT + energy-based VAD assumptions (100ms/16kHz). |
| backend/src/services/graph-service.ts | Switches graph initialization to require INWORLD_API_KEY and pass it into graph config. |
| backend/src/server.ts | Updates documentation/log strings to reflect Inworld STT. |
| backend/src/helpers/connection-manager.ts | Removes AssemblyAI session-close behavior; updates comments for Inworld STT. |
| backend/src/graphs/nodes/transcript-extractor-node.ts | Updates documentation to reference InworldSTTNode output. |
| backend/src/graphs/nodes/inworld-stt-node.ts | Adds new STT node with energy-based VAD + REST transcription. |
| backend/src/graphs/nodes/assembly-ai-stt-ws-node.ts | Removes the AssemblyAI streaming WebSocket STT node. |
| backend/src/graphs/conversation-graph.ts | Rewires graph from AssemblyAI STT node to Inworld STT node and updates wrapper/config. |
| backend/src/config/server.ts | Replaces AssemblyAI turn-detection presets with Inworld STT VAD presets + env override. |
| backend/package-lock.json | Lockfile metadata changes (peer flags removed in places). |
| backend/.env.example | Removes ASSEMBLY_AI_API_KEY from example env file. |

Files not reviewed (1)
  • backend/package-lock.json: Language not supported

Comment on lines +79 to 83
inworldSTT: {
  /** VAD eagerness level */
  eagerness: (process.env.INWORLD_STT_EAGERNESS ||
    'high') as InworldSTTEagerness,
},

Copilot AI Mar 12, 2026

INWORLD_STT_EAGERNESS is cast to InworldSTTEagerness without validation, so an invalid env value will make getInworldSTTSettings() return undefined and cause downstream runtime errors. Consider validating against {'low','medium','high'} (fallback to 'high' and log a warning) before indexing into inworldSTTPresets.
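
One way the suggested validation could look is sketched below. `parseEagerness` and `VALID_EAGERNESS` are hypothetical names, the `InworldSTTEagerness` union is assumed to be `'low' | 'medium' | 'high'` as the comment implies, and trimming and lowercasing the input is an added convenience rather than something the PR does.

```typescript
// Assumed shape of the PR's type, based on the values named in the review.
type InworldSTTEagerness = 'low' | 'medium' | 'high';

// Hypothetical allow-list used to validate the raw env value.
const VALID_EAGERNESS: readonly string[] = ['low', 'medium', 'high'];

// Hypothetical helper: returns a valid eagerness level, falling back to
// 'high' (the PR's existing default) and warning on unrecognized input.
function parseEagerness(
  raw: string | undefined,
  warn: (message: string) => void = console.warn
): InworldSTTEagerness {
  if (raw === undefined || raw.trim() === '') return 'high';

  const normalized = raw.trim().toLowerCase();
  if (VALID_EAGERNESS.includes(normalized)) {
    return normalized as InworldSTTEagerness;
  }

  warn(`Invalid INWORLD_STT_EAGERNESS "${raw}"; falling back to "high"`);
  return 'high';
}

// Possible use in the config object (not part of the PR):
//   eagerness: parseEagerness(process.env.INWORLD_STT_EAGERNESS),
```

With a guard like this, the cast happens only after the value is known to be one of the preset keys, so a lookup into inworldSTTPresets cannot return undefined because of a typo in the environment.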

Comment on lines +248 to +252
const silenceChunksThreshold = Math.ceil(
  this.silenceThresholdMs / 100 // chunks are ~100ms each (1600 samples @ 16kHz)
);
const minSpeechChunks = Math.ceil(this.minSpeechMs / 100);

Copilot AI Mar 12, 2026

VAD chunk thresholds are computed using a hard-coded 100ms assumption (silenceThresholdMs / 100, minSpeechMs / 100), but the node also accepts sampleRate and the actual chunk duration depends on the incoming frame size. To avoid mis-tuned VAD if chunk size/sample rate changes, derive the chunk duration from audioData.length and sampleRate (or assert/enforce 1600@16kHz explicitly).
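
A rough sketch of the first option, deriving the chunk duration from the frame itself, might look like this. `chunkDurationMs` and `chunksForDuration` are hypothetical helpers, and the sketch assumes audioData holds one sample per element at the given sampleRate.

```typescript
// Hypothetical helper: duration of one chunk in milliseconds.
// Multiplying before dividing keeps common cases exact (1600 @ 16kHz = 100).
function chunkDurationMs(sampleCount: number, sampleRate: number): number {
  if (sampleCount <= 0 || sampleRate <= 0) {
    throw new RangeError('sampleCount and sampleRate must be positive');
  }
  return (sampleCount * 1000) / sampleRate;
}

// Hypothetical helper: number of chunks needed to cover a duration,
// never less than one so a threshold cannot round down to zero.
function chunksForDuration(durationMs: number, chunkMs: number): number {
  return Math.max(1, Math.ceil(durationMs / chunkMs));
}

// Possible use inside the node (not part of the PR):
//   const chunkMs = chunkDurationMs(audioData.length, this.sampleRate);
//   const silenceChunksThreshold = chunksForDuration(this.silenceThresholdMs, chunkMs);
//   const minSpeechChunks = chunksForDuration(this.minSpeechMs, chunkMs);
```

For the current 1600-sample frames at 16kHz this yields the same thresholds as the hard-coded division by 100, but it stays correct if the frame size or sample rate changes.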

Comment on lines +460 to +474
private sendPartialTranscript(
  sessionId: string,
  interactionId: string,
  text: string
): void {
  const connection = this.connections[sessionId];
  if (!connection?.onPartialTranscript) return;

  try {
    connection.onPartialTranscript(text, interactionId);
  } catch (error) {
    logger.error({ err: error }, 'error_sending_partial_transcript');
  }
}

Copilot AI Mar 12, 2026

sendPartialTranscript() is declared but never called in this node, which adds dead code and suggests partial transcript support that isn't actually implemented. Either remove it or wire it up (e.g., emit interim text when available) so the class API matches its behavior.

Suggested change
private sendPartialTranscript(
  sessionId: string,
  interactionId: string,
  text: string
): void {
  const connection = this.connections[sessionId];
  if (!connection?.onPartialTranscript) return;
  try {
    connection.onPartialTranscript(text, interactionId);
  } catch (error) {
    logger.error({ err: error }, 'error_sending_partial_transcript');
  }
}

4 participants