Add Diarize connector docs #607
Merged
6 commits
9582ef2 add Diarize connector docs with setup guide and usage examples (Pranesh-Raghu)
c32e663 fix coderabbit: hyphenate Scalekit-optimized, add deadline to polling… (Pranesh-Raghu)
5adc6da remove proxy API calls section, replace with execute_tool pattern (Pranesh-Raghu)
6a40990 fix coderabbit: trailing Scalekit tools anchor heading (Pranesh-Raghu)
fed848f fix coderabbit: remove stray code fence, fix job id field name in prose (Pranesh-Raghu)
5b99886 fix: use uppercase status values and simplify transcript handling in … (Pranesh-Raghu)
78 changes: 78 additions & 0 deletions
src/components/templates/agent-connectors/_setup-diarize.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| import { Steps, Aside, Tabs, TabItem } from '@astrojs/starlight/components' | ||
|
|
||
| Register your Diarize API key with Scalekit so it can authenticate and proxy transcription requests on behalf of your users. Unlike OAuth connectors, Diarize uses API key authentication — there is no redirect URI or OAuth flow. | ||
|
|
||
| <Steps> | ||
| 1. ### Get a Diarize API key | ||
|
|
||
| - Sign in to [diarize.io](https://diarize.io) and go to **Settings** → **API Keys**. | ||
|
|
||
| - Click **+ Create New Key**, give it a name (e.g., `Agent Auth`), and confirm. | ||
|
|
||
| - Copy the key value — store it securely, as you will not be able to view it again. | ||
|
|
||
|  | ||
|
|
||
| 2. ### Create a connection in Scalekit | ||
|
|
||
| - In the [Scalekit dashboard](https://app.scalekit.com), go to **Agent Auth** → **Create Connection**. Find **Diarize** and click **Create**. | ||
|
|
||
| - Note the **Connection name** — you will use this as `connection_name` in your code (e.g., `diarize`). | ||
|
|
||
| - Click **Save**. | ||
|
|
||
|  | ||
|
|
||
| 3. ### Add a connected account | ||
|
|
||
| Connected accounts link a specific user identifier in your system to a Diarize API key. Add accounts via the dashboard for testing, or via the Scalekit API in production. | ||
|
|
||
| **Via dashboard (for testing)** | ||
|
|
||
| - Open the connection you created and click the **Connected Accounts** tab → **Add account**. | ||
|
|
||
| - Fill in: | ||
| - **Your User's ID** — a unique identifier for this user in your system (e.g., `user_123`) | ||
| - **API Key** — the Diarize API key you copied in step 1 | ||
|
|
||
| - Click **Create Account**. | ||
|
|
||
|  | ||
|
|
||
| **Via API (for production)** | ||
|
|
||
| <Tabs syncKey="tech-stack"> | ||
| <TabItem label="Node.js"> | ||
| ```typescript | ||
| // Never hard-code API keys — read from secure storage or user input | ||
| const diarizeApiKey = getUserDiarizeKey(); // retrieve from your secure store | ||
|
|
||
| await scalekit.actions.upsertConnectedAccount({ | ||
| connectionName: 'diarize', | ||
| identifier: 'user_123', // your user's unique ID | ||
| credentials: { token: diarizeApiKey }, | ||
| }); | ||
| ``` | ||
| </TabItem> | ||
| <TabItem label="Python"> | ||
| ```python | ||
| # Never hard-code API keys — read from secure storage or user input | ||
| diarize_api_key = get_user_diarize_key() # retrieve from your secure store | ||
|
|
||
| scalekit_client.actions.upsert_connected_account( | ||
| connection_name="diarize", | ||
| identifier="user_123", | ||
| credentials={"token": diarize_api_key} | ||
| ) | ||
| ``` | ||
| </TabItem> | ||
| </Tabs> | ||
|
|
||
| <Aside type="tip" title="Production usage tip"> | ||
| In production, call `upsert_connected_account` (Python) / `upsertConnectedAccount` (Node.js) when a user connects their Diarize account — for example, after they paste their API key into a settings page in your app. | ||
| </Aside> | ||
| </Steps> | ||
|
|
||
| <Aside type="note" title="Supported media sources"> | ||
| Diarize supports YouTube, X (Twitter), Instagram, and TikTok URLs. Direct audio or video file URLs are not supported — the URL must point to a public post on one of these platforms. | ||
| </Aside> |
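
Because only platform post URLs are accepted, it can help to reject unsupported links before submitting a job. The following is a minimal sketch rather than part of the connector: `is_supported_media_url` is a hypothetical helper, and the hostname list is an assumption based on the four platforms named in the note above, not a list published by Diarize.

```python
from urllib.parse import urlparse

# Assumed hostnames for YouTube, X (Twitter), Instagram, and TikTok.
# This set is illustrative and is not confirmed by Diarize or Scalekit.
SUPPORTED_HOSTS = {
    "youtube.com",
    "youtu.be",
    "x.com",
    "twitter.com",
    "instagram.com",
    "tiktok.com",
}


def is_supported_media_url(url: str) -> bool:
    """Return True if the URL is http(s) and its host is on an assumed supported platform."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Accept the bare domain or any subdomain such as "www." or "m.".
    return any(host == h or host.endswith("." + h) for h in SUPPORTED_HOSTS)
```

A check like this only filters by hostname. It cannot tell whether the post is public, so the transcription job can still fail for a URL that passes.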
135 changes: 135 additions & 0 deletions
src/components/templates/agent-connectors/_usage-diarize.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,135 @@ | ||
| import { Tabs, TabItem, Aside } from '@astrojs/starlight/components' | ||
|
|
||
| Connect a user's Diarize account and transcribe audio and video content through Scalekit tools. Scalekit stores the API key and executes tools on the user's behalf, so once the connected account exists your application code does not handle credentials when calling tools. | ||
|
|
||
| Diarize is primarily used through Scalekit tools. Use `execute_tool` to submit transcription jobs, poll for completion, and download results in any supported format. | ||
|
|
||
| ## Tool calling | ||
|
|
||
| Use this connector when you want an agent to transcribe and diarize audio or video from YouTube, X, Instagram, or TikTok. | ||
|
|
||
| - Use `diarize_create_transcription_job` to submit a URL for transcription. Returns an `id` (job ID) and an `estimatedTime` (in seconds) for how long processing will take. | ||
| - Use `diarize_get_job_status` to poll until `status` is `COMPLETED` or `FAILED`. Use `estimatedTime` to set a sensible timeout — do not give up before that time has elapsed. | ||
| - Use `diarize_download_transcript` to retrieve the result once the job is complete. Choose `json` for structured speaker diarization data, or `txt`, `srt`, or `vtt` for plain-text and subtitle formats. | ||
|
|
||
| <Tabs syncKey="tech-stack"> | ||
| <TabItem label="Python"> | ||
| ```python title="examples/diarize_transcribe.py" | ||
| import os, time | ||
| from scalekit.client import ScalekitClient | ||
|
|
||
| scalekit_client = ScalekitClient( | ||
| client_id=os.environ["SCALEKIT_CLIENT_ID"], | ||
| client_secret=os.environ["SCALEKIT_CLIENT_SECRET"], | ||
| env_url=os.environ["SCALEKIT_ENV_URL"], | ||
| ) | ||
|
|
||
| connected_account = scalekit_client.actions.get_or_create_connected_account( | ||
| connection_name="diarize", | ||
| identifier="user_123", | ||
| ).connected_account | ||
|
|
||
| # Step 1: Submit a transcription job | ||
| create_result = scalekit_client.actions.execute_tool( | ||
| tool_name="diarize_create_transcription_job", | ||
| connected_account_id=connected_account.id, | ||
| tool_input={ | ||
| "url": "https://www.youtube.com/watch?v=example", | ||
| "language": "en", # optional — omit for auto-detection | ||
| "num_speakers": 2, # optional — improves speaker diarization | ||
| }, | ||
| ) | ||
| job_id = create_result.result["id"] | ||
| estimated_seconds = create_result.result.get("estimatedTime", 120) | ||
| deadline = time.time() + estimated_seconds * 2 | ||
| print(f"Job {job_id} submitted. Estimated: {estimated_seconds}s") | ||
|
|
||
| # Step 2: Poll until complete | ||
| while True: | ||
| if time.time() > deadline: | ||
| raise TimeoutError(f"Job {job_id} timed out after {estimated_seconds * 2}s") | ||
| time.sleep(15) | ||
| status_result = scalekit_client.actions.execute_tool( | ||
| tool_name="diarize_get_job_status", | ||
| connected_account_id=connected_account.id, | ||
| tool_input={"job_id": job_id}, | ||
| ) | ||
| status = status_result.result["status"] | ||
| print("Status:", status) | ||
| if status == "COMPLETED": | ||
| break | ||
| if status == "FAILED": | ||
| raise RuntimeError(f"Job {job_id} failed") | ||
|
|
||
| # Step 3: Download the diarized transcript | ||
| transcript_result = scalekit_client.actions.execute_tool( | ||
| tool_name="diarize_download_transcript", | ||
| connected_account_id=connected_account.id, | ||
| tool_input={"job_id": job_id, "format": "json"}, | ||
| ) | ||
| # handle the transcript_result | ||
| ``` | ||
| </TabItem> | ||
| <TabItem label="Node.js"> | ||
| ```typescript title="examples/diarize_transcribe.ts" | ||
| import { ScalekitClient } from '@scalekit-sdk/node'; | ||
| import 'dotenv/config'; | ||
|
|
||
| const scalekit = new ScalekitClient( | ||
| process.env.SCALEKIT_ENV_URL!, | ||
| process.env.SCALEKIT_CLIENT_ID!, | ||
| process.env.SCALEKIT_CLIENT_SECRET! | ||
| ); | ||
| const actions = scalekit.actions; | ||
|
|
||
| const { connectedAccount } = await actions.getOrCreateConnectedAccount({ | ||
| connectionName: 'diarize', | ||
| identifier: 'user_123', | ||
| }); | ||
|
|
||
| // Step 1: Submit a transcription job | ||
| const createResult = await actions.executeTool({ | ||
| toolName: 'diarize_create_transcription_job', | ||
| connectedAccountId: connectedAccount.id, | ||
| toolInput: { | ||
| url: 'https://www.youtube.com/watch?v=example', | ||
| language: 'en', // optional — omit for auto-detection | ||
| num_speakers: 2, // optional — improves speaker diarization | ||
| }, | ||
| }); | ||
| const jobId = createResult.data.id; | ||
| const estimatedSeconds = createResult.data.estimatedTime ?? 120; | ||
| const deadline = Date.now() + estimatedSeconds * 2 * 1000; | ||
| console.log(`Job ${jobId} submitted. Estimated: ${estimatedSeconds}s`); | ||
|
|
||
| // Step 2: Poll until complete | ||
| let status = 'PENDING'; | ||
| while (status !== 'COMPLETED' && status !== 'FAILED') { | ||
| if (Date.now() > deadline) throw new Error(`Job ${jobId} timed out after ${estimatedSeconds * 2}s`); | ||
| await new Promise(r => setTimeout(r, 15_000)); | ||
| const statusResult = await actions.executeTool({ | ||
| toolName: 'diarize_get_job_status', | ||
| connectedAccountId: connectedAccount.id, | ||
| toolInput: { job_id: jobId }, | ||
| }); | ||
| status = statusResult.data.status; | ||
| console.log('Status:', status); | ||
| } | ||
| if (status === 'FAILED') throw new Error(`Job ${jobId} failed`); | ||
|
|
||
| // Step 3: Download the diarized transcript | ||
| const transcriptResult = await actions.executeTool({ | ||
| toolName: 'diarize_download_transcript', | ||
| connectedAccountId: connectedAccount.id, | ||
| toolInput: { job_id: jobId, format: 'json' }, | ||
| }); | ||
| // handle the transcriptResult | ||
| ``` | ||
| </TabItem> | ||
| </Tabs> | ||
|
|
||
| <Aside type="note" title="Polling guidance"> | ||
| The `estimatedTime` field (in seconds) tells you how long processing is expected to take. For a 49-minute episode, `estimatedTime` may be around 891 seconds (~15 minutes). Wait at least that long before treating the job as timed out. | ||
| </Aside> | ||
|
|
||
| ## Scalekit tools | ||
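
The polling loop in `_usage-diarize.mdx` can also be factored into a reusable helper, which makes the deadline logic testable without network calls. This is a sketch under stated assumptions: `wait_for_job` and its parameters are hypothetical names, the 2x timeout factor and 15-second interval simply mirror the example in the diff, and `get_status` stands in for whatever callable wraps the `diarize_get_job_status` tool call.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    estimated_seconds: float,
    poll_interval: float = 15.0,
    timeout_factor: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns COMPLETED or FAILED, or the deadline passes.

    The deadline is estimated_seconds * timeout_factor from the first call,
    so the job is never abandoned before its estimated processing time.
    """
    deadline = clock() + estimated_seconds * timeout_factor
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(
                f"job not finished after {estimated_seconds * timeout_factor:.0f}s"
            )
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))
```

With the client from the usage example, `get_status` could be a small function that calls `execute_tool` for `diarize_get_job_status` and returns the `status` field. The `clock` and `sleep` parameters exist only so tests can substitute fakes.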