@tukwila (Contributor) commented Dec 25, 2025

Summary

fix: #522

Details

  • [ ]

Test Plan

guidellm benchmark \
  --target "http://localhost:8080" \
  --request-type "audio_translations" \
  --rate-type "throughput" \
  --rate 1 \
  --max-requests 1 \
  --data "./audio_metadata.csv"

request_id='611ef42c-d249-4f31-87b1-3dca5ef90007' status='queued'
scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766652021.6886919
timings=RequestTimings(targeted_start=None, queued=1766652023.272201,
dequeued=None, scheduled_at=None, resolve_start=None, request_start=None,
first_request_iteration=None, first_token_iteration=None,
last_token_iteration=None, last_request_iteration=None, request_iterations=0,
token_iterations=0, request_end=None, resolve_end=None, finalized=None)
error=None started_at=None completed_at=None


 request_id='611ef42c-d249-4f31-87b1-3dca5ef90007' status='pending'
scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766652021.6886919
timings=RequestTimings(targeted_start=1766652021.6886919,
queued=1766652023.272201, dequeued=1766652023.274364, scheduled_at=None,
resolve_start=None, request_start=None, first_request_iteration=None,
first_token_iteration=None, last_token_iteration=None,
last_request_iteration=None, request_iterations=0, token_iterations=0,
request_end=None, resolve_end=None, finalized=None) error=None started_at=None
completed_at=None


 request_id='611ef42c-d249-4f31-87b1-3dca5ef90007' status='in_progress'
scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766652021.6886919
timings=RequestTimings(targeted_start=1766652021.6886919,
queued=1766652023.272201, dequeued=1766652023.274364,
scheduled_at=1766652023.274364, resolve_start=1766652023.274515,
request_start=None, first_request_iteration=None, first_token_iteration=None,
last_token_iteration=None, last_request_iteration=None, request_iterations=0,
token_iterations=0, request_end=None, resolve_end=None, finalized=None)
error=None started_at=1766652023.274515 completed_at=None


 request_id='611ef42c-d249-4f31-87b1-3dca5ef90007' status='completed'
scheduler_node_id=-1 scheduler_process_id=0
scheduler_start_time=1766652021.6886919
timings=RequestTimings(targeted_start=1766652021.6886919,
queued=1766652023.272201, dequeued=1766652023.274364,
scheduled_at=1766652023.274364, resolve_start=1766652023.274515,
request_start=1766652023.2745988, first_request_iteration=1766652023.282783,
first_token_iteration=1766652023.282783, last_token_iteration=1766652023.282783,
last_request_iteration=1766652023.282783, request_iterations=1,
token_iterations=1, request_end=1766652023.282867,
resolve_end=1766652023.283092, finalized=1766652023.284126) error=None
started_at=1766652023.2745988 completed_at=1766652023.282867
╭─ Benchmarks ─────────────────────────────────────────────────────────────────╮
│ [1… thr… (c… Req:    0.6 req/s,    0.01s Lat,     0.0 Conc,       1 Comp,  … │
│              Tok:    0.0 gen/s,    0.0 tot/s,   8.2ms TTFT,    0.0ms ITL,  … │
╰──────────────────────────────────────────────────────────────────────────────╯
Generating... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ (1/1) [ 0:00:02 < 0:00:00 ]


ℹ Run Summary Info
|============|==========|==========|=====|======|======|======|=====|=====|======|=====|=====|
| Benchmark  | Timings                             ||||| Input Tokens   ||| Output Tokens  |||
| Strategy   | Start    | End      | Dur | Warm | Cool | Comp | Inc | Err | Comp | Inc | Err |
|            |          |          | Sec | Sec  | Sec  | Tot  | Tot | Tot | Tot  | Tot | Tot |
|------------|----------|----------|-----|------|------|------|-----|-----|------|-----|-----|
| throughput | 16:40:21 | 16:40:23 | 1.6 | 0.0  | 0.0  | 0.0  | 0.0 | 0.0 | 0.0  | 0.0 | 0.0 |
|============|==========|==========|=====|======|======|======|=====|=====|======|=====|=====|


ℹ Text Metrics Statistics (Completed Requests)
|============|=======|======|======|======|=======|======|======|======|
| Benchmark  | Output Words            |||| Output Characters       ||||
| Strategy   | Per Request || Per Second || Per Request || Per Second ||
|            | Mdn   | p95  | Mdn  | Mean | Mdn   | p95  | Mdn  | Mean |
|------------|-------|------|------|------|-------|------|------|------|
| throughput | 6.0   | 6.0  | 0.0  | 0.0  | 34.0  | 34.0 | 0.0  | 0.0  |
|============|=======|======|======|======|=======|======|======|======|


ℹ Audio Metrics Statistics (Completed Requests)
|============|=========|=========|======|======|=======|======|======|======|=========|=========|======|======|
| Benchmark  | Input Samples                |||| Input Seconds           |||| Input Bytes                  ||||
| Strategy   | Per Request      || Per Second || Per Request || Per Second || Per Request      || Per Second ||
|            | Mdn     | p95     | Mdn  | Mean | Mdn   | p95  | Mdn  | Mean | Mdn     | p95     | Mdn  | Mean |
|------------|---------|---------|------|------|-------|------|------|------|---------|---------|------|------|
| throughput | 44100.0 | 44100.0 | 0.0  | 0.0  | 3.0   | 3.0  | 0.0  | 0.0  | 25101.0 | 25101.0 | 0.0  | 0.0  |
|============|=========|=========|======|======|=======|======|======|======|=========|=========|======|======|


ℹ Request Token Statistics (Completed Requests)
|============|======|=====|======|======|======|=====|=======|======|=========|========|
| Benchmark  | Input Tok || Output Tok || Total Tok || Stream Iter || Output Tok      ||
| Strategy   | Per Req   || Per Req    || Per Req   || Per Req     || Per Stream Iter ||
|            | Mdn  | p95 | Mdn  | p95  | Mdn  | p95 | Mdn   | p95  | Mdn     | p95    |
|------------|------|-----|------|------|------|-----|-------|------|---------|--------|
| throughput | 0.0  | 0.0 | 0.0  | 0.0  | 0.0  | 0.0 | 1.0   | 1.0  | 0.0     | 0.0    |
|============|======|=====|======|======|======|=====|=======|======|=========|========|


ℹ Request Latency Statistics (Completed Requests)
|============|=========|========|=====|=====|=====|=====|=====|=====|
| Benchmark  | Request Latency || TTFT     || ITL      || TPOT     ||
| Strategy   | Sec             || ms       || ms       || ms       ||
|            | Mdn     | p95    | Mdn | p95 | Mdn | p95 | Mdn | p95 |
|------------|---------|--------|-----|-----|-----|-----|-----|-----|
| throughput | 0.0     | 0.0    | 8.2 | 8.2 | 0.0 | 0.0 | 0.0 | 0.0 |
|============|=========|========|=====|=====|=====|=====|=====|=====|


ℹ Server Throughput Statistics
|============|=====|======|=======|======|=======|=======|========|=======|=======|=======|
| Benchmark  | Requests               |||| Input Tokens || Output Tokens || Total Tokens ||
| Strategy   | Per Sec   || Concurrency || Per Sec      || Per Sec       || Per Sec      ||
|            | Mdn | Mean | Mdn   | Mean | Mdn   | Mean  | Mdn    | Mean  | Mdn   | Mean  |
|------------|-----|------|-------|------|-------|-------|--------|-------|-------|-------|
| throughput | 1.3 | 0.6  | 0.0   | 0.0  | 0.0   | 0.0   | 0.0    | 0.0   | 0.0   | 0.0   |
|============|=====|======|=======|======|=======|=======|========|=======|=======|=======|



✔ Benchmarking complete, generated 1 benchmark(s)

The run finishes with error=None, and the request reaches status='completed'.
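
As a sanity check on the numbers above, the timestamps in the status='completed' log entry can be subtracted by hand. The sketch below is not guidellm code: it assumes TTFT is first_token_iteration minus request_start and that request latency is request_end minus request_start, and the helper name elapsed_ms is hypothetical.

```python
# Timestamps copied from the status='completed' log entry (Unix seconds).
timings = {
    "queued": 1766652023.272201,
    "dequeued": 1766652023.274364,
    "request_start": 1766652023.2745988,
    "first_token_iteration": 1766652023.282783,
    "request_end": 1766652023.282867,
}


def elapsed_ms(start_key: str, end_key: str) -> float:
    """Hypothetical helper: milliseconds between two logged timestamps."""
    return (timings[end_key] - timings[start_key]) * 1000.0


# Assumed definitions; they are inferred from the field names, not from the guidellm source.
ttft_ms = elapsed_ms("request_start", "first_token_iteration")
latency_ms = elapsed_ms("request_start", "request_end")
queue_wait_ms = elapsed_ms("queued", "dequeued")

print(f"TTFT:            {ttft_ms:.1f} ms")
print(f"Request latency: {latency_ms / 1000.0:.2f} s")
print(f"Queue wait:      {queue_wait_ms:.1f} ms")
```

Under these assumptions the TTFT works out to about 8.2 ms and the request latency to about 0.01 s, which agrees with the "8.2ms TTFT" and "0.01s Lat" values in the Benchmarks panel.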

Related Issues

  • Resolves #

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: guangli.bao <guangli.bao@daocloud.io>
@dbutenhof dbutenhof added this to the v0.5.1 milestone Jan 7, 2026
@sjmonson sjmonson requested a review from JennLM January 7, 2026 20:04
@sjmonson sjmonson self-assigned this Jan 7, 2026
Development

Successfully merging this pull request may close these issues.

benchmark on audio_transcriptions fails

4 participants