Add two-run pattern for video_transcode_bench timed mini to reduce runtime by marziehlenjaniMeta · Pull Request #601 · facebookresearch/DCPerf

marziehlenjaniMeta · 2026-04-29T21:02:05Z

Summary:
The video_transcode_bench_svt_timed_mini benchmark takes ~36 seconds of
wall-clock time, but only ~15 seconds of that is actual encoding. Most of
the rest (~19 seconds) is spent downscaling all 42 source clips to their
target resolutions, which is a data preparation step and not representative
of production workloads.

This diff introduces a two-run pattern (similar to DjangoBench and
SparkBench mini) that separates preparation from measurement:

- video_transcode_bench_svt_timed_mini_prep: Downscales a sampled subset
  of clips (20%, ~8 clips) and caches them in resized_clips/ for reuse.
  After this run, the ~42GB source dataset can be deleted to reduce OS image
  size.
- video_transcode_bench_svt_timed_mini_reuse: Skips downscaling entirely,
  reuses cached resized_clips/, and runs only the encoding workload. Errors
  if the cache doesn't exist. Can be run repeatedly.
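The split between the two variants can be illustrated with a small sketch. This is not the code from this diff: the helper names `sample_count` and `require_cache` are hypothetical, and the sketch only assumes what the summary states (a 20% sample of 42 clips, and a hard error when resized_clips/ is missing).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the prep/reuse split; not the code from this diff.

# Number of clips to downscale in the prep run: `percent`% of `total`,
# rounded down, but never fewer than one clip.
sample_count() {
  local total=$1 percent=$2
  local n=$(( total * percent / 100 ))
  if (( n < 1 )); then
    n=1
  fi
  echo "$n"
}

# Guard for the reuse run: fail fast when the cache directory is missing
# or empty, instead of silently benchmarking nothing.
require_cache() {
  local cache_dir=$1
  if [[ ! -d "$cache_dir" ]] || [[ -z "$(ls -A "$cache_dir" 2>/dev/null)" ]]; then
    echo "error: no cached clips in '$cache_dir'; run the prep variant first" >&2
    return 1
  fi
}
```

With 42 source clips and a 20% sample, `sample_count 42 20` yields the ~8 clips mentioned above. A reuse run would call something like `require_cache resized_clips` before starting any encoder.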

Additional changes:

- Split the generated run script so downscaling runs during preprocessing
  (tracked as a downscaling sub-operation in breakdown.csv), and only
  encoding runs during main_benchmark
- Replaced the sleep+SIGTERM time-limiting mechanism in
  timed_parallel_feeder.sh with a deadline-based feeding loop that stops
  dispatching jobs after max_time
- Reduced max_time from 15 to 10 seconds for both new variants
- Added --skip-downscale and --keep-downscaled flags to run.sh
- Updated README with two-run pattern instructions
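As a rough illustration of the deadline-based feeding idea, the sketch below runs jobs one at a time and stops handing out new ones once max_time has elapsed. It is a simplified, single-worker assumption and not the contents of timed_parallel_feeder.sh; the function name `feed_until_deadline` and the global `dispatched_jobs` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical single-worker sketch of a deadline-based feeder.
# Usage: feed_until_deadline MAX_TIME_SECONDS JOB...
# Sets the global dispatched_jobs to the number of jobs that were started.
feed_until_deadline() {
  local max_time=$1
  shift
  # SECONDS is bash's built-in elapsed-time counter (whole seconds only).
  local deadline=$(( SECONDS + max_time ))
  local job
  dispatched_jobs=0
  for job in "$@"; do
    # Once the deadline has passed, stop dispatching; nothing is killed.
    if (( SECONDS >= deadline )); then
      break
    fi
    dispatched_jobs=$(( dispatched_jobs + 1 ))
    # A job that has already started is allowed to run to completion.
    bash -c "$job" || true
  done
}
```

Unlike a sleep followed by SIGTERM, this approach never interrupts a job mid-run, so every started job produces a complete result. The trade-off is that the total runtime can exceed max_time by up to the duration of the last job.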

Differential Revision: D102686096


meta-codesync · 2026-04-29T21:02:13Z

@marziehlenjaniMeta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D102686096.

…ntime (#601)

Summary: Pull Request resolved: #601

Reviewed By: YifanYuan3

Differential Revision: D102686096

fbshipit-source-id: 777d78f45f9fa747b10bb54ba5876988d9b17eff

meta-cla Bot added the CLA Signed label Apr 29, 2026

meta-codesync Bot added the fb-exported and meta-exported labels Apr 29, 2026

marziehlenjaniMeta wants to merge 1 commit into facebookresearch:v2-beta from marziehlenjaniMeta:export-D102686096-to-v2-beta
