Add two-run pattern for video_transcode_bench timed mini to reduce runtime #601

Open
marziehlenjaniMeta wants to merge 1 commit into facebookresearch:v2-beta from marziehlenjaniMeta:export-D102686096-to-v2-beta

@marziehlenjaniMeta

Summary:
The video_transcode_bench_svt_timed_mini benchmark takes ~36 seconds
wall-clock, but only ~15 seconds is actual encoding. Most of the rest (~19
seconds) is spent downscaling all 42 source clips to target resolutions, a
data preparation step that is not representative of production workloads.

This diff introduces a two-run pattern (similar to DjangoBench and
SparkBench mini) that separates preparation from measurement:

  • video_transcode_bench_svt_timed_mini_prep: Downscales a sampled subset
    of clips (20%, ~8 clips) and caches them in resized_clips/ for reuse.
    After this run, the ~42GB source dataset can be deleted to reduce OS image
    size.
  • video_transcode_bench_svt_timed_mini_reuse: Skips downscaling entirely,
    reuses cached resized_clips/, and runs only the encoding workload. Errors
    if the cache doesn't exist. Can be run repeatedly.
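
The prep/reuse split described above can be sketched roughly as follows. This is an illustration in Python, not the PR's shell implementation: the function names `prep`, `reuse`, and `demo`, the fixed sampling seed, the `.y4m` file names, and the use of a file copy as a stand-in for the real downscale step are all assumptions made for the example.

```python
from __future__ import annotations

import random
import shutil
import tempfile
from pathlib import Path


def prep(source_dir: Path, cache_dir: Path, fraction: float = 0.2, seed: int = 0) -> list[Path]:
    """Prep run: process a sampled subset of clips and cache the results."""
    clips = sorted(p for p in source_dir.iterdir() if p.is_file())
    if not clips:
        raise FileNotFoundError(f"no source clips found in {source_dir}")
    count = max(1, round(len(clips) * fraction))
    sampled = sorted(random.Random(seed).sample(clips, count))
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = []
    for clip in sampled:
        # Stand-in for the real downscale step (assumption: a plain copy).
        target = cache_dir / clip.name
        shutil.copyfile(clip, target)
        cached.append(target)
    return cached


def reuse(cache_dir: Path) -> list[Path]:
    """Reuse run: skip downscaling and fail fast if the cache is missing."""
    clips = sorted(cache_dir.iterdir()) if cache_dir.is_dir() else []
    if not clips:
        raise FileNotFoundError(f"{cache_dir} is missing or empty; run the prep variant first")
    return clips  # the encoding workload would consume these clips


def demo() -> tuple[int, int]:
    """Run prep once on 42 fake clips, delete the sources, then reuse the cache."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source_clips"
        source.mkdir()
        for i in range(42):
            (source / f"clip_{i:02d}.y4m").write_bytes(b"fake")
        cache = Path(tmp) / "resized_clips"
        prepared = prep(source, cache)
        shutil.rmtree(source)  # the source dataset is no longer needed after prep
        reused = reuse(cache)
        return len(prepared), len(reused)
```

With 42 source clips and a 20% sample, `prep` caches 8 clips, and `reuse` keeps working after the source directory is deleted, which is the property that lets the dataset be dropped from the OS image.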

Additional changes:

  • Split the generated run script so downscaling runs during preprocessing
    (tracked as a downscaling sub-operation in breakdown.csv), and only
    encoding runs during main_benchmark
  • Replaced the sleep+SIGTERM time-limiting mechanism in
    timed_parallel_feeder.sh with a deadline-based feeding loop that stops
    dispatching jobs after max_time
  • Reduced max_time from 15 to 10 seconds for both new variants
  • Added --skip-downscale and --keep-downscaled flags to run.sh
  • Updated README with two-run pattern instructions
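
As a rough illustration of the deadline-based feeding loop mentioned above, the sketch below stops handing out jobs once max_time has elapsed, instead of sleeping and then sending SIGTERM. It is written in Python for clarity and is not the code in timed_parallel_feeder.sh; the names `feed_until_deadline`, `dispatch`, and `clock` are hypothetical, and how already-dispatched jobs are handled is outside the scope of the sketch.

```python
import time
from typing import Callable, Iterable, List, TypeVar

Job = TypeVar("Job")


def feed_until_deadline(
    jobs: Iterable[Job],
    dispatch: Callable[[Job], None],
    max_time: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[Job]:
    """Dispatch jobs in order until max_time seconds have elapsed."""
    deadline = clock() + max_time
    dispatched: List[Job] = []
    for job in jobs:
        if clock() >= deadline:
            break  # deadline reached: stop feeding new jobs
        dispatch(job)
        dispatched.append(job)
    return dispatched
```

The clock is injectable so the loop can be exercised without real waiting. For example, if each dispatch advances a fake clock by 3 seconds and max_time is 10, jobs are handed out at t = 0, 3, 6, and 9, and the fifth job is never dispatched.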

Differential Revision: D102686096

meta-cla Bot added the CLA Signed label Apr 29, 2026
meta-codesync Bot commented Apr 29, 2026

@marziehlenjaniMeta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D102686096.

meta-codesync Bot pushed a commit that referenced this pull request Apr 29, 2026
…ntime (#601)

Summary:
Pull Request resolved: #601

Reviewed By: YifanYuan3

Differential Revision: D102686096

fbshipit-source-id: 777d78f45f9fa747b10bb54ba5876988d9b17eff

Labels

CLA Signed, fb-exported, meta-exported
