Add S3/R2 sync check tool and split SHA256 management out of manage_v2.py #7947

Open
atalman wants to merge 1 commit into pytorch:main from atalman:fix_s3_wheels_out_of_sync

atalman (Contributor) commented Apr 9, 2026

Summary

  • Adds s3_management/index_tools.py — a new CLI tool for verifying and fixing
    PyTorch wheel integrity across S3 and Cloudflare R2.
  • Moves all SHA256 checksum management commands (--set-checksum,
    --recompute-sha256-pattern, --recompute-missing-sha256) out of
    manage_v2.py into index_tools.py, so that manage_v2.py is focused
    solely on index HTML generation/upload.
  • manage_v2.py still reports missing SHA256 checksums during index generation.

Motivation

Binaries on S3 and R2 can get out of sync (different SHA256 content hashes for
the same key), causing pip install failures with hash mismatch errors.

Fixes pytorch/pytorch#145501
Fixes pytorch/pytorch#179821

New capabilities in index_tools.py

S3/R2 sync verification (--check-r2-sync):

  • Downloads .whl and .whl.metadata files from both S3 and R2 for a given
    package/version, computes SHA256 from actual content, and reports mismatches
    or files missing on R2.
  • When the version has no + local-version suffix, matches all local-version
    variants (e.g., 2.9.0 matches 2.9.0+cu129, 2.9.0+cu124, 2.9.0+cpu, etc.).
  • Correctly excludes child prefixes (e.g., whl/ does not scan whl/nightly/
    or whl/test/).
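
The matching and comparison rules above can be sketched roughly as follows. This is a minimal illustration, not the code in index_tools.py: the function names (sha256_of_stream, version_matches, in_scope, compare_digests), the status strings, and the list of known prefixes are all hypothetical, and the real tool may structure this differently.

```python
import hashlib
import io
from typing import BinaryIO, Dict, List


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash content in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def version_matches(wheel_version: str, requested: str) -> bool:
    """A request without '+' matches every local-version variant."""
    if "+" in requested:
        return wheel_version == requested
    return wheel_version.split("+", 1)[0] == requested


def in_scope(key: str, prefix: str, known_prefixes: List[str]) -> bool:
    """True if key is under prefix but not under another known child prefix.

    known_prefixes is an assumed list such as ["whl", "whl/nightly", "whl/test"].
    """
    base = prefix.rstrip("/") + "/"
    if not key.startswith(base):
        return False
    for other in known_prefixes:
        other_base = other.rstrip("/") + "/"
        is_child = other_base != base and other_base.startswith(base)
        if is_child and key.startswith(other_base):
            return False
    return True


def compare_digests(s3: Dict[str, str], r2: Dict[str, str]) -> Dict[str, str]:
    """Classify each S3 key by comparing its digest with the R2 digest."""
    report = {}
    for key, s3_digest in s3.items():
        if key not in r2:
            report[key] = "MISSING_ON_R2"
        elif r2[key] != s3_digest:
            report[key] = "MISMATCH"
        else:
            report[key] = "OK"
    return report
```

In this sketch the digests would come from streaming each object body (for example a download response) through sha256_of_stream; S3 is treated as the reference side, so keys that exist only on R2 are not reported.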

S3→R2 sync repair (--fix-r2-sync):

  • Same as --check-r2-sync, but also copies mismatched/missing files from S3
    (source of truth) to R2.
  • After copy, purges Cloudflare CDN cache at download-r2.pytorch.org for
    each fixed file (requires CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN
    env vars).
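
For reference, the cache purge step might look something like the sketch below. It builds (but does not send) a request to Cloudflare's purge-by-URL endpoint. The helper names (public_url, keys_to_repair, build_purge_request) and the status strings are hypothetical, and percent-encoding the key (so that + becomes %2B) is an assumption about how the cached URLs are keyed; the actual code in index_tools.py may differ.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

R2_HOST = "https://download-r2.pytorch.org"
PURGE_API = "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache"


def public_url(key: str) -> str:
    """Public CDN URL for an object key (assumes percent-encoded paths)."""
    return f"{R2_HOST}/{urllib.parse.quote(key)}"


def keys_to_repair(report: Dict[str, str]) -> List[str]:
    """Keys whose status is anything other than OK, in sorted order."""
    return sorted(key for key, status in report.items() if status != "OK")


def build_purge_request(
    zone_id: str, api_token: str, keys: List[str]
) -> urllib.request.Request:
    """Build a POST request that purges the given files from the CDN cache."""
    body = json.dumps({"files": [public_url(k) for k in keys]}).encode("utf-8")
    return urllib.request.Request(
        PURGE_API.format(zone_id=zone_id),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

A caller would read the zone ID and token from CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN and pass the request to urllib.request.urlopen after the S3→R2 copy has succeeded. The copy itself is not shown here.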

Usage examples

# Check sync status for torch 2.9.0 (all CUDA/CPU variants):
python s3_management/index_tools.py whl --check-r2-sync \
    --package-name torch --package-version 2.9.0

# Check a specific variant:
python s3_management/index_tools.py whl --check-r2-sync \
    --package-name torch --package-version 2.9.0+cu129

# Fix mismatches (copy S3→R2 + purge CDN cache):
python s3_management/index_tools.py whl --fix-r2-sync \
    --package-name torch --package-version 2.9.0+cu129

# SHA256 checksum commands (moved from manage_v2.py):
python s3_management/index_tools.py whl/test --set-checksum \
    --package-name torch --package-version 2.5.0+cu121
python s3_management/index_tools.py whl/nightly --recompute-missing-sha256

Test plan

  • Run --check-r2-sync for a known-good package/version, confirm all show OK
  • Run --check-r2-sync for torch 2.9.0+cu129 to confirm it detects the
    mismatch from issue #179821
  • Run --fix-r2-sync for the mismatched package, confirm copy succeeds and
    CDN cache is purged
  • Re-run --check-r2-sync after fix to confirm all OK
  • Run manage_v2.py whl --do-not-upload to confirm index generation still
    works and reports missing checksums
  • Run --recompute-missing-sha256 via index_tools.py to confirm moved
    SHA256 commands still work

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 9, 2026

vercel bot commented Apr 9, 2026

@atalman is attempting to deploy a commit to the Meta Open Source Team on Vercel.

A member of the Team first needs to authorize it.
