From cae306f9b21025aebdffc32e4f7a6737c35aed3d Mon Sep 17 00:00:00 2001 From: adil-a Date: Tue, 21 Apr 2026 06:14:10 +0000 Subject: [PATCH] fix: bump hf_kl_threshold for customizer_llama_3_2_1b_full_sft_chat After the transformers 5.3 -> 5.5 upgrade (#1734) the vanilla HF Llama 3.2 1B forward diverges slightly from the FSDP2 + kernel-patched training-time forward at Phase 4 of the checkpoint robustness test. Phase 3 max KL is still exactly 0 (save/reload is bit-exact), but Phase 4 max KL climbs to ~6.9e-3, overshooting the pre-v5.5 5e-3 threshold. Bumps ci.checkpoint_robustness.hf_kl_threshold from 5e-3 to 2.5e-2 (~3.6x margin over the observed 6.93e-3), matching the pattern already applied to gemma_3_270m_squad (#1932) and qwen2_5_7b_squad (#1937). Evidence on cw-dfw 8xH100 with CI launcher overrides, transformers 5.5.0: [Phase 3] max KL = 0.000000e+00 [Phase 4] max KL = 6.926899e-03 (threshold 2.5e-2) [Phase 6] Step 5 / 6 / 7 diff = 0.000000e+00 (3 steps compared) 1 passed, 24 warnings in 61.42s Signed-off-by: Adil Asif Signed-off-by: adil-a --- .../llama3_2/customizer_llama_3_2_1b_full_sft_chat.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/llm_finetune/llama3_2/customizer_llama_3_2_1b_full_sft_chat.yaml b/examples/llm_finetune/llama3_2/customizer_llama_3_2_1b_full_sft_chat.yaml index bb87246a87..9813bb8e66 100644 --- a/examples/llm_finetune/llama3_2/customizer_llama_3_2_1b_full_sft_chat.yaml +++ b/examples/llm_finetune/llama3_2/customizer_llama_3_2_1b_full_sft_chat.yaml @@ -169,5 +169,5 @@ ci: time: "00:30:00" nproc_per_node: 1 checkpoint_robustness: - hf_kl_threshold: 5e-3 + hf_kl_threshold: 2.5e-2 tokenizer_name: meta-llama/Llama-3.2-1B-Instruct