MaxText upstream already supports TE quantization. For optimal NVFP4 and MXFP8 performance, however, quantization checkpointing support should also be upstreamed to MaxText. Without it, the same tensors are quantized twice: once in the forward pass and again when the forward pass is rematerialized during the backward pass.
This work was started in AI-Hypercomputer/maxtext#2773 but still needs to be completed.
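
To illustrate the double-quantization problem, here is a minimal sketch in plain JAX. It is not MaxText or TE code: `fake_quantize`, `layer`, and the tag `"quantized_x"` are hypothetical names, and the scale-and-round step is only a stand-in for a real NVFP4/MXFP8 cast. It assumes the general idea is to tag the quantized tensor and have the rematerialization policy save it, so the backward pass reuses it rather than quantizing again. The actual TE integration may need a different mechanism.

```python
import jax
import jax.numpy as jnp
from jax.ad_checkpoint import checkpoint_name


def fake_quantize(x):
    # Hypothetical stand-in for a low-precision cast: scale, round, rescale.
    scale = jnp.max(jnp.abs(x)) / 127.0
    return jnp.round(x / scale) * scale


def layer(w, x):
    qx = fake_quantize(x)
    # Tag the quantized activation so a remat policy can refer to it by name.
    qx = checkpoint_name(qx, "quantized_x")
    return jnp.sum(jnp.tanh(qx @ w))


# Full rematerialization: the backward pass re-runs the layer body,
# including fake_quantize, to rebuild the residuals it needs.
remat_all = jax.checkpoint(layer)

# Same layer, but the policy marks the tagged quantized tensor as saveable,
# so the backward pass can reuse it and need not recompute it.
remat_keep_q = jax.checkpoint(
    layer,
    policy=jax.checkpoint_policies.save_only_these_names("quantized_x"),
)

w = jnp.full((4, 3), 0.1)
x = jnp.arange(8.0).reshape(2, 4)

g_all = jax.grad(remat_all)(w, x)
g_keep = jax.grad(remat_keep_q)(w, x)
```

Both variants should produce the same gradients. They differ only in whether the quantized tensor is kept as a residual or rebuilt in the backward pass, which trades a small amount of memory for skipping the second quantization.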