I'm implementing muP for the OLMo model, and am facing an issue with the coordinate check.


The l1 that keeps increasing is the one for the network output. Following the docs, I set the readout init and the query init to zero. I also made sure that the initialization is applied after set_base_shapes is called.
What other things can I check to debug the issue?
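
One thing that might help narrow this down is to record the l1 per module at each width and see which module starts growing first. Below is a rough sketch of that idea in plain Python; it is not part of the mup package, the helper names (`loglog_slope`, `flag_growing_modules`), the module names, and the `tol` threshold are all hypothetical, and the numbers are made up for illustration rather than measured on OLMo.

```python
import math


def loglog_slope(widths, l1_values):
    """Least-squares slope of log(l1) against log(width).

    A slope near 0 means the l1 is roughly width-independent;
    a clearly positive slope means it grows with width.
    """
    xs = [math.log(w) for w in widths]
    ys = [math.log(v) for v in l1_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def flag_growing_modules(records, widths, tol=0.1):
    """Return the names of modules whose log-log slope exceeds tol.

    records maps a module name to its l1 values, one per width,
    in the same order as widths. tol is an arbitrary cutoff.
    """
    return [
        name
        for name, values in records.items()
        if loglog_slope(widths, values) > tol
    ]


# Made-up illustrative values, NOT measurements from a real model.
widths = [256, 512, 1024, 2048]
records = {
    "attn_out": [0.50, 0.51, 0.49, 0.50],  # roughly flat across widths
    "logits": [0.10, 0.20, 0.40, 0.80],    # doubles whenever width doubles
}

print(flag_growing_modules(records, widths))
```

If only the output is flagged while every hidden module stays flat, that would point at the readout layer (its multiplier or init) rather than at the body of the network.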