Hello, thank you for your excellent work.
I am trying to fine-tune the checkpoint on the nuScenes dataset, following the depth loss implementation in the VGGT repository for depth and camera transformation prediction:
https://github.com/facebookresearch/vggt/blob/main/training/loss.py
However, I observed that the 'loss_conf_depth' term becomes very large for some scenes (while loss_reg_depth and loss_grad_depth seem normal), as shown in the figure. Did you encounter a similar issue when training the model? Do you have any suggestions for stabilizing the depth confidence during fine-tuning?
