Conversation

@Advaitgaur004 (Contributor)
This PR fixes a tricky bug in our softmax backward pass, where the gradients weren't matching PyTorch's results.

Fixed GradFn_softmax so it correctly computes the full Jacobian-vector product, and updated Tensor_backward to recognise "Softmax" as a special case: it now uses the gradient from GradFn_softmax directly, skipping the extra multiplication by the upstream gradient (which GradFn_softmax has already consumed). The sketch below illustrates the computation involved.
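For context, the softmax backward pass cannot be a simple elementwise multiply: given the forward output y and the upstream gradient g, the full Jacobian-vector product is dL/dx_i = y_i * (g_i - Σ_j g_j y_j). Here is a minimal, self-contained C sketch of that computation over a flat float array. The function name `softmax_backward_row` and its signature are illustrative only, assumed for this example, and are not the actual GradFn_softmax interface from this PR.

```c
#include <stddef.h>

/* Sketch of the softmax Jacobian-vector product for one row of n elements
 * (hypothetical signature, not the real GradFn_softmax):
 *
 *   dL/dx_i = y_i * (g_i - sum_j g_j * y_j)
 *
 * y        : softmax output from the forward pass
 * grad_out : upstream gradient g (dL/dy)
 * grad_in  : result dL/dx, written in place
 */
static void softmax_backward_row(const float* y, const float* grad_out,
                                 float* grad_in, size_t n) {
    float dot = 0.0f;
    for (size_t j = 0; j < n; j++) {
        dot += grad_out[j] * y[j];           /* sum_j g_j * y_j */
    }
    for (size_t i = 0; i < n; i++) {
        grad_in[i] = y[i] * (grad_out[i] - dot);
    }
}
```

Because the upstream gradient grad_out is already folded into this product, a generic backward step that multiplied the result by grad_out again would apply it twice; that is why Tensor_backward treats "Softmax" as a special case and takes the gradient from GradFn_softmax as-is.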

@PrimedErwin merged commit 51b1312 into pocketpy:test on Aug 1, 2025.
5 checks passed.
@Advaitgaur004 deleted the gradfn_softmax_fix branch on August 1, 2025 at 12:14.