Fix memory scaling calculation for non-periodic boundary conditions #412
orionarcher merged 5 commits into main from
Conversation
if all(state.pbc):
    volume = torch.abs(torch.linalg.det(state.cell[0])) / 1000
else:
    bbox = state.positions.max(dim=0).values - state.positions.min(dim=0).values
Would using this as the general case fail? For example, if we had a 2D system or a surface with a lot of vacuum, the cell is not useful for determining the density.
Good point. Neither is perfect in those cases, but I agree a bounding box is better than the cell. I am happy to make it the general case. Does a clamp value of 2 Å make sense to you? It's needed for flat molecules like benzene (though benzene isn't actually flat).
A clamp is harder to reason about because there are some systems, say metallic lithium, where the unit cell is ~1 Å.
I guess the limiting case is something flat like anthracene in a vacuum. If you have two molecules far enough apart, then this heuristic would fail.
n_atoms*density is just to say it scales as the number of nearest neighbors? Why not call a neighbor-list algorithm to estimate and go based on that?
I think that's right, but I am not quite intuiting how that would resolve the 2D vs 3D tradeoff. What would that look like in practice: call the neighbor list with a 5-6 Å cutoff and then calculate number_density from that?
A couple of drawbacks:
- a switch wouldn't be backwards compatible; any users' saved n_atoms_x_density metrics would need to be recomputed (though that shouldn't stop us)
- it's more expensive and needs to be executed on every system
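To make the neighbor-list idea concrete, here is a minimal sketch of estimating a local number density from pairwise distances within a cutoff. This is not the PR's code: the function name, the brute-force `torch.cdist` approach (a real neighbor-list implementation would be cell-based), and the cutoff default are all illustrative assumptions.

```python
import torch


def local_number_density(positions: torch.Tensor, cutoff: float = 5.0) -> float:
    """Illustrative sketch (not the PR's implementation): estimate number
    density (atoms per cubic Angstrom) from the mean neighbor count within
    `cutoff` Angstrom, using brute-force pairwise distances.
    """
    n = positions.shape[0]
    dists = torch.cdist(positions, positions)  # (n, n) pairwise distances
    # count pairs within the cutoff, excluding each atom's zero self-distance
    neighbor_pairs = (dists < cutoff).sum().item() - n
    mean_neighbors = neighbor_pairs / n
    sphere_volume = (4.0 / 3.0) * torch.pi * cutoff**3  # Angstrom^3
    return mean_neighbors / sphere_volume
```

This makes the tradeoff above tangible: it is O(n^2) and must run on every system, which is the extra cost mentioned.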
I think we should probably use a bounding box algorithm to start (it's very fast and simple logic).
If we do want to do something more complicated (using neighbor lists), we'd either have to calculate it twice or refactor a bit of code to get it to work well (since I think we determine the batches before we calculate the neighbors). If we do want to support neighbor-list-based memory scaling, we'd probably make a new memory_scales_with kind.
I agree. I'll just do the bounding box and add 2 Å in every non-periodic direction to account for 2D systems and slabs.
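The agreed approach (bounding box plus 2 Å of padding along each non-periodic axis) could look roughly like the sketch below. The function name, signature, and units are assumptions for illustration, not the PR's actual API; positions are taken to be in Angstrom and the result in nm^3, matching the /1000 conversion discussed elsewhere in this thread.

```python
import torch


def bbox_volume_nm3(
    positions: torch.Tensor, pbc: torch.Tensor, pad: float = 2.0
) -> float:
    """Illustrative sketch: bounding-box volume with padding on non-periodic
    axes. positions is (n_atoms, 3) in Angstrom; pbc is a (3,) bool mask.
    """
    extent = positions.max(dim=0).values - positions.min(dim=0).values
    # add `pad` Angstrom along every non-periodic direction so flat systems
    # (a slab, or a planar molecule like benzene) don't collapse to zero volume
    extent = extent + pad * (~pbc).to(extent.dtype)
    return (extent.prod() / 1000.0).item()  # Angstrom^3 -> nm^3
```

Note that without the padding, a perfectly planar system has one zero extent and the product is zero, which is exactly the failure mode the 2 Å addition guards against.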
| if memory_scales_with == "n_atoms_x_density": | ||
| volume = torch.abs(torch.linalg.det(state.cell[0])) / 1000 | ||
| if all(state.pbc): | ||
| volume = torch.abs(torch.linalg.det(state.cell[0])) / 1000 |
I know it's not added in this PR, but this 1000 feels like a magic number to me. We should at minimum have a comment explaining it, or at most make it configurable.
I'll add a comment, once I remember what the 1000 is for... I think it's an Å^3 -> nm^3 conversion.
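Assuming the Å^3 -> nm^3 reading is right, the factor checks out arithmetically: 1 nm = 10 Å, so 1 nm^3 = 10^3 = 1000 Å^3. A tiny sanity check (the cell values here are made up for illustration):

```python
import torch

# A 10 Angstrom cubic cell: det(cell) = 1000 A^3, which is exactly 1 nm^3,
# since 1 nm = 10 A and 10**3 = 1000 -- hence the division by 1000.
cell = torch.eye(3) * 10.0
volume_nm3 = torch.abs(torch.linalg.det(cell)) / 1000
```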
I've implemented the bounding box solution + 2 Å in every non-periodic direction. I think that's a good intermediate solution that covers the 2D and slab cases reasonably well.
I think we should just remove:
if all(state.pbc):
    volume = torch.abs(torch.linalg.det(state.cell[0])) / 1000
else:
and just always use the for loop you added to calculate the bbox.
This is because when people make a vacuum (e.g. for a catalyst simulation) they may just say that all axes are periodic (when only the x and z axes are periodic). In that case, we would incorrectly calculate the volume.
Summary
When the MACE model (or other n_atoms_x_density-based models) was run with non-periodic systems, it would fail because the cell is zeroed out. I am replacing the volume calculation with a bounding box of the positions. Is this a reasonable solution?
Checklist
Before a pull request can be merged, the following items must be checked: