Implement lbfgs and bfgs #365
Conversation
See detailed comment here.
orionarcher
left a comment
Going to leave the physics review to someone else, just one API comment
curtischong
left a comment
Just learned the algorithms today, so I may have missed things. Great work, especially because the ASE implementation isn't very well documented.
sys_max = torch.zeros(state.n_systems, device=device, dtype=dtype)
sys_max.scatter_reduce_(0, state.system_idx, norms, reduce="amax", include_self=False)

# Scaling factors per system: <= 1.0
Consider saying "scale down step so atoms move at most max_step" or "# Scale step if it exceeds max_step", like you mention in bfgs.
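To make the quoted snippet concrete, here is a self-contained sketch of the per-system step capping it implements. The function name, shapes, and `max_step` default are illustrative assumptions for this comment, not TorchSim's actual API:

```python
import torch

# Hedged sketch of per-system step capping; names and defaults are
# assumptions for illustration, not TorchSim's real interface.
def scale_steps_per_system(
    dr: torch.Tensor,          # proposed per-atom steps, shape (n_atoms, 3)
    system_idx: torch.Tensor,  # atom -> system mapping, shape (n_atoms,)
    n_systems: int,
    max_step: float = 0.2,
) -> torch.Tensor:
    norms = dr.norm(dim=-1)  # per-atom displacement magnitudes
    sys_max = torch.zeros(n_systems, device=dr.device, dtype=dr.dtype)
    # Largest atomic displacement in each system via an amax scatter-reduce
    sys_max.scatter_reduce_(0, system_idx, norms, reduce="amax", include_self=False)
    # Scale down step so atoms move at most max_step (factors <= 1.0)
    scale = (max_step / sys_max.clamp(min=1e-12)).clamp(max=1.0)
    return dr * scale[system_idx].unsqueeze(-1)
```

Because the scale factor is computed per system, one structure with a large proposed step does not shrink the steps of the other structures in the batch.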
Thanks for the comments. I will update those over the weekend.
@abhijeetgangan is this still on the docket for you? I know it's a much requested feature so it'd be awesome to get it in. Thanks for your time on it.
Will prioritize this over the weekend.
Hi @sihoonchoi and @Andrew-S-Rosen, the integration is pretty much done. I also tested on 1K structures from WBM with MACE, comparing FIRE and L-BFGS (with cell filters) at fmax of 0.02 eV/A and fmax of 1e-4 eV/A. Clearly, FIRE can sometimes struggle to relax, especially for small fmax; it did not finish all structures even after 500 steps. BFGS is working but needs line search to be competitive. That's going to be a separate PR. I am going to merge this since the tests are very comprehensive, but it would be nice if someone could try it out.
@abhijeetgangan I've tried L-BFGS and BFGS along with FIRE with UMA-OMC on 100 structures and it works well. This is great - thanks for your work! Below is a comparison with serial relaxations in ASE.
@sihoonchoi Thanks for looking into it. This is very valuable! There are more improvements that can be made on top of this, like introducing line search and trust regions. Some of these are batching-friendly and some are not. I am pretty sure the optimization algorithms can be made more robust with them (ASE doesn't use many of these known tricks). I would be curious what the community thinks, but given that there are no benchmarks in matsci it's hard to make comparisons. Maybe the folks at Rowan can set this up? Would like to know what others think.
I personally find it a bit wild that we don't have good benchmarks on this, but I guess it's historically because DFT is expensive to run. Rowan doesn't have a ton of infra for materials but they have obviously thought about this. Actually, my group has a funded NSF collaboration with the Argonne folks supporting TorchSim (Ben Blaiszik et al) that might be relevant here. That might be a good avenue to kickstart this. I'd be interested in seeing if we can start something through that and open it up to the community to participate. |
One thing I always found confusing was comparing MLIPs vs DFT for relaxation. MLIPs would use FIRE, but VASP uses CG and QE uses BFGS, both with line search and trust regions, which makes them very different. Another aspect is the cost; I am sure the devs of DFT codes chose these methods to reduce the number of DFT calls during relaxation. I think a fair comparison would be ideal, but that means moving away from exact ASE behavior.
I think this is a great idea! Maybe we can start an email thread to discuss this? I can add more optimizers to test.
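Since line search comes up repeatedly in this thread as the planned follow-up, here is a minimal backtracking (Armijo) sketch of the idea: shrink a trial step until it yields sufficient energy decrease. `energy_fn` and all parameter names are illustrative assumptions, not TorchSim's API:

```python
import torch

# Minimal backtracking (Armijo) line-search sketch; `energy_fn` returns
# (energy, forces) and all names are illustrative, not TorchSim's API.
def backtracking_alpha(positions, direction, energy_fn,
                       alpha=1.0, c1=1e-4, tau=0.5, max_iter=10):
    e0, forces = energy_fn(positions)
    # Directional derivative of the energy along `direction` (forces = -grad E)
    g0 = -(forces * direction).sum()
    for _ in range(max_iter):
        e_new, _ = energy_fn(positions + alpha * direction)
        if e_new <= e0 + c1 * alpha * g0:  # Armijo sufficient-decrease test
            return alpha
        alpha *= tau  # shrink the trial step and retry
    return alpha
```

Each backtracking iteration costs one extra energy/force evaluation, which is why DFT codes lean on these tricks: a few cheap rejections can save many expensive full optimizer steps.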
I think the reason people have been using MLIPs with FIRE is simply a cultural thing, not based on much other than the original FIRE paper. FIRE isn't in VASP (without external plugins), but yeah, CG is simple enough and works.
Sure, happy to kick off an email tomorrow. Can you send me an email from your account? I'm not sure what your email is.
Sounds good! abhijeetgangan@g.ucla.edu |
Summary
Checklist
Before a pull request can be merged, the following items must be checked: