Neighborlist database is huge, option not to save DB
The neighborlists are quite fast to calculate (much faster than fingerprints), but take up a huge amount of space relative to the fingerprints. The database reaches many GB quite quickly.
Maybe there should be an option to calculate them on the fly without using a DB, or to skip saving them to a database afterwards?
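To make the request concrete, here is a minimal sketch of what "calculate on the fly, optionally cache" could look like. It is not the package's actual code: `compute_neighborlist`, `get_neighborlist`, and the `cache` argument are hypothetical names, and the brute-force search ignores periodic boundaries for brevity.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Position = Tuple[float, float, float]


def compute_neighborlist(positions: Sequence[Position],
                         cutoff: float) -> List[List[int]]:
    """Brute-force O(N^2) neighbor search (non-periodic, illustration only)."""
    neighbors: List[List[int]] = [[] for _ in positions]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def get_neighborlist(key: str,
                     positions: Sequence[Position],
                     cutoff: float,
                     cache: Optional[Dict[str, List[List[int]]]] = None
                     ) -> List[List[int]]:
    """Return the neighborlist for one image.

    If `cache` is None (hypothetical "do not save" option), the result is
    recomputed every time and nothing is stored.
    """
    if cache is not None and key in cache:
        return cache[key]
    result = compute_neighborlist(positions, cutoff)
    if cache is not None:
        cache[key] = result
    return result
```

Calling `get_neighborlist("image-0", positions, 6.5)` with no `cache` recomputes each time and stores nothing, which is the behavior being asked for; passing a dict keeps the current save-and-reuse behavior.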
Comments (4)
-
reporter The fingerprint-derivatives are even larger, but both are much larger than the fingerprint database. For example, a system with ~1000 images has a 65 MB fingerprint file, an 800 MB neighborlist file, and a 4 GB derivatives file. Going to 50k images makes the neighborlist file far too large to be loaded and unloaded each time (e.g. the first 5-10 min of startup could be spent loading that file). Saving the fingerprint-derivatives can be avoided by turning off force training, but there's no way to disable saving of the neighborlists.
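As a rough back-of-the-envelope check, the sizes above can be extrapolated to 50k images. This assumes file size grows linearly with the number of images and that 1 GB = 1000 MB; neither was measured, and `extrapolate_size_mb` is just an illustrative helper.

```python
def extrapolate_size_mb(size_mb: float, ref_images: int,
                        target_images: int) -> float:
    """Scale a file size linearly with image count (assumed, not measured)."""
    return size_mb * target_images / ref_images


# Sizes reported above for ~1000 images, in MB.
sizes_mb = {
    "fingerprints": 65,
    "neighborlists": 800,
    "fingerprint-derivatives": 4000,
}

for name, size in sizes_mb.items():
    estimate_gb = extrapolate_size_mb(size, 1000, 50_000) / 1000
    print(f"{name}: ~{estimate_gb:.2f} GB at 50k images")
```

Under that assumption the neighborlist file alone would be around 40 GB at 50k images.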
-
I just looked back at an earlier instance of training. For 1500 periodic images of 4 species with the default cutoff of 6.5 Å, neighborlists is ~137 MB, fingerprints is ~144 MB, and fingerprint-derivatives is ~10 GB. So you may have either a large cutoff or a dense system, in which case it might be fine to decrease your cutoff.
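To get a feel for how much a smaller cutoff could help: for a roughly uniform density, the number of neighbors per atom grows with the volume of the cutoff sphere, i.e. with the cutoff cubed. The sketch below rests on that uniform-density assumption, and `neighbor_count_ratio` is a made-up helper name; real systems will deviate.

```python
def neighbor_count_ratio(old_cutoff: float, new_cutoff: float) -> float:
    """Approximate ratio of neighbors per atom after changing the cutoff.

    Assumes uniform density, so the count scales with the sphere volume
    (cutoff cubed).
    """
    return (new_cutoff / old_cutoff) ** 3


# Example: dropping from the default 6.5 to a hypothetical 5.0 cutoff.
ratio = neighbor_count_ratio(6.5, 5.0)
print(f"neighbors per atom shrink to ~{ratio:.0%} of the original")
```

So even a modest reduction in the cutoff can roughly halve the neighborlist size, if the accuracy of the fit allows it.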
-
repo owner - changed status to duplicate
Duplicate of #192.
Zack: Are you sure it was neighborlists? I suspect it was fingerprint-derivatives?