Releases: choderalab/modelforge
v0.2.1
What's Changed
- Curate: different strategies for generating records with a set maximum number of configurations by @chrisiacovella in #375
- Enable defining a fixed test subset, while still randomizing train/val split by @chrisiacovella in #376
- Improving speed of CI testing by @chrisiacovella in #378
- fix minor bug in indexing in dipole moment by @chrisiacovella in #380
- Quadrupole moment loss by @chrisiacovella in #382
- Minor curation script revisions by @chrisiacovella in #385
- Dataset energy shifting by @chrisiacovella in #388
- epsilon value to norm function in aimnet2 by @chrisiacovella in #391
- Fetching from Zenodo updates by @chrisiacovella in #395
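The epsilon-in-norm change (#391) reflects a common numerical-stability fix: the gradient of a plain Euclidean norm is undefined at the zero vector, which can produce NaNs during training. A minimal sketch of the idea, using only the standard library (`safe_norm` is a hypothetical name; modelforge's actual implementation operates on tensors):

```python
import math

def safe_norm(v, eps=1e-8):
    # A bare sqrt(sum(x^2)) has an undefined derivative at the origin;
    # adding a small epsilon under the square root keeps gradients finite
    # even when v is the zero vector. (Illustrative sketch only.)
    return math.sqrt(sum(x * x for x in v) + eps)
```

The epsilon slightly biases the result (e.g., `safe_norm([3.0, 4.0])` is marginally above 5.0), which is why it is kept very small.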
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Some of the notable changes in this release include:
- Overhaul of dataset curation: A new API has been created that focuses on validation at the time of creation. The HDF5 format has also changed slightly, providing more information about each entry and making it easier to process/convert units when reading the files.
- Change to dataset class structure: Datasets are no longer defined in their own unique classes; the main HDF5Dataset class now reads all relevant parameters (self energies, available properties, etc.) from yaml files. These yaml files also provide considerable metadata about each dataset. This revision also allows users to define "local" datasets (i.e., datasets that are not included in modelforge or hosted on remote servers). Note: older HDF5 data files are no longer compatible with the dataset class.
- AimNet2 full implementation: Electrostatic interactions and DFTD3 contributions can now be computed for use with AimNet2. The core of the network has also been updated to provide increased flexibility in defining the number and size of the hidden layers in the MLP and output layers. Neighbor lists with multiple cutoffs (e.g., as needed for electrostatic interactions) are now supported.
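The multi-cutoff neighbor list mentioned above can be illustrated with a brute-force sketch: pairs are found once at the largest cutoff and then assigned to every cutoff they fall within, so a short cutoff (local interactions) and a longer one (electrostatics) share a single distance pass. This is illustrative only; the function name is hypothetical and modelforge's actual implementation is vectorized and handles batching:

```python
import itertools
import math

def neighbor_pairs_multi_cutoff(positions, cutoffs):
    # Brute-force O(N^2) sketch of a neighbor list supporting several
    # cutoffs at once. Distances are computed once against the largest
    # cutoff; each pair is then recorded under every cutoff it satisfies.
    r_max = max(cutoffs)
    pairs = {c: [] for c in cutoffs}
    for i, j in itertools.combinations(range(len(positions)), 2):
        d = math.dist(positions[i], positions[j])
        if d <= r_max:
            for c in cutoffs:
                if d <= c:
                    pairs[c].append((i, j))
    return pairs
```

For example, three collinear atoms at x = 0, 1, and 4 with cutoffs of 2.0 and 5.0 yield one pair at the short cutoff and all three pairs at the long one.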
PRs associated with this release:
- Adding additional versions of tmqm. by @chrisiacovella in #333
- Period group embedding by @chrisiacovella in #331
- Dataset filter for elements by @MarshallYan in #332
- Fix filtering bug by @chrisiacovella in #337
- Dataset api by @chrisiacovella in #334
- fixed bug in organic element limiting for tmqm by @chrisiacovella in #338
- Dataset api by @chrisiacovella in #340
- Extras for dataset api to make it easier to work with HDF5 datafiles by @chrisiacovella in #346
- Bug fix: dimers + unique_pairs_only by @chrisiacovella in #348
- Add in zbl potential by @chrisiacovella in #351
- Add more datasets (Fe II, T=200K and 300K sampled tmQM-xtb) by @chrisiacovella in #352
- Adding spin state embedding by @chrisiacovella in #354
- Dataset.py refactoring by @chrisiacovella in #358
- aimnet2 updates; additional loss options; multi cutoff neighborlists by @chrisiacovella in #361
- revise tmqm openff curation by @chrisiacovella in #365
- revising aimnet2 mlp by @chrisiacovella in #367
- revise aimnet2 output layer by @chrisiacovella in #370
- Rev curate class name by @chrisiacovella in #372
Full Changelog: v0.1.4...v0.2.0
v0.1.4
This release changes the dataset toml file structure, allowing users to modify which properties are loaded from the HDF5 datafiles and how they are used by the software. It also addresses several other issues (e.g., changing the matplotlib backend, tagging of loss calculations in wandb). These are all covered by a single PR: #327
v0.1.3
This is a quick patch of v0.1.2 to remove debug statements that were accidentally left in.
v0.1.2
This release provides some significant changes, including refactoring of neighbor lists, integration with OpenMM, addition of the tmQM dataset, and updates to the AimNet2 architecture.
Note: checkpoint files and state_dict files have changed since the merging of PR #299 (information about whether to use unique pairs is now part of these files). How to load/convert these into the newer format is covered in the documentation: https://modelforge.readthedocs.io/en/latest/inference.html#load-inference-potential-from-training-checkpoint
A mostly complete list of the main PRs since the last release:
- #322 -- Allows profilers to be toggled via control file
- #319 -- Adds tmQM dataset
- #316 -- Optimize "calculate_radial_contributions" function to reduce GPU memory usage
- #311 -- Adds in functionality to load checkpoint files directly from wandb
- #308 -- Adds epoch time logging
- #307 -- Update to AIMNet2 architecture for radial and vector embedding
- #304 -- Fix a bug in center of mass shifting (required for dipole moment calculation)
- #302 -- Adds regression plots and error histograms
- #301 -- Adding back in unit checking to HDF5 dataset loader, including unit conversion
- #300 -- Make dimensions consistent for predicted properties
- #299 -- OpenMM integration, including substantial refactoring of how neighbor lists are handled for inference.
- #296 -- Adds additional learning rate schedulers beyond step function reduction.
- #295 -- Change NNPInput structure to allow us to write TorchScript models
- #294 -- check ani2x against original implementation
- #289 -- Log the gradient norm of loss component and model parameters
- #288 -- Separate training and inference setup in NNP factory
- #287 -- Change input structure from named tuple to class with slots
- #285 -- Add function to generate inference model from training checkpoint file
- #283 -- rename variables in NNP factory class
- #278 -- refactor tests to work with pytest-xdist for parallel test execution
- #275 -- Refactor training neighbor lists
- #268 -- Optimize the training routine
- #263 -- Add function to visualize compute graph of models
- #259 -- Formulate PaiNN interactions in a clearer message passing way
- #257 -- refactor inference neighbor lists
v0.1.1
This release provides several improvements and bug fixes.
Major bug fixes:
- Related to calculating losses when training with forces: #239, #240, #243
- PhysNet interaction module: #236
Notable additions:
- AimNet2 added to the available NNPs: #253
- Additional PhAlkEthOH dataset versions, including a version that removes configurations with high energy: #245
- Support to enable multiple cutoffs for models: #238
- Routines for handling long-range electrostatics (following the PhysNet approach): #235
- Charge conservation scheme: #234
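The charge conservation scheme (#234) follows a standard idea (used, e.g., in PhysNet): predicted per-atom partial charges generally do not sum exactly to the molecule's total charge, so the excess is redistributed evenly across atoms. A minimal sketch; the function name is hypothetical and modelforge's actual implementation may weight the correction differently:

```python
def conserve_charge(partial_charges, total_charge=0.0):
    # Redistribute the deviation from the target total charge evenly
    # across all atoms so the corrected charges sum exactly to
    # total_charge. (Sketch of a PhysNet-style correction.)
    n = len(partial_charges)
    excess = sum(partial_charges) - total_charge
    return [q - excess / n for q in partial_charges]
```

After correction the charges sum to the requested total (up to floating-point error), while each atom's charge is shifted by the same amount, preserving the relative pattern the network predicted.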
v0.1.0
This is the initial release of the modelforge package.
This provides support for training several different neural network potentials, including SchNet and ANI2x (invariant architectures) and PaiNN, PhysNet, TensorNet, and SAKE (equivariant architectures), using several curated datasets (QM9, ANI1x, ANI2x, SPICE 1, SPICE 1 openff, SPICE 2, and PhAlkEthOH openff).