
Add Binary file support, update docker setup and small fixes #90

Open
RealUranar wants to merge 6 commits into andreagrisafi:master from RealUranar:master

Conversation

@RealUranar
Collaborator

This pull request adds a binary model format for deployment, makes the Docker build more reproducible and portable across HPC environments, and includes some internal code cleanups and refactoring. The most significant changes are grouped below by theme.

1. Binary Model Deployment and Documentation

  • Added documentation (binary_models.md) for the new binary .salted model format, covering usage, deployment, and the file structure and serialization scheme. This makes trained models easier to share and deploy.
  • Updated the documentation navigation (mkdocs.yaml) to include the new "Binary Deployment" and "Docker" sections, making these resources more discoverable.

2. Dockerfile and HPC/Cluster Support

  • Overhauled the Dockerfile to improve build reproducibility and compatibility with HPC clusters: switched to Open MPI 4.1.8 with PMI2/Slurm support, installed all dependencies up front, and built mpi4py and h5py against the container's MPI/HDF5 stack, so that the container can run parallel workloads on Slurm clusters.
  • Updated the Docker/Apptainer documentation (docker.md) to provide a clearer, more robust workflow for building and deploying the container image on HPC clusters, including explicit instructions for using Podman and Apptainer with the correct image format.
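
One way to confirm, from inside the built container, that mpi4py and h5py were linked against the intended MPI/HDF5 stack is a small Python check. This is only a sketch and not part of the PR: the helper name `check_parallel_stack` is hypothetical, and it relies on `MPI.Get_library_version()` from mpi4py and `h5py.get_config().mpi` from h5py. Missing packages are reported as `None` instead of raising.

```python
import importlib


def check_parallel_stack():
    """Report the MPI library mpi4py is linked to and whether h5py is MPI-enabled.

    Hypothetical helper for illustration; returns a dict and never raises
    when a package is simply not installed.
    """
    report = {"mpi_library": None, "h5py_mpi": None}

    try:
        mpi4py = importlib.import_module("mpi4py")
        # Skip automatic MPI_Init on import; the version query below is
        # allowed before initialization. Call MPI.Init() yourself if the
        # same process goes on to communicate.
        mpi4py.rc.initialize = False
        mpi = importlib.import_module("mpi4py.MPI")
        version = mpi.Get_library_version()
        # Keep only the first line of the vendor string
        report["mpi_library"] = version.splitlines()[0] if version else ""
    except ImportError:
        pass

    try:
        h5py = importlib.import_module("h5py")
        # True only when h5py was built against a parallel (MPI) HDF5
        report["h5py_mpi"] = bool(h5py.get_config().mpi)
    except ImportError:
        pass

    return report


if __name__ == "__main__":
    for key, value in check_parallel_stack().items():
        print(f"{key}: {value}")
```

With the stack described above, one would expect `mpi_library` to name Open MPI and `h5py_mpi` to be `True`; a value of `False` or `None` suggests the package was installed from a generic wheel rather than built against the container's libraries.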

3. Codebase Refactoring and Cleanup

  • Modified the read_system function in sys_utils.py to support reading embedded basis data from binary .salted models, enabling more flexible and portable model loading.
  • Removed the calculation and saving of the "projections" output in pyscf/dm2df.py, simplifying the data pipeline and storage requirements.
  • Added memory cleanup in sparsify_features.py by deleting large temporary arrays to improve memory efficiency.
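
The memory cleanup pattern mentioned in the last item can be sketched as follows. This is an illustrative example under assumed names and array sizes (`reduce_features` and its parameters are hypothetical), not the code in sparsify_features.py: a large temporary NumPy array is reduced to the part that is needed, and the temporary is then deleted explicitly.

```python
import gc

import numpy as np


def reduce_features(n_samples=200, n_features=64, n_keep=8, seed=0):
    """Build a large temporary array, keep a small part, and free the rest."""
    rng = np.random.default_rng(seed)
    full = rng.standard_normal((n_samples, n_features))  # large temporary

    # .copy() matters: a plain slice is a view that would keep `full` alive
    kept = full[:, :n_keep].copy()

    del full      # drop the only remaining reference to the temporary
    gc.collect()  # optional; only needed to break reference cycles promptly
    return kept
```

Inside a short function the temporary would be released on return anyway; the explicit `del` pays off in long scripts or loops where the large array would otherwise stay referenced while later, memory-hungry steps run.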

These changes collectively improve the usability, portability, and maintainability of the SALTED codebase, especially for deployment on HPC systems and for sharing pretrained models.

@andreagrisafi andreagrisafi requested review from andreagrisafi and zekunlou and removed request for andreagrisafi May 7, 2026 18:39