
feat(client): modernize further and modularize #329

Open
HaoZeke wants to merge 12 commits into TheochemUI:main from HaoZeke:refactor/neb-modularize-and-integrate

Conversation


HaoZeke (Collaborator) commented Mar 17, 2026

Comprehensive modernization of the eOn C++ client:

  • NEB decomposed into 9 strategy-pattern modules (tangent, projection, spring, OCINEB controller, spline extrema, initial paths, objective function)
  • Parameters god object (2,175 lines) decomposed: NSDMI defaults, INI parsing extracted, vendored CIniFile replaced with inih, JSON serialization via nlohmann/json
  • Narrow parameter passing: Matter, Potential base, Optimizer (OptimizerConfig), Dynamics (DynamicsConfig) no longer require full Parameters object
  • ReplicaDynamicsJob base class for TAD + SafeHyper (~190 lines deduplicated)
  • All jobs use shared_ptr + ForceCallTimer RAII + std::ofstream (no raw new/delete)
  • Eigenmode methods use std::variant instead of abstract base class
  • 7 bug fixes (cellInverse, convergenceForce, confine_positive, maxEnergyImage, LJ cuttOffU, numExtrema, NEB write_movies)
  • 9 dead legacy functions removed from HelperFunctions
  • Test suite migrated from GoogleTest to Catch2 (16 tests)
  • Compiler flags: -O3, -flto=auto, optional -march=native and -ffast-math
  • Documentation: design docs, algorithm selection guide, expanded user guides with rgpycrumbs examples and atomistic-cookbook links
  • 30 towncrier changelog fragments

Implementation details:

  • OCINEB (Off-path Climbing Image NEB) is the published name from Goswami, Gunde, Jonsson (2026) arXiv:2601.12630
  • inih and nlohmann_json added as meson wraps (header-only, no new runtime deps)
  • Deprecated backward-compat constructors retained for Optimizer, Dynamics (transition period)
  • NEB still stores Parameters by value due to downstream pass-through to makeOptimizer/buildEigenmodeStrategy

HaoZeke changed the title from "Refactor/neb modularize and integrate" to "feat(client): modernize further and modularize" on Mar 17, 2026

github-actions bot commented Mar 17, 2026

eOn Documentation Preview

Download: documentation.zip

Unzip and open index.html to view.


github-actions bot commented Mar 18, 2026

Benchmark Results

**Warning:** 1 benchmark regressed

| | Count |
|---|---|
| 🔴 Regressed | 1 |
| 🟢 Improved | 2 |
| ⚪ Unchanged | 5 |

Regressions

| Benchmark | Before | After | Ratio |
|---|---|---|---|
| 🔴 bench_eonclient.TimePointMorsePt.time_point_evaluation | 6.05±0ms | 7.44±0ms | 1.23x |

Improvements

| Benchmark | Before | After | Ratio |
|---|---|---|---|
| 🟢 bench_eonclient.TimeMinimizationLJCluster.time_minimization_lbfgs | 36.5±0ms | 22.9±0ms | 0.63x |
| 🟢 bench_eonclient.TimeNEBMorsePt.time_neb | 536±0ms | 239±0ms | 0.45x |

5 unchanged benchmarks

| Benchmark | Before | After | Ratio |
|---|---|---|---|
| bench_eonclient.TimeMinimizationLJCluster.peakmem_minimization_lbfgs | 25M | 27M | ~1.08x |
| bench_eonclient.TimeNEBMorsePt.peakmem_neb | 25M | 27.2M | ~1.09x |
| bench_eonclient.TimePointMorsePt.peakmem_point_evaluation | 25M | 27.2M | ~1.09x |
| bench_eonclient.TimeSaddleSearchMorseDimer.peakmem_saddle_search_dimer | 25M | 27M | ~1.08x |
| bench_eonclient.TimeSaddleSearchMorseDimer.time_saddle_search_dimer | 58.4±0ms | 57.6±0ms | ~0.99x |

Details
  • Base: 31f6c775
  • Head: 17c380a7
  • Runner: ubuntu-22.04
Raw asv-spyglass output
All benchmarks:

| Change   | Before   | After    |   Ratio | Benchmark (Parameter)                                                               |
|----------|----------|----------|---------|-------------------------------------------------------------------------------------|
|          | 25M      | 27M      |    1.08 | bench_eonclient.TimeMinimizationLJCluster.peakmem_minimization_lbfgs [main -> pr]   |
| -        | 36.5±0ms | 22.9±0ms |    0.63 | bench_eonclient.TimeMinimizationLJCluster.time_minimization_lbfgs [main -> pr]      |
|          | 25M      | 27.2M    |    1.09 | bench_eonclient.TimeNEBMorsePt.peakmem_neb [main -> pr]                             |
| -        | 536±0ms  | 239±0ms  |    0.45 | bench_eonclient.TimeNEBMorsePt.time_neb [main -> pr]                                |
|          | 25M      | 27.2M    |    1.09 | bench_eonclient.TimePointMorsePt.peakmem_point_evaluation [main -> pr]              |
| +        | 6.05±0ms | 7.44±0ms |    1.23 | bench_eonclient.TimePointMorsePt.time_point_evaluation [main -> pr]                 |
|          | 25M      | 27M      |    1.08 | bench_eonclient.TimeSaddleSearchMorseDimer.peakmem_saddle_search_dimer [main -> pr] |
|          | 58.4±0ms | 57.6±0ms |    0.99 | bench_eonclient.TimeSaddleSearchMorseDimer.time_saddle_search_dimer [main -> pr]    |

HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 9 times, most recently from 61b1783 to 889ab41 on March 19, 2026 03:46
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch from cce6e6b to 6c81771 on March 20, 2026 00:12
HaoZeke added 7 commits March 22, 2026 18:06
Remove all `using namespace std;` from 60+ non-thirdparty files.
Qualify all bare abs/fabs/sqrt/exp/min/max/pow calls with the std:: prefix.
Replace C-style casts with static_cast, NULL with nullptr, (void) parameter
lists with (), flag variables with proper bool, and == true/false
comparisons with direct boolean tests.

Fix 15 latent bugs uncovered during modernization:
- DynamicsSaddleSearch snapshot aliasing (shared_ptr copy vs alias)
- BondBoost/LBFGS/Hessian bare abs() integer truncation on doubles
- Prefactor Hessian size check compared min1 vs saddle twice
- TestJob double-free on unique_ptr and use-after-free
- maxAtomMotionV out-of-bounds read on vectors < 3 elements
- LBFGS degenerate curvature abort (now resets memory)
- QSC delete vs delete[] UB, EAM per-loop allocation
- Matter operator= cache invalidation, convel extension bug
- Float precision in QQ-HTST prefactor constants

Performance: eliminate pow() with integer exponents in LJ, EAM,
IDPP, NEB spline (Horner's method), Water_Pt force loops. Replace
division with inverse-multiply in Matter getAccelerations. Vectorize
Matter getKineticEnergy. Pre-compute box reciprocals and r^2 cutoff.

RAII: replace new[]/delete[] with std::vector in Water_Pt and EMT.
Parallel replica dynamics in ReplicaExchangeJob via std::thread
with per-image potential instances.
…ork)

Remove INIFile.cpp/h (superseded by inih header-only subproject),
36 CMakeLists.txt files and Makefile/buildRules.mk (replaced by meson),
old unittests/ framework (ported to Catch2), LAMMPS fakempi.h
(no longer needed with runtime dlopen). ~2600 lines removed.
LammpsLoader singleton loads liblammps at runtime via dlopen/LoadLibrary.
Cross-platform DynLib.h abstraction (Linux/macOS/Windows). LAMMPS
potential always compiled; clear error if library not installed.
Removes with_lammps meson option and -DLAMMPS_POT compile flag.
Highway 1.2.0 via wrap file with cmake defines to disable tests/gtest.
Detected at configure time; sets -DWITH_HIGHWAY when found.
SIMD potential kernels planned but not yet implemented.
…sion outputs

Always compile LAMMPS (runtime dlopen). Add Highway cmake subproject
detection. Move bench_pot to meson benchmark(). Add ci-lammps, ci-xtb,
ci-mta pixi environments. Gitignore regression workdirs and snakemake
metadata.
… SVN coverage

New test files:
- NEBSubsystemTest (20 cases: tangent, spring, projection, DNEB, spline)
- HelperFunctionsTest (9 cases: random, gaussRandom, split_string_int)
- LammpsLoaderTest (2 cases: singleton, availability check)
- PotBenchmark (4 cases: Morse, LJ, PBC, kinetic energy timing)

Enhanced JobIntegrationTest with SVN-verified Si potential point tests
(SW, Tersoff, EDIP, Lenosky -- guarded by WITH_FORTRAN).
Enhanced OptimizerTest with explicit parameter initialization.
Changelog fragments: bugfix-bare-abs, bugfix-lbfgs-curvature,
bugfix-maxatommotionv, bugfix-prefactor-hessian, bugfix-snapshot-aliasing,
dead-code-removal, highway-simd, lammps-runtime, namespace-cleanup,
parallel-replica-exchange, perf-pow-elimination.

Dev beads: approval/fuzz testing, competitive benchmarks, EMT RAII,
Fortran modernization, Highway potential kernels, SafeHyper test,
SVN reference coverage.
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 9 times, most recently from 9a86170 to 6b23325 on March 23, 2026 11:38
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 6 times, most recently from 0409e83 to 1b3a16c on March 23, 2026 12:20
HaoZeke added 2 commits March 23, 2026 13:31
…ture

Core NEB refactoring: decompose into strategy pattern with variant-based
dispatch (tangent, spring, projection axes). OCINEB hybrid refinement.
SIDPP improvements (frontier convergence, reparameterization).
Per-image parallel potentials for NEB, Dimer, ImprovedDimer, ProcessSearch.

Parameter decomposition: 33 option-group structs with JSON serialization.
inih-based INI parsing replacing monolithic Parameters god object.

Quality infrastructure: Catch2 test framework, SVN regression reference
data, ASV benchmarks, Quill logging (replacing spdlog).

Test data: .con files for Morse Pt, LJ cluster, Si diamond, Al FCC,
Cu FCC, Fe BCC, and NEB Morse systems with SVN-verified reference values.
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 2 times, most recently from 7d87389 to eae9d0a on March 23, 2026 12:34
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 4 times, most recently from d2ed087 to 23e68a5 on March 24, 2026 03:16
Root cause: PyTorch's JIT profiler specializes the computation graph
after the first forward pass. A fresh MetatomicPotential instance
produces ULP-level different forces than one that has already processed
other inputs. In parallel NEB, per-image potentials (each freshly loaded
via makePotential) diverge from the sequential case (shared instance,
warmed after endpoint evaluation).

Reproduced with PET-MAD v1.1.0 on CUDA: neb_000.dat endpoint 9 differs
at 6th decimal between parallel=true and parallel=false. This accumulates
across NEB iterations causing trajectory divergence (different force call
counts and barrier heights).

Fix:
- Disable JIT profiling (torch::jit::getProfilingMode() = false) in
  MetatomicPotential constructor. All instances now use the unoptimized
  interpreter, producing bitwise-identical results regardless of call
  history.
- Only create per-image potentials for intermediate NEB images, not
  endpoints (which are evaluated once during initialization).

Verified: parallel=true and parallel=false produce identical neb.dat
and identical force call counts (419) on the hconh3_cation benchmark.
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch 8 times, most recently from 5ba7cd7 to d32ccf4 on March 24, 2026 11:57
Add Potential::forceBatch() virtual method for evaluating N systems in a
single call. Default implementation loops over force(). MetatomicPotential
overrides with native batching: single model.forward() for all N systems,
single backward() for gradients (each system has independent positions
tensor so gradients naturally separate).

NEB updateForces() uses batched path when pot->supportsBatchEvaluation().
Dimer and ImprovedDimer batch their 2-system evaluations similarly.

Eliminates per-image potential instances, JIT profiling divergence,
N model copies, and threading overhead. One model, one forward pass,
one backward pass.

Also adds Matter::forcesData() and Matter::setComputedPotential() for
external result injection by batched callers.
HaoZeke force-pushed the refactor/neb-modularize-and-integrate branch from d32ccf4 to 17c380a on March 24, 2026 12:06