OpenMM energy changes after Biotite standardizes atom order and converts back to OpenMM topology
Summary
I am using Biotite to standardize atom order:
PDB -> OpenMM Topology -> Biotite AtomArray -> standardize_order()
-> OpenMM Topology
The bond graph survives the round trip, but two OpenMM-relevant details change:
- OpenMM
Atom.id is not preserved.
ForceField.createSystem() generates different ordered PeriodicTorsionForce tuples from the round-tripped topology.
The unordered torsion atom sets are identical, but the ordered (a, b, c, d) tuples differ. This changes OpenMM torsion energy even when coordinates are reordered with standardized_idx.
Environment
Python 3.11.15
Linux x86_64
biotite 1.6.0
openmm 8.5.1
numpy 2.4.4
Minimal reproducer
This uses a complete OpenMM-compatible protein PDB with hydrogens.
from collections import Counter
import biotite.structure.info as info
import numpy as np
from biotite.interface import openmm as biotite_openmm
from openmm import PeriodicTorsionForce, Platform, VerletIntegrator, app, unit
from openmm.unit import kilojoule_per_mole
PDB_PATH = "complete_protein_with_hydrogens.pdb"
def build_system(topology):
ff = app.ForceField("amber14-all.xml", "implicit/obc2.xml")
return ff.createSystem(
topology,
constraints=app.HBonds,
nonbondedMethod=app.NoCutoff,
soluteDielectric=1.0,
solventDielectric=78.5,
)
def energy_kj(topology, coords_a3_nm):
system = build_system(topology)
simulation = app.Simulation(
topology,
system,
VerletIntegrator(0.002 * unit.picoseconds),
Platform.getPlatformByName("CPU"),
)
simulation.context.setPositions(
np.asarray(coords_a3_nm, dtype=np.float64) * unit.nanometer
)
state = simulation.context.getState(getEnergy=True)
return float(state.getPotentialEnergy().value_in_unit(kilojoule_per_mole))
def torsion_counts(topology, new_to_old=None, sort_atoms=False):
system = build_system(topology)
counts = Counter()
for i in range(system.getNumForces()):
force = system.getForce(i)
if not isinstance(force, PeriodicTorsionForce):
continue
for j in range(force.getNumTorsions()):
a, b, c, d, periodicity, phase, k = force.getTorsionParameters(j)
atoms = (a, b, c, d)
if new_to_old is not None:
atoms = tuple(new_to_old[x] for x in atoms)
if sort_atoms:
atoms = tuple(sorted(atoms))
counts[
(
*atoms,
int(periodicity),
round(float(phase.value_in_unit(unit.radian)), 12),
round(float(k.value_in_unit(unit.kilojoule_per_mole)), 12),
)
] += 1
return counts
pdb = app.PDBFile(PDB_PATH)
orig_top = pdb.topology
atom_array = biotite_openmm.from_topology(orig_top)
standardized_idx = info.standardize_order(atom_array).astype(np.int64, copy=False)
std_top = biotite_openmm.to_topology(atom_array[standardized_idx])
coords = np.asarray(pdb.positions.value_in_unit(unit.nanometer), dtype=np.float64)
orig_energy = energy_kj(orig_top, coords)
std_energy = energy_kj(std_top, coords[standardized_idx])
orig_atoms = list(orig_top.atoms())
std_atoms = list(std_top.atoms())
atom_id_mismatches = [
(new_i, old_i, orig_atoms[old_i].id, std_atoms[new_i].id)
for new_i, old_i in enumerate(standardized_idx.tolist())
if orig_atoms[old_i].id != std_atoms[new_i].id
]
new_to_old = {
new_i: int(old_i) for new_i, old_i in enumerate(standardized_idx.tolist())
}
orig_ordered = torsion_counts(orig_top)
std_ordered = torsion_counts(std_top, new_to_old=new_to_old)
orig_unordered = torsion_counts(orig_top, sort_atoms=True)
std_unordered = torsion_counts(std_top, new_to_old=new_to_old, sort_atoms=True)
print("counts original:", orig_top.getNumAtoms(), orig_top.getNumResidues(), orig_top.getNumChains(), orig_top.getNumBonds())
print("counts standardized:", std_top.getNumAtoms(), std_top.getNumResidues(), std_top.getNumChains(), std_top.getNumBonds())
print("Atom.id mismatches:", len(atom_id_mismatches))
print("energy delta kJ/mol:", std_energy - orig_energy)
print("torsion counts:", sum(orig_ordered.values()), sum(std_ordered.values()))
print("ordered torsion differences:", sum((orig_ordered - std_ordered).values()), sum((std_ordered - orig_ordered).values()))
print("unordered torsion differences:", sum((orig_unordered - std_unordered).values()), sum((std_unordered - orig_unordered).values()))
Observed on my structure:
counts original: 890 atoms, 57 residues, 1 chain, 899 bonds
counts standardized: 890 atoms, 57 residues, 1 chain, 899 bonds
Atom.id mismatches: 889
torsion counts: original=2959, standardized=2959
ordered torsion tuple differences: missing=670, extra=670
unordered torsion atom-set differences: missing=0, extra=0
Energy check with consistently reordered coordinates:
total energy delta: about -0.208 kJ/mol
torsion energy delta: about -0.207 kJ/mol
Bond and angle energies match to numerical precision.
Expected behavior
OpenMM force-field energy should be invariant under atom-order standardization
when the coordinates are reordered by the same index.
In other words, to_topology(from_topology(topology)[standardized_idx]) should
preserve enough OpenMM topology information that ForceField.createSystem()
generates an equivalent system after mapping standardized atom indices back to
the original atom indices.
OpenMM energy changes after Biotite standardizes atom order and converts back to OpenMM topology
Summary
I am using Biotite to standardize atom order:
The bond graph survives the round trip, but two OpenMM-relevant details change:
Atom.idis not preserved.ForceField.createSystem()generates different orderedPeriodicTorsionForcetuples from the round-tripped topology.The unordered torsion atom sets are identical, but the ordered
(a, b, c, d)tuples differ. This changes OpenMM torsion energy even when coordinates are reordered withstandardized_idx.Environment
Minimal reproducer
This uses a complete OpenMM-compatible protein PDB with hydrogens.
Observed on my structure:
Energy check with consistently reordered coordinates:
Bond and angle energies match to numerical precision.
Expected behavior
OpenMM force-field energy should be invariant under atom-order standardization
when the coordinates are reordered by the same index.
In other words,
to_topology(from_topology(topology)[standardized_idx])shouldpreserve enough OpenMM topology information that
ForceField.createSystem()generates an equivalent system after mapping standardized atom indices back to
the original atom indices.