xyzgraph is a Python toolkit for building molecular graphs (bond connectivity, bond orders, formal charges, and partial charges) directly from 3D atomic coordinates in XYZ format. It provides both cheminformatics-based and quantum chemistry-based (xTB) workflows.
- Key Features
- Installation
- Quick Start
- Methodology Overview
- Workflow Comparison
- CLI Reference
- Python API
- Visualization
- Non-Covalent Interactions
- Limitations & Future Work
- Examples
- References
- Contributing & Contact
- Distance-based initial bonding using consistent van der Waals radii across all elements from Charry and Tkatchenko [1]
- Four construction methods:
- Cheminformatics modes:
- --quick: Geometric detection only (no bond order assignment, formal charges, or aromatic detection)
- --optimizer: Full optimization with valence and charge minimisation
  - beam: optimization across multiple paths (slightly slower, default)
  - greedy: iterative valence adjustment
- Aromatic detection: Hückel 4n+2 rule for 5/6-membered rings, plus a fused-perimeter fallback for non-alternant systems (azulene, acenaphthylene). Optional --kekule to keep Kekulé bond orders
- Charge computation: Gasteiger (cheminf) or Mulliken (xTB/ORCA) partial charges
- Stereochemistry assignment: R/S, E/Z, axial (biaryl, allene, metallocene), planar (metallocene, paracyclophane), and helical (helicene) chirality from 3D geometry
- RDKit/xyz2mol comparison: validation against RDKit bond perception [3], [4]
- Non-covalent interaction (NCI) detection: 17 interaction types including hydrogen bonds, pi-stacking, halogen/chalcogen/pnictogen bonds, cation-pi, and more
- ASCII 2D depiction with layout alignment for method comparison (see also [5])
pip install xyzgraph
# or from source
git clone https://github.com/aligfellow/xyzgraph.git
cd xyzgraph
pip install .
# or simply
pip install git+https://github.com/aligfellow/xyzgraph.git

- Core: numpy, networkx, rdkit
- Optional: xTB binary (for --method xtb)
- Optional: xyz2mol_tm + scipy (for --compare-rdkit-tm)
To install xTB (Linux/macOS):
conda install -c conda-forge xtb # or download from GitHub releases

To install xyz2mol_tm (required for --compare-rdkit-tm):
pip install "xyzgraph[rdkit-tm]" xyz2mol_tm@git+https://github.com/jensengroup/xyz2mol_tm.git

This installs scipy (via the rdkit-tm extra) and xyz2mol_tm from source in one command. The extra step is necessary because xyz2mol_tm is not hosted on PyPI.
Minimal usage (auto-displays ASCII depiction):
xyzgraph molecule.xyz # constructs graph with cheminformatics style defaults
xyzgraph molecule.out # constructs graph from ORCA output

Specify charge and method:
xyzgraph molecule.xyz --method xtb --charge -1 --multiplicity 2

Detailed debug output:
xyzgraph molecule.xyz --debug

Compare with RDKit:
xyzgraph molecule.xyz --compare-rdkit

Compare with ORCA output:
# Compare XYZ (cheminf) vs ORCA bond orders
xyzgraph molecule.xyz --orca-out molecule.out
# Three-way comparison: cheminf vs ORCA vs RDKit
xyzgraph molecule.xyz --orca-out molecule.out --compare-rdkit

Detect non-covalent interactions:
xyzgraph molecule.xyz --nci --charge 1

Multi-frame trajectory files:
# Process specific frame from trajectory (0-indexed)
xyzgraph trajectory.xyz --frame 2
# Process all frames for quick topological overview
xyzgraph trajectory.xyz --all-frames

Basic usage:
from xyzgraph import build_graph, build_graph_rdkit, build_graph_orca
# Cheminformatics (default method)
G_cheminf = build_graph("molecule.xyz", charge=0)
# RDKit's DetermineBonds
G_rdkit = build_graph_rdkit("molecule.xyz", charge=0)
# ORCA output (Mayer bond orders)
G_orca = build_graph_orca("structure.out", bond_threshold=0.5)
# Print ASCII structure
from xyzgraph import graph_to_ascii
print(graph_to_ascii(G_cheminf, scale=3.0, include_h=False))

Multi-frame trajectory files:
from xyzgraph import read_xyz_file, build_graph
# Read specific frame from trajectory
atoms = read_xyz_file("trajectory.xyz", frame=2)
G = build_graph(atoms, charge=0)
# Process all frames
from xyzgraph import count_frames_and_atoms
num_frames, _ = count_frames_and_atoms("trajectory.xyz")
for i in range(num_frames):
    atoms = read_xyz_file("trajectory.xyz", frame=i)
    G = build_graph(atoms, charge=0)
    # ... analyze G

Non-covalent interaction detection:
from xyzgraph import build_graph
from xyzgraph.nci import detect_ncis
G = build_graph("molecule.xyz", charge=1)
ncis = detect_ncis(G)
for nci in ncis:
    print(nci.type, nci.site_a, nci.site_b, nci.geometry)

Comparing methods:
from xyzgraph import build_graph, build_graph_rdkit, compare_with_rdkit
# Build graphs
G_cheminf = build_graph("molecule.xyz", charge=-1)
G_rdkit = build_graph_rdkit("molecule.xyz", charge=-1)
# Compare (returns formatted report)
report = compare_with_rdkit(G_cheminf, G_rdkit, verbose=True, ascii=True)
print(report)

xyzgraph offers two distinct pathways for molecular graph construction:
- Cheminformatics Path (method='cheminf'):
  - Pure graph-based approach using chemical heuristics
  - No external quantum chemistry calls
  - Cached scoring, valence, edge and graph properties
  - Fast and suitable for both organic and inorganic molecules
- Quantum Chemistry Path (method='xtb'):
  - Uses GFN2-xTB (extended tight-binding) calculations [2]
  - Reads in Wiberg bond orders and Mulliken charges from output
  - Potentially more accurate for unusual bonding situations, though xTB itself may be less robust in these cases
  - Requires xTB binary installation
┌─────────────────────────────────────────────────────────────────┐
│ 1. Input Processing │
│ • Parse XYZ file internally │
│ • Load reference data (VDW radii, valences, electrons) │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 2. Initial Bond Graph (Two-Step Construction) │
│ │
│ Step 1: Baseline Bonds (DEFAULT thresholds) │
│ • Uses DEFAULT threshold parameters (threshold=1.0) │
│ • Builds reliable "core" connectivity │
│ • Bonds sorted by confidence: 1.0 (short) to 0.0 (at thresh) │
│ • High confidence (>0.4): added directly │
│ • Low confidence (≤0.4): geometric validation applied │
│ • Result: stable molecular scaffold │
│ • Compute rings via shortest-cycle-per-edge BFS (SSSR-like, │
│ chemically natural smallest rings; avoids 11-ring or │
│ larger artefacts from a plain cycle basis) │
│ │
│ Step 2: Extended Bonds (if using CUSTOM thresholds) │
│ • Sorted highest-confidence-first (most reliable first) │
│ • Additional bonds require geometric validation: │
│ - Acute angle check: 15° (metals) / 30° (non-metals) │
│ - Collinearity check: trans vs spurious detection │
│ - Existing ring diagonal rejection and 3-ring validation │
│ - Agostic bond filtering: H-M/F-M bonds rejected if │
│ stronger H-X or F-X bond exists (2x confidence ratio) │
│ - M-L priority check: diagonal M-ligand bonds in 3-rings │
│ rejected if stronger M-donor bond exists in ring (2x) │
│ • Allows sensible elongated bonds (e.g., TS structures) │
│ │
│ • Create graph with single bonds (order = 1.0) │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 3. Kekulé Initialization for Conjugated Rings │
│ • Find 5/6-membered planar rings with C/N/O/S/B/P/Se │
│ • Initialize alternating bond orders (5-ring: 2-1-2-1-1, │
│ 6-ring: 2-1-2-1-2-1) │
│ • Priority passes: Cp-like metal-bound 5-rings, heteroatom │
│ 5-rings, fused benzene blocks, isolated rings │
│ • Handle fused rings (naphthalene, anthracene) by │
│ propagating shared-edge patterns across components │
│ • Gives optimizer a good starting point; remaining locally │
│ inconsistent pairs are resolved later via Kekulé shifts │
│ • Broader atom set than aromatic detection (P, Se included) │
└────────────────────┬────────────────────────────────────────────┘
│
┌──────────┴─────────────┐
│ │
┌─────────▼────────────┐ ┌───────▼──────────────────────────────┐
│ 4a. Quick Mode │ │ 4b. Full Optimization │
│ • Lock metal bonds │ │ • Lock metal bonds at 1.0 │
│ • 3 iterations │ │ • Iterative BIDIRECTIONAL search: │
│ • Promote bonds │ │ - Test both +1 AND -1 changes │
│ where both atoms │ │ - Allows Kekulé structure swaps │
│ need increased │ │ • Score = f(valence_error, │
│ valence │ │ formal_charges, │
│ • Distance check │ │ electronegativity, │
│ │ │ conjugation deficit) │
│ │ │ • Optimizer choice: │
│ │ │ - Beam: parallel hypotheses │
│ │ │ - Greedy: single best change │
│ │ │ • Top-k edge candidate selection │
│ │ │ • Kekulé-shift fallback when single-│
│ │ │ edge moves stall: flips a whole │
│ │ │ alternating chain between two │
│ │ │ valence-deficient atoms at once │
│ │ │ (escapes fused-ring traps) │
└─────────┬────────────┘ └──────────┬───────────────────────────┘
└───────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 5. Aromatic Detection (Hückel 4n+2) │
│ • Per-ring: 5/6-membered rings with C/N/O/S/B │
│ - Count π electrons (sp² C → 1e, N/O/S LP → 2e, B → 0) │
│ - Lone pairs are counted in every ring an atom belongs to │
│ - Apply Hückel 4n+2; set ring bonds to 1.5 │
│ • Fused-perimeter fallback for non-alternant systems │
│ (azulene 5-7, acenaphthylene 5-6-6) where no individual │
│ ring is Hückel but the perimeter of the union is: │
│ promotes every bond inside the component to 1.5 │
│ • --kekule / kekule=True keeps Kekulé bond orders but still │
│ stores aromatic ring metadata │
│ • Other heteroatoms (e.g. P, Se) use Kekulé structures │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 6. Formal Charge Assignment │
│ • For each non-metal atom: │
│ - B = 2 × Σ(bond_orders, excluding metal coord) │
│ - target = min(8, 2 × V_electrons) │
│ (octet for C/N/O/F/..., sextet for B/Al/Ga — no │
│ spurious lone pair on trivalent boron) │
│ - L = max(0, target - B) │
│ - formal = V_electrons - L - B/2 │
│ • Balance total to match system charge │
│ • Metals assigned oxidation state as formal charge │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 7. Optional: Gasteiger Partial Charges │
│ • compute_gasteiger_charges(G, target_charge) │
│ • Convert bond orders to RDKit bond types │
│ • Compute Gasteiger charges │
│ • Adjust for total charge conservation │
│ • Aggregate H charges onto heavy atoms │
│ • Stored in G.nodes[i]["charges"]["gasteiger"] │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 8. Optional: Non-Covalent Interaction Detection (--nci) │
│ • Classify sites: donors, acceptors, ions, halogens, etc. │
│ • Detect pi-systems: aromatic rings + conjugated domains │
│ • Enumerate candidate pairs (graph-distance filtered) │
│ • Geometry checks: distances, angles, plane alignment │
│ • 17 interaction types (H-bond, pi-stack, sigma-hole, ...) │
│ • Stored in G.graph["ncis"] as list[NCIData] │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 9. Output Graph │
│ Nodes: symbol, formal_charge, valence, metal_valence, │
│ oxidation_state (metals only) │
│ Edges: bond_order, bond_type, metal_coord │
└─────────────────────────────────────────────────────────────────┘
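The ring perception step in the diagram above (shortest cycle per edge via BFS) can be sketched in a few lines. This is a minimal illustration on a plain adjacency dict, not xyzgraph's implementation; the function names `shortest_cycle_through_edge` and `find_rings` are hypothetical.

```python
from collections import deque

def shortest_cycle_through_edge(adj, u, v):
    """Return the smallest ring containing edge (u, v), or None if the edge is acyclic."""
    # BFS from u to v while forbidding the direct u-v edge itself.
    parent = {u: None}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if node == u and nbr == v:
                continue  # skip the edge that closes the ring
            if nbr in parent:
                continue
            parent[nbr] = node
            if nbr == v:
                # Walk back from v to u; the u-v edge closes the cycle.
                path = []
                while nbr is not None:
                    path.append(nbr)
                    nbr = parent[nbr]
                return path
            queue.append(nbr)
    return None

def find_rings(adj):
    """Collect the unique shortest cycles found for every edge."""
    rings = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                ring = shortest_cycle_through_edge(adj, u, v)
                if ring is not None:
                    rings[frozenset(ring)] = ring
    return list(rings.values())

# Naphthalene carbon skeleton: two 6-rings sharing the 4-5 edge.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
         (4, 6), (6, 7), (7, 8), (8, 9), (9, 5)]
adj = {i: set() for i in range(10)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

rings = find_rings(adj)
print(sorted(len(r) for r in rings))  # [6, 6]
```

Because every edge only ever reports its smallest enclosing cycle, the 10-membered perimeter of naphthalene never appears, which is the kind of artefact the diagram says a plain cycle basis can produce.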
┌─────────────────────────────────────────────────────────────────┐
│ 1. Input Processing │
│ • Parse XYZ file internally │
│ • Write XYZ to temporary directory │
│ • Set up xTB calculation parameters │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 2. Run xTB Calculation │
│ Command: xtb <file>.xyz --chrg <charge> --uhf <unpaired> │
│ • GFN2-xTB Hamiltonian │
│ • Single-point calculation │
│ • Wiberg bond order analysis │
│ • Mulliken population analysis │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 3. Parse xTB Output │
│ • Read wbo file (Wiberg bond orders) │
│ • Read charges file (Mulliken atomic charges) │
│ • Threshold: bond_order > 0.5 → create edge │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 4. Build Graph from xTB Data │
│ • Create nodes with Mulliken charges │
│ • Create edges with Wiberg bond orders │
│ • No further optimization needed │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 5. Cleanup (optional) │
│ • Remove temporary xTB files (unless --no-clean) │
└────────────────────┬────────────────────────────────────────────┘
│
┌────────────────────▼────────────────────────────────────────────┐
│ 6. Output Graph │
│ Nodes: symbol, charges{'mulliken': ...}, agg_charge, │
│ valence, metal_valence │
│ Edges: bond_order (Wiberg), bond_type, metal_coord │
└─────────────────────────────────────────────────────────────────┘
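Step 3 of the xTB workflow above (thresholding Wiberg bond orders into edges) can be illustrated as follows. This sketch assumes the wbo file holds one `i j order` triple per line with 1-indexed atoms; the helper name `parse_wbo` and the sample values are made up for illustration and are not taken from xyzgraph or from a real xTB run.

```python
def parse_wbo(text, threshold=0.5):
    """Parse 'i j order' lines (assumed layout, 1-indexed) into 0-indexed edges."""
    edges = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        i, j = int(parts[0]) - 1, int(parts[1]) - 1
        order = float(parts[2])
        if order > threshold:  # bond_order > 0.5 -> create edge
            edges[(min(i, j), max(i, j))] = order
    return edges

# Invented sample content, for illustration only.
sample = """\
1 2 1.9871
1 3 0.9412
2 4 0.0831
"""

edges = parse_wbo(sample)
print(edges)  # {(0, 1): 1.9871, (0, 2): 0.9412}
```

The weak 2-4 contact falls below the 0.5 cutoff and is dropped, so only the two genuine bonds become graph edges.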
| Feature | cheminf (quick) | cheminf (full) | xtb |
|---|---|---|---|
| Speed | Fastest | Fast | Moderate |
| Accuracy | Connectivity only | Very good across various systems | Only limited by xTB performance (QM-based) |
| External deps | None | None | Requires xTB binary |
| Bond orders | None (all bonds set to 1) | Optimized formal charge and valency | Wiberg (fractional) |
| Charges | None | Gasteiger | Mulliken |
| Metal complexes | Detection only | Reasonable | Reasonable (limited by xTB metal performance) |
| Conjugated systems | Not detected | Excellent | Excellent |
| Best for | Topology screening, pre-processing | Most cases | Awkward bonding, validation |
Use --method cheminf (default):
- Most use cases
- No xTB installation available
- Batch processing structures
Use --method cheminf --quick:
- Extremely large molecules or batch pre-processing
- When only connectivity (which atoms are bonded) is needed
Use --method xtb:
- Validation of cheminf results
- Unusual electronic structures
- Low confidence in bonding structure
Beam Search Optimizer (--optimizer beam default, --beam-width 5 default):
- Explores multiple optimization paths in parallel
- Maintains the top-k hypotheses (k = beam width) at each iteration, each expanded from the top candidate edges
- Bidirectional: tests both +1 and -1 bond order changes for each hypothesis
- Kekulé-shift fallback: when no single-edge move improves any beam, searches for an alternating-BO chain (1,2,1,2,…,1) between two valence-deficient atoms and flips the whole chain in one move. Escapes local-optimum traps in fused-ring systems where every single-edge change violates valence on a saturated neighbour (e.g. BN-doped PAHs, fused-ring carbanions). Only fires on stall so it can't pre-empt legitimate single-edge solutions (like promoting a double to a triple in benzyne)
- More robust against local minima
- Slower, but better convergence
- Best for robust bonding assignment across periodic table
Greedy Optimizer (--optimizer greedy):
- Tests all top candidate edges, picks single best change per iteration
- Bidirectional: tests both +1 and -1 bond order changes
- Fast and effective for most molecules
- Can get stuck in local minima (e.g. α,β-unsaturated systems)
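The difference between the two optimizers can be seen on a toy problem. The sketch below is an assumption-laden illustration, not xyzgraph's code: the score is only the summed valence error (the real score also includes formal charges, electronegativity and conjugation terms), and the edge order is chosen deliberately so that the greedy tie-break goes wrong. The molecule is the carbon skeleton of butadiene, with target valences counting only C-C bonds.

```python
def score(orders, targets):
    """Sum of |target valence - current valence| over atoms (toy scoring)."""
    valence = {a: 0 for a in targets}
    for (i, j), bo in orders.items():
        valence[i] += bo
        valence[j] += bo
    return sum(abs(targets[a] - valence[a]) for a in targets)

def neighbours(orders):
    """All states reachable by a single +1 or -1 bond order change."""
    for edge, bo in orders.items():
        for delta in (+1, -1):
            if 1 <= bo + delta <= 3:
                yield {**orders, edge: bo + delta}

def greedy(orders, targets, max_iter=50):
    """Apply the single best change per iteration until nothing improves."""
    for _ in range(max_iter):
        best = min(neighbours(orders), key=lambda s: score(s, targets))
        if score(best, targets) >= score(orders, targets):
            break
        orders = best
    return orders

def beam(orders, targets, width=5, max_iter=50):
    """Keep the `width` best states per iteration instead of just one."""
    states = [orders]
    for _ in range(max_iter):
        candidates = [n for s in states for n in neighbours(s)]
        candidates.sort(key=lambda s: score(s, targets))
        if not candidates or score(candidates[0], targets) >= score(states[0], targets):
            break
        states = candidates[:width]
    return states[0]

# CH2-CH-CH-CH2: C-C valence still needed once the hydrogens are accounted for.
targets = {0: 2, 1: 3, 2: 3, 3: 2}
start = {(1, 2): 1, (0, 1): 1, (2, 3): 1}  # central edge listed first on purpose

g = greedy(start, targets)
b = beam(start, targets)
print(score(g, targets), g)  # 2 {(1, 2): 2, (0, 1): 1, (2, 3): 1}
print(score(b, targets), b)  # 0 {(1, 2): 1, (0, 1): 2, (2, 3): 2}
```

Greedy promotes the central bond first and then cannot improve, since any further single change leaves the error at 2 or raises it. The beam keeps the alternative first moves alive and reaches the C=C-C=C assignment.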
> xyzgraph -h
usage: xyzgraph [-h] [--version] [--citation] [--method {cheminf,xtb}]
[--no-clean] [-c CHARGE] [-m MULTIPLICITY] [-q] [-k]
[--relaxed] [-t THRESHOLD] [-d] [-a] [--json]
[-as ASCII_SCALE] [--nci] [--stereo] [-H]
[--show-h-idx SHOW_H_IDX] [-b] [--frame FRAME] [--all-frames]
[--compare-rdkit] [--compare-rdkit-tm] [--orca-out ORCA_OUT]
[--orca-threshold ORCA_THRESHOLD] [-o {greedy,beam}]
[-bw BEAM_WIDTH] [--max-iter MAX_ITER]
[--edge-per-iter EDGE_PER_ITER] [--bond BOND]
[--unbond UNBOND] [--threshold-h-h THRESHOLD_H_H]
[--threshold-h-nonmetal THRESHOLD_H_NONMETAL]
[--threshold-h-metal THRESHOLD_H_METAL]
[--threshold-metal-ligand THRESHOLD_METAL_LIGAND]
[--threshold-sblock-ligand THRESHOLD_SBLOCK_LIGAND]
[--threshold-nonmetal THRESHOLD_NONMETAL]
[--allow-metal-metal-bonds]
[--threshold-metal-metal-self THRESHOLD_METAL_METAL_SELF]
[--period-scaling-h-bonds PERIOD_SCALING_H_BONDS]
[--period-scaling-nonmetal-bonds PERIOD_SCALING_NONMETAL_BONDS]
[--period-scaling-sblock-bonds PERIOD_SCALING_SBLOCK_BONDS]
[input_file]
Build molecular graph from XYZ or ORCA output.
positional arguments:
input_file Input file (XYZ or ORCA .out)
options:
-h, --help show this help message and exit
--version Print version and exit
--citation Print citation and exit
Common Options:
--method {cheminf,xtb}
Graph construction method (default: cheminf)
--no-clean Keep temporary xTB files (only for --method xtb)
-c, --charge CHARGE Total molecular charge (default: 0)
-m, --multiplicity MULTIPLICITY
Spin multiplicity (default: auto estimation)
-q, --quick Geometric detection only: skip bond order assignment,
formal charges, and aromatic detection
-k, --kekule Keep Kekule bond orders (do not convert aromatic rings
to 1.5)
--relaxed Relaxed geometric validation (for transition states)
-t, --threshold THRESHOLD
Global scaling for bond thresholds (default: 1.0)
Output Options:
-d, --debug Enable debug output
-a, --ascii Show 2D ASCII depiction
--json Output graph as JSON (for generating test fixtures)
-as, --ascii-scale ASCII_SCALE
ASCII scaling factor (default: 2.5)
--nci Detect and report non-covalent interactions
--stereo Assign and display stereochemistry labels
-H, --show-h Include hydrogens in visualizations
--show-h-idx SHOW_H_IDX
Show specific H atoms (comma-separated indices)
Input Options:
-b, --bohr XYZ file in Bohr units (default: Angstrom)
--frame FRAME Frame index for trajectory files, 0-indexed (default:
0)
--all-frames Process all frames in trajectory
Comparison Options:
--compare-rdkit Compare with RDKit graph
--compare-rdkit-tm Compare with RDKit xyz2mol_tm graph
--orca-out ORCA_OUT ORCA output file for comparison
--orca-threshold ORCA_THRESHOLD
Min Mayer bond order for ORCA (default: 0.25)
Optimizer Options:
-o, --optimizer {greedy,beam}
Algorithm (default: beam)
-bw, --beam-width BEAM_WIDTH
Beam width (default: 5)
--max-iter MAX_ITER Max iterations (default: 50)
--edge-per-iter EDGE_PER_ITER
Edges per iteration (default: 10)
Bond Constraints:
--bond BOND Force bonds (e.g., --bond 0,1 2,3)
--unbond UNBOND Prevent bonds (e.g., --unbond 0,1)
Advanced Thresholds:
--threshold-h-h THRESHOLD_H_H
H-H vdW threshold (default: 0.38)
--threshold-h-nonmetal THRESHOLD_H_NONMETAL
H-nonmetal vdW threshold (default: 0.42)
--threshold-h-metal THRESHOLD_H_METAL
H-metal vdW threshold (default: 0.45)
--threshold-metal-ligand THRESHOLD_METAL_LIGAND
D-block metal-ligand vdW threshold (default: 0.65)
--threshold-sblock-ligand THRESHOLD_SBLOCK_LIGAND
S-block metal-ligand vdW threshold (default: 0.55)
--threshold-nonmetal THRESHOLD_NONMETAL
Nonmetal-nonmetal vdW threshold (default: 0.55)
--allow-metal-metal-bonds
Allow metal-metal bonds (default: True)
--threshold-metal-metal-self THRESHOLD_METAL_METAL_SELF
Metal-metal vdW threshold (default: 0.7)
--period-scaling-h-bonds PERIOD_SCALING_H_BONDS
Period scaling for H bonds (default: 0.05)
--period-scaling-nonmetal-bonds PERIOD_SCALING_NONMETAL_BONDS
Period scaling for nonmetal bonds (default: 0.0)
--period-scaling-sblock-bonds PERIOD_SCALING_SBLOCK_BONDS
Period scaling for s-block M-L bonds (default: 0.05)
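To make the threshold options above concrete, the sketch below treats each threshold as a fraction of the summed van der Waals radii of the two atoms, which is an assumption about how the values are applied. It uses Bondi radii for illustration (xyzgraph uses the radii of ref. [1]), ignores period scaling, and the linear confidence formula is likewise only a guess at the "1.0 (short) to 0.0 (at threshold)" behaviour described in the methodology overview.

```python
import math

# Bondi vdW radii in Angstrom, for illustration only.
VDW = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

def pair_cutoff(a, b, t_h_h=0.38, t_h_nonmetal=0.42, t_nonmetal=0.55, scale=1.0):
    """Distance cutoff = global scale * pair threshold * (r_a + r_b) (assumed form)."""
    if a == "H" and b == "H":
        factor = t_h_h
    elif "H" in (a, b):
        factor = t_h_nonmetal
    else:
        factor = t_nonmetal
    return scale * factor * (VDW[a] + VDW[b])

def detect_bonds(atoms):
    """atoms: list of (symbol, (x, y, z)). Returns (i, j, confidence) tuples."""
    bonds = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (si, pi), (sj, pj) = atoms[i], atoms[j]
            d = math.dist(pi, pj)
            cutoff = pair_cutoff(si, sj)
            if d < cutoff:
                confidence = 1.0 - d / cutoff  # hypothetical linear confidence
                bonds.append((i, j, confidence))
    return bonds

water = [
    ("O", (0.00, 0.00, 0.00)),
    ("H", (0.96, 0.00, 0.00)),
    ("H", (-0.24, 0.93, 0.00)),
]
bonds = detect_bonds(water)
print([(i, j) for i, j, _ in bonds])  # [(0, 1), (0, 2)]
```

With these numbers the O-H cutoff is about 1.14 Å and the H-H cutoff about 0.91 Å, so both O-H bonds are found and the 1.52 Å H...H contact is not.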
Method comparison:
xyzgraph molecule.xyz --debug > cheminf.txt
xyzgraph molecule.xyz --method xtb --debug > xtb.txt
diff cheminf.txt xtb.txt

Validate against RDKit:
xyzgraph molecule.xyz --compare-rdkit

Direct graph construction:
from xyzgraph import build_graph, graph_debug_report
# Cheminf full optimization
G_full = build_graph(
    atoms='molecule.xyz',
    charge=0,
    max_iter=50,      # maximum iterations (normally converged <20)
    edge_per_iter=6,  # default 10
    bond=[(0, 1)],    # ensure a bond between 0 and 1
    debug=True,
)
# Keep Kekule bond orders (no 1.5 aromatic conversion)
G_kekule = build_graph('molecule.xyz', kekule=True)
# Aromatic rings are still detected and stored in G_kekule.graph["aromatic_rings"]
# Bond orders remain as optimised Kekule values (1.0/2.0)

Stereochemistry assignment (R/S, E/Z, axial, planar, helical):
from xyzgraph import annotate_stereo
stereo = annotate_stereo(G_full) # assigns all stereo types at once
# Also stored as G.graph["stereo"] for downstream use (e.g. rendering)
# Each key maps to a list of dicts with "label" and type-specific atom references:
stereo["point"]    # [{"atom": 5, "label": "R"}, ...] - R/S centres
stereo["ez"]       # [{"bond": [4,6], "label": "E"}, ...] - geometric isomerism
stereo["axial"]    # [{"atoms": [9,10], "label": "Rₐ"}, ...] - biaryl, allene, metallocene
stereo["planar"]   # [{"ring": [...], "label": "Rₚ"}, ...] - metallocene, paracyclophane
stereo["helical"]  # [{"atoms": [12,0], "label": "M"}, ...] - helicenes

Example structures for each stereo type are provided in examples/stereo/:
| Type | Examples | Label |
|---|---|---|
| Point (R/S) | mnh.xyz | R, S |
| E/Z | E_2butene.xyz, Z_2butene.xyz | E, Z |
| Axial (biaryl) | R_binol.xyz, S_binol.xyz | Rₐ, Sₐ |
| Axial (allene) | Ra_allene.xyz, Sa_allene.xyz | Rₐ, Sₐ |
| Axial (metallocene) | Ra_ferrocene_axial.xyz, Sa_ferrocene_axial.xyz | Rₐ, Sₐ |
| Axial (hindered biaryl) | Ra_hindered_biaryl.xyz, Sa_hindered_biaryl.xyz | Rₐ, Sₐ |
| Planar (metallocene) | Rp_ferrocene.xyz, Sp_ferrocene.xyz | Rₚ, Sₚ |
| Planar (paracyclophane) | 22paracyclophane.xyz, 22paracyclophane_F.xyz | None, Sₚ |
| Helical | M_helicene.xyz, P_helicene.xyz | M, P |
Conventions:
- R/S, E/Z, axial: standard CIP rules
- Planar (general): IUPAC pilot-atom convention (CW from pilot = Rₚ)
- Planar (metallocene): Schlögl convention (view from opposite the metal, CW = Rₚ). Note: this gives the opposite label from IUPAC CIP when the metal is the pilot atom
- Helical: IUPAC P/M helix convention
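As an illustration of how a label can be read off 3D geometry, the sketch below assigns E/Z from the dihedral angle between the higher-priority substituents on each end of a double bond. It assumes the CIP ranking has already been done (not shown), and the function names `dihedral` and `ez_label` are hypothetical rather than part of xyzgraph.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def ez_label(high_a, c_a, c_b, high_b):
    """Z if the higher-priority substituents sit on the same side, else E."""
    return "Z" if abs(dihedral(high_a, c_a, c_b, high_b)) < 90.0 else "E"

c_a, c_b = (0.0, 0.0, 0.0), (1.34, 0.0, 0.0)
print(ez_label((-0.7, 1.0, 0.0), c_a, c_b, (2.04, 1.0, 0.0)))   # Z
print(ez_label((-0.7, 1.0, 0.0), c_a, c_b, (2.04, -1.0, 0.0)))  # E
```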
Known limitations:
- Axial chirality requires sp2 junction atoms with at least one in an aromatic ring; non-conjugated restricted rotations are not detected
- Ortho steric gating (≥2 non-H ortho substituents) may be overly permissive for some aryl-vinyl or aryl-amide axes where free rotation is possible in practice
- Planar chirality in paracyclophanes with chemically different bridges (e.g. -CH₂CH₂- vs -CH₂O-) may not be detected on the unsubstituted deck
- Helical chirality requires fused aromatic rings; large aza-helicenes with ring detection issues may not be assigned
xyzgraph includes a built-in ASCII renderer for 2D molecular structures. This is heavily inspired by work elsewhere, e.g. [5] by Andrew White.
from xyzgraph import graph_to_ascii
# Basic rendering
ascii_art = graph_to_ascii(G, scale=3.0, include_h=False)
print(ascii_art)

Output example (acyl isothiouronium):
> xyzgraph examples/isothio.xyz -a
/C
/
///
C\
\\
\ \
\\
C\
//
//
O=======C
=========\
C---- \ /S\
// ---C \ / \\
// \ N---- /// \\ ----C\
C \ // ---C\ \C--- \
\ \ / \\ / \\\
\ C--- // \ / C
\ // ----C \\ / /
C--- // \ N\------C /
----C \ /// \\\ /
\ / \ ---C
C-------C/ \C----
//
C---- //
---C
\
\
\
C
Features:
- Single bonds: -, |, /, \
- Double bonds: =, ‖ (parallel lines)
- Triple bonds: #
- Aromatic: 1.5 bond orders shown as single
- Special edges: * (TS), . (NCI) if G.edges[i,j]['TS']=True or G.edges[i,j]['NCI']=True
Compare methods by aligning their ASCII depictions:
from xyzgraph import build_graph, graph_to_ascii
# Build with both methods
G_cheminf = build_graph(atoms, method='cheminf')
G_xtb = build_graph(atoms, method='xtb')
# Generate aligned depictions
ascii_ref, layout = graph_to_ascii(G_cheminf)
ascii_xtb = graph_to_ascii(G_xtb, reference_layout=layout)
print("Cheminf:\n", ascii_ref)
print("\nxTB:\n", ascii_xtb)

Tabular listing of all atoms and bonds:
from xyzgraph import graph_debug_report
report = graph_debug_report(G, include_h=False)
print(report)

Full example:
> xyzgraph benzene_NH4-cation-pi.xyz -c 1 -a -d
================================================================================
XYZGRAPH
Molecular Graph Construction from Cartesian Coordinates
A. S. Goodfellow, 2025
================================================================================
Version: xyzgraph v1.5.0
Citation: A. S. Goodfellow, xyzgraph: Molecular Graph Construction from
Cartesian Coordinates, v1.5.0, 2025,
https://github.com/aligfellow/xyzgraph.git.
Input: benzene_NH4-cation-pi.xyz
Parameters: charge=1
================================================================================
# Building cheminf graph from examples/benzene_NH4-cation-pi.xyz...
================================================================================
BUILDING GRAPH (CHEMINF, FULL MODE)
Atoms: 17, Charge: 1, Multiplicity: 1
================================================================================
Added 17 atoms
Chemical formula: C6H10N
Step 1: Found 16 baseline bonds (using default thresholds)
...
...
...
Step 1: 16 baseline bonds added, 0 rejected
Found 1 rings from initial bonding (excluding metal cycles)
Total bonds in graph: 16
Initial bonds: 16
================================================================================
KEKULE INITIALIZATION FOR AROMATIC RINGS
================================================================================
Ring 0 (6-membered): ['C0', 'C1', 'C2', 'C3', 'C4', 'C5']
π electrons estimate: 6
--------------------------------------------------------------------------------
Valid rings for Kekulé initialization:
[0]
✓ Initialized isolated 6-ring 0
--------------------------------------------------------------------------------
SUMMARY: Initialized 1 ring(s) with Kekulé pattern
--------------------------------------------------------------------------------
================================================================================
BEAM SEARCH OPTIMIZATION (width=5)
================================================================================
Initial score: 22.50
Iteration 1:
No improvements found in any beam, stopping
Applying best solution to graph...
--------------------------------------------------------------------------------
Explored 13 states across 1 iterations
Found 0 improvements
Score: 22.50 → 22.50
--------------------------------------------------------------------------------
================================================================================
FORMAL CHARGE CALCULATION
================================================================================
Initial formal charges:
Sum: +1 (target: +1)
Charged atoms:
N12: +1
No residual charge distribution needed (sum matches target)
================================================================================
AROMATIC RING DETECTION (Hückel 4n+2)
================================================================================
Ring 1 (6-membered): ['C0', 'C1', 'C2', 'C3', 'C4', 'C5']
π electrons: 6 (C0:1, C1:1, C2:1, C3:1, C4:1, C5:1)
✓ AROMATIC (4n+2 rule: n=1)
--------------------------------------------------------------------------------
SUMMARY: 1 aromatic rings, 6 bonds set to 1.5
--------------------------------------------------------------------------------
================================================================================
GRAPH CONSTRUCTION COMPLETE
================================================================================
Constructed graph with chemical formula: C6H10N
================================================================================
# CHEMINF GRAPH DETAILS
================================================================================
# Molecular Graph: 17 atoms, 16 bonds
# total_charge=1 multiplicity=1
# (C-H hydrogens hidden; heteroatom-bound hydrogens shown; valences still include all H)
# [idx] Sym val=.. metal=.. formal=.. | neighbors: idx(order / aromatic flag)
# (val = organic valence excluding metal bonds; metal = metal coordination bonds)
[ 0] C val=4.00 metal=0.00 formal=0 | 1(1.50*) 5(1.50*)
[ 1] C val=4.00 metal=0.00 formal=0 | 0(1.50*) 2(1.50*)
[ 2] C val=4.00 metal=0.00 formal=0 | 1(1.50*) 3(1.50*)
[ 3] C val=4.00 metal=0.00 formal=0 | 2(1.50*) 4(1.50*)
[ 4] C val=4.00 metal=0.00 formal=0 | 3(1.50*) 5(1.50*)
[ 5] C val=4.00 metal=0.00 formal=0 | 0(1.50*) 4(1.50*)
[ 12] N val=4.00 metal=0.00 formal=+1 | 13(1.00) 14(1.00) 15(1.00) 16(1.00)
[ 13] H val=1.00 metal=0.00 formal=0 | 12(1.00)
[ 14] H val=1.00 metal=0.00 formal=0 | 12(1.00)
[ 15] H val=1.00 metal=0.00 formal=0 | 12(1.00)
[ 16] H val=1.00 metal=0.00 formal=0 | 12(1.00)
# Bonds (i-j: order) (filtered)
[ 0- 1]: 1.50
[ 0- 5]: 1.50
[ 1- 2]: 1.50
[ 2- 3]: 1.50
[ 3- 4]: 1.50
[ 4- 5]: 1.50
[12-13]: 1.00
[12-14]: 1.00
[12-15]: 1.00
[12-16]: 1.00
================================================================================
# ASCII Depiction (cheminf)
================================================================================
-C------------------------C-
--- ---
---- ----
--- ---
C\ -C
\\ //
\\\ ///
\\\ ///
\\ //
\C------------------------C/
H
|
|
|
|
H------------------------N-------------------------H
|
|
|
|
H
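The aromatic detection block in the debug output above counts π electrons per ring and applies the 4n+2 test. A minimal sketch of that counting, using the contributions stated in the methodology overview (sp² C gives 1, a lone-pair N/O/S gives 2, B gives 0), is shown below. Treating a heteroatom that does not donate a lone pair (pyridine-type N) as a 1-electron contributor is an added assumption, and the function names are hypothetical.

```python
def pi_electrons(ring):
    """ring: list of (symbol, donates_lone_pair) tuples for the ring atoms."""
    total = 0
    for symbol, donates_lone_pair in ring:
        if symbol == "C":
            total += 1  # sp2 carbon contributes one electron
        elif symbol in ("N", "O", "S"):
            total += 2 if donates_lone_pair else 1  # the 1 is an assumption
        elif symbol == "B":
            total += 0  # empty p orbital
        else:
            raise ValueError(f"unsupported ring atom: {symbol}")
    return total

def is_huckel(n_electrons):
    """True if n_electrons = 4n + 2 for some integer n >= 0."""
    return n_electrons >= 2 and (n_electrons - 2) % 4 == 0

benzene = [("C", False)] * 6
pyrrole = [("C", False)] * 4 + [("N", True)]
cyclobutadiene = [("C", False)] * 4

for name, ring in [("benzene", benzene), ("pyrrole", pyrrole),
                   ("cyclobutadiene", cyclobutadiene)]:
    n = pi_electrons(ring)
    print(name, n, is_huckel(n))
# benzene 6 True
# pyrrole 6 True
# cyclobutadiene 4 False
```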
xyzgraph includes a geometry-based NCI detection module that identifies 17 types of non-covalent interactions from the molecular graph. NCI detection runs on top of the constructed graph and requires no additional dependencies.
xyzgraph molecule.xyz --nci --charge 1

| Type | Description |
|---|---|
| hbond | Classical hydrogen bond (D-H...A) |
| hbond_bifurcated | Two donors sharing the same acceptor |
| halogen_bond | Sigma-hole bond via halogen (X...A) |
| chalcogen_bond | Sigma-hole bond via S, Se, Te |
| pnictogen_bond | Sigma-hole bond via P, As, Sb, Bi |
| pi_pi_parallel | Parallel-displaced pi-stacking (ring-ring) |
| pi_pi_t_shaped | T-shaped (edge-to-face) pi-stacking |
| pi_pi_ring_domain | Pi-stacking between ring and non-ring pi domain |
| pi_pi_domain_domain | Pi-stacking between two non-ring pi domains |
| cation_pi | Cation above aromatic ring |
| anion_pi | Anion above aromatic ring |
| halogen_pi | Halogen sigma-hole to pi-system |
| ch_pi | C-H...pi interaction |
| hb_pi | H-bond donor to pi-system |
| cation_lp | Cation to lone pair donor |
| ionic | Electrostatic cation-anion |
| salt_bridge | H-mediated ionic (cation-H...anion) |
from xyzgraph import build_graph
from xyzgraph.nci import detect_ncis, NCIThresholds
G = build_graph("molecule.xyz", charge=1)
ncis = detect_ncis(G)
for nci in ncis:
    print(nci.type, nci.site_a, nci.site_b, nci.geometry)
# Results are also stored on the graph
ncis = G.graph["ncis"]
# Custom thresholds
thr = NCIThresholds(hb_da_max=3.0, pii_parallel_rmax=4.0)
ncis = detect_ncis(G, thresholds=thr)

For trajectory analysis where topology is shared across frames, use NCIAnalyzer to avoid repeating site classification on every frame:
from xyzgraph.nci import NCIAnalyzer
analyzer = NCIAnalyzer(G) # topology work done once
for positions in trajectory_frames:
    ncis = analyzer.detect(positions)  # geometry checks only

xyzgraph examples/isothio.xyz -c 1 --nci
================================================================================
# Non-Covalent Interactions
================================================================================
1 interaction(s) detected:
chalcogen_bond S20 ... O0
================================================================================
# ASCII Depiction (with NCI dotted lines)
================================================================================
C\
\\ =C
\\ =======\\
\C= === \\ =O
=== \\ =======.
C= === .
=== .
| . -C\
C---- | S---- ----- \\
/ ----C | / ---C- \\
// \ N-- // | \C
/ \ / --- / | |
C \ / --C | |
\ \ / \ | |
\ C---- / \\ | |
\ / ---C \ ---C\ /C
\ // \ N---- \\ ///
C---- / \ / \\ //
----C \ // \C/
\ /
C--------C
/
//
C--- /
----C
\
\
\
\
C
The intramolecular S...O chalcogen bond is detected via the sigma-hole on S20 directed towards the carbonyl oxygen O0, shown as a dotted line in the ASCII depiction.
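A sigma-hole contact of this kind is essentially a distance test plus a linearity test on the R-X...A angle. The sketch below shows that pattern in isolation. The cutoffs (3.3 Å, 150°) and the coordinates are hypothetical values chosen for illustration; they are not xyzgraph's defaults and are not taken from isothio.xyz.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c at vertex b, in degrees."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    cos = sum(p * q for p, q in zip(v1, v2)) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def is_sigma_hole_contact(r_pos, x_pos, acc_pos, max_dist=3.3, min_angle=150.0):
    """R-X...A: X...A within max_dist and R-X...A close to linear (hypothetical cutoffs)."""
    close = math.dist(x_pos, acc_pos) <= max_dist
    linear = angle_deg(r_pos, x_pos, acc_pos) >= min_angle
    return close and linear

carbon = (0.0, 0.0, 0.0)   # substituent R bonded to the chalcogen
sulfur = (1.8, 0.0, 0.0)   # sigma-hole donor X
oxygen_inline = (4.6, 0.3, 0.0)  # acceptor roughly opposite the C-S bond
oxygen_side = (1.8, 2.8, 0.0)    # acceptor perpendicular to the C-S bond

print(is_sigma_hole_contact(carbon, sulfur, oxygen_inline))  # True
print(is_sigma_hole_contact(carbon, sulfur, oxygen_side))    # False
```

Both oxygens sit about 2.8 Å from sulfur, so only the angle separates the directed contact from a side-on approach.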
- Metal Complexes
  - Bond orders locked at 1.0 (no d-orbital chemistry)
  - Metal-metal bonds partially supported but not well tested (single bond allowed)
  - Can deal with both ionic and neutral ligands
- Radicals & Open-Shell Systems
  - Unlikely to appropriately solve a valence structure
  - Not explicitly dealt with currently
  - May behave, may be unreliable
- Zwitterions
  - Formal charge and valence analysis does identify the -[N+](=O)(-[O-]) bonding and formal charge pattern
  - This is performed without pattern matching
  - May not always be fully robust, and does not account for delocalisation
- Charged Aromatics
  - Hückel electron counting is simplistic
  - Should still solve with valence/charge optimisation
- Inorganic Cages
  - Homogeneous clusters (≥8 atoms, same element) bypass standard ring validation
  - Unlikely to be fully accurately described, e.g. C/B cage structures
xyzgraph can directly compare its output to rdkit/xyz2mol [3], [4] or to rdkit/xyz2mol_tm [6], [7]:
xyzgraph molecule.xyz --compare-rdkit --debug
# or
xyzgraph molecule.xyz --compare-rdkit-tm --debug # integrates graph building from xyz2mol_tm
Output includes:
- Layout-aligned ASCII depictions
- Edge differences (bonds only in one method)
- Bond order differences (Δ ≥ 0.25)
Example:
# Bond differences: only_in_native=1 only_in_rdkit=0 bond_order_diffs=2
# only_in_native: 4-7
# bond_order_diffs (Δ≥0.25):
# 1-2 native=1.50 rdkit=1.00 Δ=+0.50
# 2-3 native=2.00 rdkit=1.50 Δ=+0.50
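The comparison report reduces to set differences on the bond lists plus a filter on bond-order differences. The sketch below reproduces the numbers in the example above from two plain dictionaries. The `compare_bonds` helper and the dictionary layout are hypothetical and are not part of the xyzgraph API; only the 0.25 cutoff comes from the text.

```python
def compare_bonds(native, other, min_delta=0.25):
    """Compare two {(i, j): bond_order} mappings.

    Returns bonds found only in `native`, bonds found only in `other`,
    and shared bonds whose orders differ by at least `min_delta`.
    """
    # Normalise keys so that (2, 1) and (1, 2) refer to the same bond
    native = {tuple(sorted(k)): v for k, v in native.items()}
    other = {tuple(sorted(k)): v for k, v in other.items()}

    only_native = sorted(set(native) - set(other))
    only_other = sorted(set(other) - set(native))

    order_diffs = []
    for bond in sorted(set(native) & set(other)):
        delta = native[bond] - other[bond]
        if abs(delta) >= min_delta:
            order_diffs.append((bond, native[bond], other[bond], delta))
    return only_native, only_other, order_diffs


# Illustrative bond orders matching the example report
native_bonds = {(1, 2): 1.5, (2, 3): 2.0, (3, 4): 1.0, (4, 7): 1.0}
rdkit_bonds = {(2, 1): 1.0, (2, 3): 1.5, (3, 4): 1.0}

only_native, only_rdkit, order_diffs = compare_bonds(native_bonds, rdkit_bonds)
print(f"only_in_native={len(only_native)} only_in_rdkit={len(only_rdkit)} "
      f"bond_order_diffs={len(order_diffs)}")
for (i, j), a, b, delta in order_diffs:
    print(f"{i}-{j} native={a:.2f} rdkit={b:.2f} Δ={delta:+.2f}")
```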
This section demonstrates xyzgraph's capabilities on real molecular systems, showcasing Kekulé initialization, aromatic detection, metal coordination analysis, and formal charge assignment.
This example demonstrates xyzgraph's handling of organometallic complexes with multiple ligand types.
System: [(η⁵-Cp)₂Fe][Mn(H)(CO)₂(PNN)] - Ferrocene cation with manganese hydride complex
File: examples/mnh.xyz (77 atoms)
Command:
xyzgraph examples/mnh.xyz --ascii --debug
Key Features:
- Detection of Cp⁻ (cyclopentadienyl) rings coordinated to Fe
- Metal coordination summary (Fe²⁺, Mn¹⁺) with ligand classification
- Hydride ligand (H⁻) recognition
- Carbonyl (CO) ligands with triple-bonded oxygen
- Aromatic Cp rings with charge contribution to π system
Output (truncated):
================================================================================
KEKULE INITIALIZATION FOR AROMATIC RINGS
================================================================================
Ring 0 (6-membered): ['N6', 'C52', 'C58', 'C57', 'C55', 'C53']
π electrons estimate: 6
Ring 1 (5-membered): ['C7', 'C8', 'C9', 'C11', 'C13']
✓ Detected Cp-like ring (all 5 C bonded to Fe0)
π electrons estimate: 6
Ring 2 (5-membered): ['C15', 'C17', 'C19', 'C21', 'C23']
✓ Detected Cp-like ring (all 5 C bonded to Fe0)
π electrons estimate: 6
Ring 3 (6-membered): ['C25', 'C26', 'C28', 'C30', 'C32', 'C34']
π electrons estimate: 6
Ring 4 (6-membered): ['C36', 'C37', 'C39', 'C41', 'C43', 'C45']
π electrons estimate: 6
--------------------------------------------------------------------------------
Valid rings for Kekulé initialization:
[0, 1, 2, 3, 4]
✓ Cp-like 5-ring 1 initialized (rotation 0)
✓ Cp-like 5-ring 2 initialized (rotation 0)
✓ Initialized isolated 6-ring 0
✓ Initialized isolated 6-ring 3
✓ Initialized isolated 6-ring 4
--------------------------------------------------------------------------------
SUMMARY: Initialized 5 ring(s) with Kekulé pattern
--------------------------------------------------------------------------------
================================================================================
BEAM SEARCH OPTIMIZATION (width=5)
================================================================================
Locked 16 metal bonds
Initial score: 392.70
Iteration 1:
Generated 2 candidates, keeping top 2
✓ New best: O3-C64 Δtotal = 81.00 score = 311.70
Iteration 2:
Generated 4 candidates, keeping top 4
✓ New best: O4-C65 Δtotal = 81.00 score = 230.70
Iteration 3:
Generated 6 candidates, keeping top 5
✓ New best: O3-C64 Δtotal = 20.00 score = 210.70
Iteration 4:
Generated 5 candidates, keeping top 5
✓ New best: O4-C65 Δtotal = 20.00 score = 190.70
Iteration 5:
No improvements (single or Kekulé shift) found, stopping
Applying best solution to graph...
--------------------------------------------------------------------------------
Explored 198 states across 5 iterations
Found 4 improvements
Score: 392.70 → 190.70
--------------------------------------------------------------------------------
================================================================================
FORMAL CHARGE CALCULATION
================================================================================
Initial formal charges:
Sum: -3 (target: +0)
Metal coordination summary:
[ 0] Fe oxidation_state=+2 coordination=10
• 5-ring (-1) [donor: C13]
• 5-ring (-1) [donor: C19]
[ 1] Mn oxidation_state=+1 coordination=6
• H (-1) [donor: H67]
• CO ( 0) [donor: C64]
• CO ( 0) [donor: C65]
• N ( 0) [donor: N6]
• P ( 0) [donor: P2]
• N ( 0) [donor: N5]
Metal complex detected:
Residual: +3 (represents metal oxidation states)
Fe0: formal_charge=+2
Mn1: formal_charge=+1
================================================================================
AROMATIC RING DETECTION (Hückel 4n+2)
================================================================================
Ring 1 (6-membered): ['N6', 'C52', 'C58', 'C57', 'C55', 'C53']
π electrons: 6 (N6:1, C52:1, C58:1, C57:1, C55:1, C53:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 2 (5-membered): ['C7', 'C8', 'C9', 'C11', 'C13']
π electrons: 6 (C7:2(fc=-1), C8:1, C9:1, C11:1, C13:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 3 (5-membered): ['C15', 'C17', 'C19', 'C21', 'C23']
π electrons: 6 (C15:2(fc=-1), C17:1, C19:1, C21:1, C23:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 4 (6-membered): ['C25', 'C26', 'C28', 'C30', 'C32', 'C34']
π electrons: 6 (C25:1, C26:1, C28:1, C30:1, C32:1, C34:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 5 (6-membered): ['C36', 'C37', 'C39', 'C41', 'C43', 'C45']
π electrons: 6 (C36:1, C37:1, C39:1, C41:1, C43:1, C45:1)
✓ AROMATIC (4n+2 rule: n=1)
--------------------------------------------------------------------------------
SUMMARY: 5 aromatic rings, 28 bonds set to 1.5
--------------------------------------------------------------------------------
================================================================================
GRAPH CONSTRUCTION COMPLETE
================================================================================
### Selected atoms from molecular graph:
[ 0] Fe val=10.00 metal=0.00 formal=+2 | 7(1.00) 8(1.00) 9(1.00) 11(1.00) 13(1.00) 15(1.00) 17(1.00) 19(1.00) 21(1.00) 23(1.00)
[ 1] Mn val=6.00 metal=0.00 formal=+1 | 2(1.00) 5(1.00) 6(1.00) 64(1.00) 65(1.00) 67(1.00)
[ 3] O val=3.00 metal=0.00 formal=+1 | 64(3.00)
[ 4] O val=3.00 metal=0.00 formal=+1 | 65(3.00)
[ 8] C val=4.00 metal=1.00 formal=-1 | 0(1.00) 7(1.50*) 9(1.50*) 47(1.00)
[ 23] C val=4.00 metal=1.00 formal=-1 | 0(1.00) 15(1.50*) 21(1.50*)
[ 64] C val=3.00 metal=1.00 formal=-1 | 1(1.00) 3(3.00)
[ 65] C val=3.00 metal=1.00 formal=-1 | 1(1.00) 4(3.00)
[ 67] H val=0.00 metal=1.00 formal=-1 | 1(1.00)
ASCII Depiction:
Tip
Avert your eyes... Not good for complex molecular visualisation...
C---------C
/ \
/ \ C--
/ \ // ----
/ \ // --C
/ C // |
C / C |
\ / | |
\ / | |
\ / | O |
\ / | # C
C--------C |# //
\ #C-- //
\ // ---- //
\\ /C H --C C---------C C
C---- \ // \ / / \ /
/ \ -----C --P \ / C#####/ \ /
C----\--- /\ ---- \\ \ / // /####O \ /
|\\\ \ --//----C-- \\ \ /// / \ /
/| \ \ / ---\| \\ // ----N C---------N
C----- \\\ /--- | Mn---- \ / \
| ----Fe--- | | \ / \
| ---- /|\\ ----C | \ / \
C-- /| \\---| | \ / \
\\ // |---- \\| | C---------C \
\\/---| --C\ | / C
C-\\| ---- \\\ --N- //
C-- \\ ---- | --- /
\C-- | --- /
| | -C
| |
| |
| H
|
C
Analysis:
- Ferrocene fragment: Fe(II) coordinated to two Cp⁻ ligands (η⁵ coordination)
- Cp rings: Detected as aromatic with 6 π electrons (includes -1 charge contribution from each ring)
- Manganese center: Mn(I) with octahedral-like coordination
- Hydride (H⁻) ligand correctly identified (formal charge -1)
- Two CO ligands with C≡O triple bonds (formal charges: C: -1, O: +1), net neutral ligand
- Phosphine (P) and amine (N) dative bond donors
- Charge balance: System is neutral (Fe(II) + Mn(I) - 2×Cp⁻ - H⁻ = 0)
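The charge bookkeeping reported in the log can be checked by hand. The sketch below redoes that arithmetic using the ligand charges and oxidation states printed in the metal coordination summary; the dictionaries are an illustrative layout only and do not mirror any xyzgraph data structure.

```python
# Ligand charges as listed in the "Metal coordination summary" of the log
ligand_charges = {
    "Cp ring (donor C13)": -1,
    "Cp ring (donor C19)": -1,
    "H67": -1,
    "CO (C64)": 0,
    "CO (C65)": 0,
    "N6": 0,
    "P2": 0,
    "N5": 0,
}

# Oxidation states assigned to the metals in the log
oxidation_states = {"Fe0": +2, "Mn1": +1}

target_charge = 0  # neutral system

# Sum of ligand formal charges: reported as "Sum: -3"
ligand_sum = sum(ligand_charges.values())

# What is left to reach the target charge: reported as "Residual: +3"
residual = target_charge - ligand_sum

# The residual is accounted for by the metal oxidation states
metals_sum = sum(oxidation_states.values())

print(ligand_sum, residual, metals_sum)
```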
This example shows aromatic detection, formal charge assignment, and handling of heteroaromatic systems.
System: Acyl isothiouronium cation (quaternary nitrogen)
File: examples/isothio.xyz (52 atoms, +1 charge)
Command:
xyzgraph examples/isothio.xyz --charge 1 --ascii --debug
Key Features:
- Benzene ring aromatic detection
- 5-membered heterocycle evaluation (thiazole-like ring)
- Formal charge on quaternary nitrogen (N⁺)
- Beam search optimization of carbonyl bond order
Output:
> xyzgraph examples/isothio.xyz -a -d -c 1
================================================================================
BUILDING GRAPH (CHEMINF, FULL MODE)
Atoms: 52, Charge: 1, Multiplicity: 1
================================================================================
Added 52 atoms
Chemical formula: C23H25N2OS
Step 1: Found 55 baseline bonds (using default thresholds)
Step 1: 55 baseline bonds added, 0 rejected
Found 4 rings from initial bonding (excluding metal cycles)
Total bonds in graph: 55
Initial bonds: 55
================================================================================
KEKULE INITIALIZATION FOR AROMATIC RINGS
================================================================================
Ring 0 (6-membered): ['N5', 'C6', 'C13', 'C17', 'N18', 'C19']
✗ Not planar
Ring 1 (6-membered): ['C7', 'C8', 'C9', 'C10', 'C11', 'C12']
π electrons estimate: 6
Ring 2 (5-membered): ['N18', 'C19', 'S20', 'C21', 'C26']
π electrons estimate: 7
✗ Hückel rule violated (π=7)
Ring 3 (6-membered): ['C21', 'C22', 'C23', 'C24', 'C25', 'C26']
π electrons estimate: 6
--------------------------------------------------------------------------------
Valid rings for Kekulé initialization:
[1, 3]
✓ Initialized isolated 6-ring 1
✓ Initialized isolated 6-ring 3
--------------------------------------------------------------------------------
SUMMARY: Initialized 2 ring(s) with Kekulé pattern
--------------------------------------------------------------------------------
================================================================================
BEAM SEARCH OPTIMIZATION (width=5)
================================================================================
Initial score: 489.00
Iteration 1:
Generated 5 candidates, keeping top 5
✓ New best: N18-C19 Δtotal = 116.50 score = 372.50
Iteration 2:
Generated 17 candidates, keeping top 5
✓ New best: C1-C2 Δtotal = 72.00 score = 300.50
Iteration 3:
Generated 3 candidates, keeping top 3
✓ New best: O0-C1 Δtotal = 71.00 score = 229.50
Iteration 4:
No improvements (single or Kekulé shift) found, stopping
Applying best solution to graph...
--------------------------------------------------------------------------------
Explored 152 states across 4 iterations
Found 3 improvements
Score: 489.00 → 229.50
--------------------------------------------------------------------------------
================================================================================
FORMAL CHARGE CALCULATION
================================================================================
Initial formal charges:
Sum: +1 (target: +1)
Charged atoms:
N18: +1
No residual charge distribution needed (sum matches target)
================================================================================
AROMATIC RING DETECTION (Hückel 4n+2)
================================================================================
Ring 1 (6-membered): ['N5', 'C6', 'C13', 'C17', 'N18', 'C19']
✗ Not planar, skipping aromaticity check
Ring 2 (6-membered): ['C7', 'C8', 'C9', 'C10', 'C11', 'C12']
π electrons: 6 (C7:1, C8:1, C9:1, C10:1, C11:1, C12:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 3 (5-membered): ['N18', 'C19', 'S20', 'C21', 'C26']
π electrons: 6 (N18:1(fc=+1), C19:1, S20:2(LP), C21:1, C26:1)
✓ AROMATIC (4n+2 rule: n=1)
Ring 4 (6-membered): ['C21', 'C22', 'C23', 'C24', 'C25', 'C26']
π electrons: 6 (C21:1, C22:1, C23:1, C24:1, C25:1, C26:1)
✓ AROMATIC (4n+2 rule: n=1)
--------------------------------------------------------------------------------
SUMMARY: 3 aromatic rings, 16 bonds set to 1.5
--------------------------------------------------------------------------------
================================================================================
GRAPH CONSTRUCTION COMPLETE
================================================================================
Constructed graph with chemical formula: C23H25N2OS
================================================================================
# CHEMINF GRAPH DETAILS
================================================================================
# Molecular Graph: 52 atoms, 55 bonds
# total_charge=1 multiplicity=1
# (C-H hydrogens hidden; heteroatom-bound hydrogens shown; valences still include all H)
# [idx] Sym val=.. metal=.. formal=.. | neighbors: idx(order / aromatic flag)
# (val = organic valence excluding metal bonds; metal = metal coordination bonds)
[ 0] O val=2.00 metal=0.00 formal=0 | 1(2.00)
[ 1] C val=4.00 metal=0.00 formal=0 | 0(2.00) 2(1.00) 5(1.00)
[ 2] C val=4.00 metal=0.00 formal=0 | 1(1.00) 3(2.00)
[ 3] C val=4.00 metal=0.00 formal=0 | 2(2.00) 4(1.00)
[ 4] C val=4.00 metal=0.00 formal=0 | 3(1.00)
[ 5] N val=3.00 metal=0.00 formal=0 | 1(1.00) 6(1.00) 19(1.00)
[ 6] C val=4.00 metal=0.00 formal=0 | 5(1.00) 7(1.00) 13(1.00)
[ 7] C val=4.00 metal=0.00 formal=0 | 6(1.00) 8(1.50*) 12(1.50*)
[ 8] C val=4.00 metal=0.00 formal=0 | 7(1.50*) 9(1.50*)
[ 9] C val=4.00 metal=0.00 formal=0 | 8(1.50*) 10(1.50*)
[ 10] C val=4.00 metal=0.00 formal=0 | 9(1.50*) 11(1.50*)
[ 11] C val=4.00 metal=0.00 formal=0 | 10(1.50*) 12(1.50*)
[ 12] C val=4.00 metal=0.00 formal=0 | 7(1.50*) 11(1.50*)
[ 13] C val=4.00 metal=0.00 formal=0 | 6(1.00) 14(1.00) 17(1.00)
[ 14] C val=4.00 metal=0.00 formal=0 | 13(1.00) 15(1.00) 16(1.00)
[ 15] C val=4.00 metal=0.00 formal=0 | 14(1.00)
[ 16] C val=4.00 metal=0.00 formal=0 | 14(1.00)
[ 17] C val=4.00 metal=0.00 formal=0 | 13(1.00) 18(1.00)
[ 18] N val=4.00 metal=0.00 formal=+1 | 17(1.00) 19(1.50*) 26(1.50*)
[ 19] C val=4.00 metal=0.00 formal=0 | 5(1.00) 18(1.50*) 20(1.50*)
[ 20] S val=3.00 metal=0.00 formal=0 | 19(1.50*) 21(1.50*)
[ 21] C val=4.50 metal=0.00 formal=0 | 20(1.50*) 22(1.50*) 26(1.50*)
[ 22] C val=4.00 metal=0.00 formal=0 | 21(1.50*) 23(1.50*)
[ 23] C val=4.00 metal=0.00 formal=0 | 22(1.50*) 24(1.50*)
[ 24] C val=4.00 metal=0.00 formal=0 | 23(1.50*) 25(1.50*)
[ 25] C val=4.00 metal=0.00 formal=0 | 24(1.50*) 26(1.50*)
[ 26] C val=4.50 metal=0.00 formal=0 | 18(1.50*) 21(1.50*) 25(1.50*)
# Bonds (i-j: order) (filtered)
[ 0- 1]: 2.00
[ 1- 2]: 1.00
[ 1- 5]: 1.00
[ 2- 3]: 2.00
[ 3- 4]: 1.00
[ 5- 6]: 1.00
[ 5-19]: 1.00
[ 6- 7]: 1.00
[ 6-13]: 1.00
[ 7- 8]: 1.50
[ 7-12]: 1.50
[ 8- 9]: 1.50
[ 9-10]: 1.50
[10-11]: 1.50
[11-12]: 1.50
[13-14]: 1.00
[13-17]: 1.00
[14-15]: 1.00
[14-16]: 1.00
[17-18]: 1.00
[18-19]: 1.50
[18-26]: 1.50
[19-20]: 1.50
[20-21]: 1.50
[21-22]: 1.50
[21-26]: 1.50
[22-23]: 1.50
[23-24]: 1.50
[24-25]: 1.50
[25-26]: 1.50
================================================================================
# ASCII Depiction (cheminf)
================================================================================
/C
/
///
C\
\\
\ \
\\
C\
//
//
O=======C
=========\
C---- \ /S\
// ---C \ / \\
// \ N---- /// \\ ----C\
C \ // ---C \C--- \
\ \ / \ / \\\
\ C--- // \ / C
\ // ----C \ / /
C--- // \ N-------C /
----C \ /// \\\ /
\ / \ ---C
C-------C/ \C----
//
C---- //
---C
\
\
\
C
Analysis:
- Benzene rings: Two rings correctly identified as aromatic (bond order 1.5)
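- 5-membered heterocycle: The N-C-S-C-C ring is rejected at Kekulé initialization (π estimate of 7), but after optimization it is detected as aromatic with 6 π electrons (including the S lone pair), so its bonds are also set to 1.5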
- Quaternary nitrogen: N18 assigned +1 formal charge (valence 4, no lone pair)
- α,β-unsaturated carbonyl: O=C and C=C double bonds correctly optimized
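The aromaticity decisions in the debug output above follow the Hückel 4n+2 count. The sketch below applies that rule to the per-atom π contributions printed in the log; `huckel_n` is a hypothetical helper written for illustration and is not an xyzgraph function.

```python
def huckel_n(pi_electrons):
    """Return n if pi_electrons == 4n + 2 for an integer n >= 0, else None."""
    if pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0:
        return (pi_electrons - 2) // 4
    return None


# Per-atom contributions for the 5-membered ring, copied from the final
# aromatic detection step (S20 contributes its lone pair)
ring_contributions = {"N18": 1, "C19": 1, "S20": 2, "C21": 1, "C26": 1}
final_count = sum(ring_contributions.values())

# The Kekulé initialization step estimated 7 π electrons for the same ring
initial_estimate = 7

final_n = huckel_n(final_count)
initial_n = huckel_n(initial_estimate)
print(final_count, final_n, initial_n)
```

A count of 7 fits no integer n, which is why the ring is skipped at initialization, while the final count of 6 satisfies the rule with n=1.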
xyzgraph uses distance-based bond detection with thresholds derived from van der Waals (vdW) radii by Charry and Tkatchenko [1]. By default, these thresholds are calibrated for different atom pair types:
| Atom Pair Type | Default Threshold | Parameter Name |
|---|---|---|
| H-H | 0.38 × (r₁ + r₂) | threshold_h_h |
| H-nonmetal | 0.42 × (r₁ + r₂) | threshold_h_nonmetal |
| H-metal | 0.45 × (r₁ + r₂) | threshold_h_metal |
| S-block metal-ligand | 0.55 × (r₁ + r₂) | threshold_sblock_ligand |
| D-block metal-ligand | 0.65 × (r₁ + r₂) | threshold_metal_ligand |
| Nonmetal-nonmetal | 0.55 × (r₁ + r₂) | threshold_nonmetal_nonmetal |
| Metal-Metal (same type) | 0.7 × (2r) | threshold_metal_metal_self |
Where r₁ and r₂ are the vdW radii of the two atoms. Period-dependent scaling is applied for bonds involving H (period_scaling_h_bonds=0.05) and s-block metal-ligand bonds (period_scaling_sblock_bonds=0.05) to account for heavier elements bonding at longer distances.
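As a rough illustration of how a table entry turns into a distance cutoff, the sketch below multiplies a pair factor by the sum of the two radii and by the global threshold. The radii are placeholders, not the Charry and Tkatchenko values; the `bond_cutoff` and `is_bonded` helpers are hypothetical; the period-dependent scaling is left out because its exact form is not given here; and treating "distance at or below cutoff" as bonded is an assumption.

```python
import math

# Pair factors copied from the table above
PAIR_FACTORS = {
    "threshold_h_h": 0.38,
    "threshold_h_nonmetal": 0.42,
    "threshold_nonmetal_nonmetal": 0.55,
}


def bond_cutoff(r1, r2, pair_factor, threshold=1.0):
    """Cutoff distance = pair factor x (r1 + r2) x global threshold."""
    return pair_factor * (r1 + r2) * threshold


def is_bonded(distance, r1, r2, pair_factor, threshold=1.0):
    """Assumed rule: atoms are bonded if they sit at or inside the cutoff."""
    return distance <= bond_cutoff(r1, r2, pair_factor, threshold)


# Placeholder radii in Å (illustrative only)
r_a, r_b = 1.8, 1.8
factor = PAIR_FACTORS["threshold_nonmetal_nonmetal"]

default_cutoff = bond_cutoff(r_a, r_b, factor)                 # threshold 1.0
scaled_cutoff = bond_cutoff(r_a, r_b, factor, threshold=1.2)   # elongated bonds

# A 2.2 Å contact is rejected by default but accepted at threshold 1.2
rejected_by_default = not is_bonded(2.2, r_a, r_b, factor)
accepted_when_scaled = is_bonded(2.2, r_a, r_b, factor, threshold=1.2)
print(round(default_cutoff, 3), round(scaled_cutoff, 3))
```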
The two-step construction allows detection of elongated bonds in transition state structures by adjusting the global threshold:
# Detect elongated bonds in TS structures
xyzgraph ts_structure.xyz --threshold 1.2 --debug
# For more dense connectivity, one can use relaxed mode (more permissive geometric validation)
xyzgraph structure.xyz --threshold 1.2 --relaxed --debug
Recommended threshold ranges:
- 1.0 (default): Ground-state structures
- 1.1-1.2: Slightly elongated bonds
- 1.2-1.3: Transition states with stretched geometries
- ≥1.35: Unstable - spurious bonding likely
The two-step construction with geometric validation helps reject spurious diagonals even at higher thresholds. The --relaxed flag can be used for more permissive angle and diagonal thresholds (but note: this is likely to produce spurious structures).
Example workflow: See vib_analysis for a complete workflow analyzing transition state vibrational modes using xyzgraph connectivity.
Global Scaling:
- The --threshold (or threshold in Python) parameter provides a simple way to globally scale all thresholds.
- This is safer than modifying individual thresholds.
- e.g. with --threshold 1.1, the H-nonmetal cutoff becomes threshold_h_nonmetal × (r₁ + r₂) × 1.1
Individual Scaling:
These parameters are exposed for users who need to:
- Handle unusual bonding situations not covered by defaults
- Specifically wish to obtain dense connectivity
- Fine-tune bond detection for specific molecular systems
- Debug or validate bond detection behavior
These can be set from the CLI, e.g. --threshold_h_nonmetal 0.5, or directly in Python via build_graph(threshold_h_nonmetal=0.5).
Warning
Modifying these thresholds is not recommended unless you have a specific reason and understand the implications
Changing values can produce chemically invalid structures
1. van der Waals Radii: Jorge Charry and Alexandre Tkatchenko, J. Chem. Theory Comput., 2024, 20, 7469–7478. DOI.
2. xTB (Extended Tight Binding): Christoph Bannwarth, Sebastian Ehlert, and Stefan Grimme, J. Chem. Theory Comput., 2019, 15, 1652–1671. DOI. Repo.
3. xyz2mol: Jan Jensen et al., xyz2mol. Now integrated into RDKit as Chem.rdDetermineBonds.DetermineBonds(). See also Y. Kim, W. Y. Kim, Bull. Korean Chem. Soc., 2015, 36, 1769–1777.
4. RDKit: Open-source cheminformatics. https://www.rdkit.org. Repo.
5. moltext: A. White, moltext. Repo.
6. xyz2mol_tm: Jan Jensen et al., xyz2mol_tm. See also ref 7.
7. SMILES all around: structure to SMILES conversion for transition metal complexes: Maria H. Rasmussen, Magnus Strandgaard, Julius Seumer, Laura K. Hemmingsen, Angelo Frei, David Balcells and Jan H. Jensen, Journal of Cheminformatics, 2025, 17. DOI.
- James O'Brien (@JamesOBrien2) — stereochemistry detection
Contributions welcome! Please open an issue or pull request and get in touch with any questions here.
To develop with xyzgraph, you can clone the repo and use just and uv to set up the dev environment:
just setup
The CLI can be used with:
uv run xyzgraph filename.xyz
Run the checks using:
just check