This repository contains benchmark structures and results from the recent OpenBind EV-A71 2A benchmark. Full data are available from OpenBind; the structures here were selected for benchmarking relative binding free energy (RBFE) workflows.
This repository contains two benchmark release folders:
- openbind_ev71_2a_pyrrolidine_benchmark_release/: focused 32-compound pyrrolidine-thiopyrimidine subset.
- openbind_ev71_2a_methylthio_pyrimidine_benchmark_release/: full 76-compound methylthio-pyrimidine benchmark set.
Each folder has the same structure.
Contains the prepared EV-A71 2A protease receptor used for docking/FEP setup.
ev71_2a_x7339a_template_prepared.pdb: prepared receptor template derived from the A71EV2A-x7339a holo structure.
Contains ligand pose files in SDF format.
- *_rowan_docked_poses.sdf: Rowan analogue-docked top-ranked poses, one pose per compound.
- *_xtal_poses.sdf: representative crystallographic poses, one pose per compound. Use this file to run FEP from crystal poses.
- *_xtal_all_crystal_instances.sdf: all available crystallographic ligand instances for the selected compounds, including duplicate crystal structures where present.
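As a quick sanity check on the one-pose-per-compound description, the records in a pose file can be counted without any cheminformatics dependency. The sketch below relies only on the SDF convention that each record starts with a title line and ends with a `$$$$` line. That the title line holds the compound ID is an assumption, and the records in `sample` are made-up placeholders, not benchmark data.

```python
from collections import Counter


def split_sdf_records(text):
    """Return one string per SDF record; records are terminated by a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # Skip empty fragments (e.g. stray delimiters or trailing blank lines).
            if any(part.strip() for part in current):
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def record_titles(text):
    """Title (first line) of each record; assumed here to be the compound ID."""
    return [record.splitlines()[0].strip() for record in split_sdf_records(text)]


def repeated_titles(text):
    """Titles that occur more than once, with their counts."""
    counts = Counter(record_titles(text))
    return {title: n for title, n in counts.items() if n > 1}


# Placeholder records (hypothetical IDs, empty molblocks), for illustration only.
empty_molblock = ["  placeholder", "", "  0  0  0  0  0  0  0  0  0  0999 V2000", "M  END", "$$$$"]
sample = "\n".join(
    ["cmpd-001"] + empty_molblock
    + ["cmpd-002"] + empty_molblock
    + ["cmpd-001"] + empty_molblock
) + "\n"

print(record_titles(sample))
print(repeated_titles(sample))
```

To apply this to a release file, read its text (for example with `pathlib.Path(...).read_text()`) and pass it in. If titles are indeed compound IDs, repeats would be unexpected in `*_xtal_poses.sdf` and expected in `*_xtal_all_crystal_instances.sdf`.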
Contains compound identity and affinity metadata.
- *_subset.csv: compound IDs, SMILES, experimental affinities, source crystal structures, and source ligand reference files.
- *_subset.smi: SMILES and compound IDs, suitable for docking or cheminformatics workflows.
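Because the `.smi` and `.csv` files describe the same compounds, a small consistency check can catch a mismatched pair before setting up calculations. This is a minimal sketch under two assumptions that should be verified against the actual files: each `.smi` line is `SMILES<whitespace>ID`, and the CSV has a header row with an ID column (called `compound_id` here, a hypothetical name). The inline contents are placeholders.

```python
import csv
import io


def parse_smi(text):
    """Map compound ID -> SMILES, assuming 'SMILES<whitespace>ID' on each line."""
    entries = {}
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2:  # ignore blank or single-field lines
            smiles, compound_id = fields
            entries[compound_id.strip()] = smiles
    return entries


def csv_ids(text, id_column):
    """Set of values in one named column of a CSV with a header row."""
    return {row[id_column] for row in csv.DictReader(io.StringIO(text))}


# Placeholder contents: the IDs, SMILES, and the 'compound_id' header are hypothetical.
smi_text = "CCO cmpd-001\nc1ccccc1 cmpd-002\n"
csv_text = "compound_id,smiles\ncmpd-001,CCO\ncmpd-002,c1ccccc1\ncmpd-003,CC\n"

smi = parse_smi(smi_text)
missing_from_smi = csv_ids(csv_text, "compound_id") - set(smi)
print(sorted(missing_from_smi))
```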
Contains Rowan FEP results (to date). These results are provided as a baseline and are not yet "good."
- rowan_results_per_compound_long.csv: one row per compound per Rowan run.
- rowan_results_per_compound_wide.csv: one row per compound, with prediction columns for each Rowan run.
- rowan_results_per_edge_long.csv: one row per graph edge per Rowan run.
- rowan_results_per_edge_wide.csv: one row per graph edge, with columns for each Rowan run.
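The long and wide files carry the same information in two layouts. The sketch below shows the relationship by pivoting a long table into one entry per compound with one value per run. The column names (`compound_id`, `run`, `prediction`), run names, and values are hypothetical placeholders, since the actual headers are not listed here.

```python
import csv
import io


def long_to_wide(long_rows, key_column, run_column, value_column):
    """Pivot long rows into {key: {run: value}}, i.e. one entry per key."""
    wide = {}
    for row in long_rows:
        wide.setdefault(row[key_column], {})[row[run_column]] = row[value_column]
    return wide


# Placeholder long-format table; headers, run names, and values are hypothetical.
long_text = (
    "compound_id,run,prediction\n"
    "cmpd-001,run_a,-7.1\n"
    "cmpd-001,run_b,-6.8\n"
    "cmpd-002,run_a,-8.0\n"
)
rows = list(csv.DictReader(io.StringIO(long_text)))
wide = long_to_wide(rows, "compound_id", "run", "prediction")
for compound_id, by_run in wide.items():
    print(compound_id, by_run)
```

The same function would apply to the per-edge files by using an edge identifier as the key. Values stay as strings here; convert them with `float` before doing arithmetic.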
Each release folder contains a self-contained benchmark subset with receptor coordinates, ligand poses, compound metadata, and Rowan RBFE outputs.
Prepared protein structure used as the receptor template for the benchmark.
ev71_2a_x7339a_template_prepared.pdb: prepared EV-A71 2A protease receptor, derived from the A71EV2A-x7339a holo structure.
Ligand pose files in SDF format.
- *_rowan_docked_poses.sdf: Rowan analogue-docked top-ranked pose for each compound.
- *_xtal_poses.sdf: one representative crystallographic ligand pose per compound. Use this file for FEP graphs starting from crystal poses.
- *_xtal_all_crystal_instances.sdf: all available crystallographic ligand instances for the selected compounds, including duplicate crystal structures where present.
Compound-level metadata for the release subset.
- *_subset.csv: compound IDs, SMILES, experimental affinity values, source crystal structure paths, and duplicate/crystal-instance mappings.
- *_subset.smi: SMILES and compound ID, suitable for docking or cheminformatics workflows.
Processed Rowan RBFE outputs.
- rowan_results_per_compound_long.csv: one row per compound per Rowan run.
- rowan_results_per_compound_wide.csv: one row per compound, with prediction columns for each Rowan run.
- rowan_results_per_edge_long.csv: one row per RBFE edge per Rowan run.
- rowan_results_per_edge_wide.csv: one row per RBFE edge, with columns for each Rowan run.
- rowan_results_overall_by_run.csv: aggregate performance metrics by run, including RMSE, MAE, rank correlations, and linear-fit statistics.
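The aggregate metrics named above have standard definitions, sketched here with the standard library only. Whether `rowan_results_overall_by_run.csv` was produced with exactly these conventions (for example, which rank correlation is reported or how ties are handled) is not stated, so treat this as an illustration rather than the pipeline's code. The numbers are made-up placeholders.

```python
import math


def rmse(pred, expt):
    """Root-mean-square error between predicted and experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))


def mae(pred, expt):
    """Mean absolute error between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder predicted and experimental values, for illustration only.
predicted = [1.0, 2.0, 3.0]
experimental = [1.0, 2.0, 5.0]
print(rmse(predicted, experimental), mae(predicted, experimental))
print(spearman(predicted, experimental))
```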
Code in this repository is licensed under the MIT license.
The original OpenBind crystallographic and binding-affinity data were obtained from Zenodo and are licensed under the CC0 license.
Corin Wagen, with assistance from GPT 5.5