According to the paper, the candidate set should consist of all reactants in the entire USPTO database.
However, the function "check_molecule_dict()" in https://github.com/hankook/RetCL/blob/main/datasets/__init__.py seems to require something different:
    def check_molecule_dict(mol_dict, datasets):
        for split in ['train', 'val', 'test']:
            for rxn in datasets[split]:
                assert rxn.product in mol_dict
                for reactant in rxn.reactants:
                    assert reactant in mol_dict
This function seems to be quite important: the training and evaluation scripts cannot run unless this check passes.
Since it asserts that every product is in mol_dict, should the products also be included in the candidate set?
It would be great if you could make the code that builds the candidate set public.
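
For reference, here is a minimal sketch of what the check appears to require. It is not the authors' code: the `Reaction` tuple, the `build_molecule_dict()` helper, and the toy SMILES strings are all hypothetical, and I am only assuming that `mol_dict` is a container supporting `in` and that each reaction exposes `product` and `reactants`. It shows that a dictionary built from reactants alone would fail the assertion on products, while one built from reactants and products passes.

```python
from collections import namedtuple

# Hypothetical stand-in for the repository's reaction objects; only the
# `product` and `reactants` attributes used by check_molecule_dict are assumed.
Reaction = namedtuple("Reaction", ["product", "reactants"])


def build_molecule_dict(datasets):
    """Map every molecule (reactant or product) to an integer index."""
    mol_dict = {}
    for split in ["train", "val", "test"]:
        for rxn in datasets[split]:
            for mol in (rxn.product, *rxn.reactants):
                # Assign the next free index the first time a molecule is seen.
                mol_dict.setdefault(mol, len(mol_dict))
    return mol_dict


def check_molecule_dict(mol_dict, datasets):
    # Same logic as the function quoted from datasets/__init__.py.
    for split in ["train", "val", "test"]:
        for rxn in datasets[split]:
            assert rxn.product in mol_dict
            for reactant in rxn.reactants:
                assert reactant in mol_dict


# Toy data (made up for illustration, not taken from USPTO).
datasets = {
    "train": [Reaction("CC(=O)OCC", ["CC(=O)O", "CCO"])],
    "val": [Reaction("CC(=O)NC", ["CC(=O)O", "CN"])],
    "test": [],
}

mol_dict = build_molecule_dict(datasets)
check_molecule_dict(mol_dict, datasets)  # passes: products are included

# A reactants-only candidate set, as I understand the paper to describe it.
reactants_only = {
    mol for split in datasets.values() for rxn in split for mol in rxn.reactants
}
```

With the toy data above, the product "CC(=O)OCC" is missing from `reactants_only`, so passing that set to `check_molecule_dict()` would raise an AssertionError. This is why I suspect the released `mol_dict` contains products as well.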