26 changes: 14 additions & 12 deletions README.md
@@ -48,11 +48,11 @@ To add MMseqs2 MSAs and templates to the AlphaFold3 input JSON, you can use the
To run the script with templates, use the following command:

```bash
-python add_mmseqs_msa.py --input_json <input_json> --output_json <output_json> --templates --num_templates <num_templates>
+python add_mmseqs_msa.py --input_json <input_json> ... --output_json <output_json> ... --templates --num_templates <num_templates>
```

-- `<input_json>`: Path to the input AlphaFold3 JSON file.
-- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json).
+- `<input_json>`: Path to the input AlphaFold3 JSON file. Can be a single file or multiple files separated by spaces.
+- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json). Can be a single file or multiple files separated by spaces.
- `<num_templates>`: [optional] The number of templates to use (default: 20)


@@ -61,11 +61,11 @@ python add_mmseqs_msa.py --input_json <input_json> --output_json <output_json> -
To run the script without templates, use the following command:

```bash
-python add_mmseqs_msa.py --input_json <input_json> --output_json <output_json>
+python add_mmseqs_msa.py --input_json <input_json> ... --output_json <output_json> ...
```

-- `<input_json>`: Path to the input AlphaFold3 JSON file.
-- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json).
+- `<input_json>`: Path to the input AlphaFold3 JSON file. Can be a single file or multiple files separated by spaces.
+- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json). Can be a single file or multiple files separated by spaces.


### Adding custom templates
@@ -89,19 +89,21 @@ python add_custom_template.py --input_json <input_json> --output_json <output_js

#### add_mmseqs_msa.py

-If you wish to add a custom template and generate an MMseqs2 MSA/templates, you can use `add_mmseqs_msa.py`:
+If you wish to add a custom template and generate an MMseqs2 MSA/templates, you can use `add_mmseqs_msa.py`:

```bash
python add_mmseqs_msa.py --input_json <input_json> --output_json <output_json> --templates --num_templates <num_templates> --custom_template <custom_template> --custom_template_chain <custom_template_chain> --target_id <target_id>
```

-- `<input_json>`: Path to the input AlphaFold3 JSON file.
+- `<input_json>`: Path to the input AlphaFold3 JSON file.
- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json).
- `<num_templates>`: [optional] The number of templates to use (default: 20)
- `<custom_template>` : Path to the custom template file in mmCIF format.
- `<custom_template_chain>` : [conditionally required] The chain ID of the chain to use in your custom template, only required if using a multi-chain template.
- `<target_id>` : [conditionally required] The ID of the sequence the custom template relates to, only required if modelling a complex.

+Note: You cannot use `--custom_template` with multiple input JSON files.
+
### Running AlphaFold3

#### alphafold3.py
@@ -110,16 +112,16 @@ This file has the functionality of the above two scripts above and runs alphafol
you have the AlphaFold3 on your system (Instructions [here](https://github.com/google-deepmind/alphafold3/blob/main/docs/installation.md)) and have procured the model parameters and the databases.

```bash
-python alphafold3.py <input_json> <output_dir> --model_params <model_params> --database <database>
---mmseqs2 --num_templates <num_templates> --custom_template <custom_template> --custom_template_chain <custom_template_chain> --target_id <target_id>
+python alphafold3.py <input_json> ... <output_dir> --output_json <output_json> ... --model_params <model_params> --database <database> --mmseqs2 --num_templates <num_templates> --custom_template <custom_template> --custom_template_chain <custom_template_chain> --target_id <target_id>
```

-- `<input_json>`: Path to the input AlphaFold3 JSON file.
+- `<input_json>`: Path to the input AlphaFold3 JSON file. Can be a single file or multiple files separated by spaces.
- `<output_dir>`: Path to the output directory.
+- `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json). Can be a single file or multiple files separated by spaces.
- `<model_params>`: Path to the directory containing the AlphaFold3 model parameters.
- `<database>`: Path to the directory containing the databases #Note: This is not used if using the `--mmseqs2` flag but I think it is required by the alphafold3.py script.
- `<num_templates>`: [optional] The number of templates to use (default: 20)
-- `<custom_template>` :[optional] Path to the custom template file in mmCIF format.
+- `<custom_template>` :[optional] Path to the custom template file in mmCIF format. Note: You cannot use `--custom_template` with multiple input JSON files.
- `<custom_template_chain>` : [conditionally required] The chain ID of the chain to use in your custom template, only required if using a multi-chain template.
- `<target_id>` : [conditionally required] The ID of the sequence the custom template relates to, only required if modelling a complex.
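
The multi-file invocation documented above can also be assembled from Python. The following is a minimal sketch and not part of this PR: `build_mmseqs_command` is a hypothetical helper, and it assumes only the flags shown in the README (`--input_json`, `--output_json`, `--templates`, `--num_templates`).

```python
import sys


def build_mmseqs_command(input_jsons, output_jsons=None, templates=False, num_templates=20):
    """Assemble an argv list for add_mmseqs_msa.py (hypothetical helper, not in the repo)."""
    # The README states that, when given, outputs pair one-to-one with inputs.
    if output_jsons is not None and len(output_jsons) != len(input_jsons):
        raise ValueError("need exactly one output path per input path")

    cmd = [sys.executable, "add_mmseqs_msa.py", "--input_json", *input_jsons]
    if output_jsons:
        cmd += ["--output_json", *output_jsons]
    if templates:
        cmd += ["--templates", "--num_templates", str(num_templates)]
    return cmd


cmd = build_mmseqs_command(
    ["a.json", "b.json"],
    ["a_out.json", "b_out.json"],
    templates=True,
)
print(" ".join(cmd[1:]))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems when file names contain spaces.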

3 changes: 1 addition & 2 deletions add_custom_template.py
@@ -2,7 +2,6 @@

import os
import json
-from io import StringIO

from af3_script_utils import (
custom_template_argpase_util,
@@ -24,7 +23,7 @@ def run_custom_template(
raise FileNotFoundError(msg)

 for sequence in af3_json["sequences"]:
-    if not "protein" in sequence:
+    if "protein" not in sequence:
         continue

sequence = get_custom_template(
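
The membership test changed in this hunk skips every entry in `sequences` that is not a protein. As a minimal sketch under stated assumptions (the `af3_json` dictionary below is made-up example data, and `protein_ids` is a hypothetical helper, not a function from this repository), the loop pattern behaves like this:

```python
# Made-up AlphaFold3-style input: a list of single-key entries under "sequences".
af3_json = {
    "name": "example",
    "sequences": [
        {"protein": {"id": "A", "sequence": "MKT"}},
        {"ligand": {"id": "L", "ccdCodes": ["ATP"]}},
        {"protein": {"id": "B", "sequence": "GSH"}},
    ],
}


def protein_ids(af3_json):
    """Collect the IDs of protein entries, skipping everything else (hypothetical helper)."""
    ids = []
    for sequence in af3_json["sequences"]:
        # `x not in y` is the idiomatic spelling of `not x in y`; both evaluate identically.
        if "protein" not in sequence:
            continue
        ids.append(sequence["protein"]["id"])
    return ids


print(protein_ids(af3_json))
```

The rewrite from `not "protein" in sequence` to `"protein" not in sequence` is purely stylistic; the two expressions are equivalent.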
39 changes: 26 additions & 13 deletions add_mmseqs_msa.py
@@ -331,21 +331,34 @@ def fetch_mmcif(
parser = argparse.ArgumentParser(
description="Add MMseqs2 unpaired MSA to AlphaFold3 json"
)
-parser.add_argument("--input_json", help="Input alphafold3 json file")
-parser.add_argument("--output_json", help="Output alphafold3 json file")
+parser.add_argument("--input_json", help="Input alphafold3 json file", nargs="+")
+parser.add_argument("--output_json", help="Output alphafold3 json file", nargs="+")

parser = mmseqs2_argparse_util(parser)
parser = custom_template_argpase_util(parser)

args = parser.parse_args()

-add_msa_to_json(
-    args.input_json,
-    args.templates,
-    args.num_templates,
-    args.custom_template,
-    args.custom_template_chain,
-    args.target_id,
-    output_json=args.output_json,
-    to_file=True,
-)
+if args.output_json and len(args.input_json) != len(args.output_json):
+    msg = "If output_json is specified, the number of output json files must \
+match the number of input json files"
+    raise ValueError(msg)
+
+if len(args.input_json) > 1 and args.custom_template:
+    msg = "Multiple input json files found. This is not supported with custom \
+template. Please run custom template separately for each input json file"
+    raise ValueError(msg)
+
+if not args.output_json:
+    args.output_json = [None] * len(args.input_json)
+
+for i, json_file in enumerate(args.input_json):
+    add_msa_to_json(
+        json_file,
+        args.templates,
+        args.num_templates,
+        args.custom_template,
+        args.custom_template_chain,
+        args.target_id,
+        output_json=args.output_json[i],
+        to_file=True,
+    )
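
The validation added in this hunk can be exercised in isolation. The sketch below is an illustration rather than the repository's code: `parse_args` is a hypothetical wrapper, `--custom_template` is declared locally as a stand-in for the option that `custom_template_argpase_util` registers, and `required=True` on `--input_json` is an assumption added so the example fails cleanly when no input is given.

```python
import argparse


def parse_args(argv):
    """Parse and validate multi-file arguments (hypothetical wrapper mirroring the hunk above)."""
    parser = argparse.ArgumentParser(description="nargs='+' validation sketch")
    # nargs="+" collects one or more values into a list.
    parser.add_argument("--input_json", nargs="+", required=True)  # required=True is an assumption
    parser.add_argument("--output_json", nargs="+")
    # Stand-in for the option added by custom_template_argpase_util in the real script.
    parser.add_argument("--custom_template")
    args = parser.parse_args(argv)

    # Outputs, when given, must pair one-to-one with inputs.
    if args.output_json and len(args.input_json) != len(args.output_json):
        raise ValueError("number of output json files must match number of input json files")

    # A custom template applies to a single input only.
    if len(args.input_json) > 1 and args.custom_template:
        raise ValueError("custom template is not supported with multiple input json files")

    # No outputs given: use a None placeholder per input so indexing by position still works.
    if not args.output_json:
        args.output_json = [None] * len(args.input_json)
    return args


args = parse_args(["--input_json", "a.json", "b.json"])
print(args.input_json, args.output_json)
```

Padding `output_json` with `None` keeps the later `args.output_json[i]` lookup valid without a separate branch for the "no outputs given" case.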
1 change: 0 additions & 1 deletion af3_script_utils.py
@@ -4,7 +4,6 @@

import os
import time
-import json


def check_chains(mmcif_file):
82 changes: 46 additions & 36 deletions alphafold3.py
@@ -1,12 +1,8 @@
# start by finding the directory where the alphafold3.py script is located

-from add_custom_template import get_custom_template, custom_template_argpase_util
+from add_custom_template import custom_template_argpase_util
from add_mmseqs_msa import mmseqs2_argparse_util, add_msa_to_json
import json
import os
from pathlib import Path
import subprocess
import re


def run_alphafold3(
@@ -15,7 +11,6 @@ def run_alphafold3(
model_params: str | Path,
database_dir: str | Path,
) -> None:
-print(input_json)
input_json = Path(input_json)
output_dir = Path(output_dir)
cmd = rf"""
@@ -49,10 +44,10 @@ def run_alphafold3(


def af3_argparse_main(parser):
-parser.add_argument("input_json", help="Input sequence file")
+parser.add_argument("input_json", help="Input sequence file", nargs="+")

parser.add_argument("output_dir", help="Output directory")
-parser.add_argument("--output_json", help="Output json file")
+parser.add_argument("--output_json", help="Output json file", nargs="+")
# make the variable saved as database_dir
parser.add_argument(
"--database",
@@ -87,32 +82,47 @@ def af3_argparse_main(parser):

args = parser.parse_args()

-with open(args.input_json, "r") as f:
-    af3_json = json.load(f)
-
-if args.mmseqs2:
-    af3_json = add_msa_to_json(
-        input_json=args.input_json,
-        templates=args.templates,
-        num_templates=args.num_templates,
-        custom_template=args.custom_template,
-        custom_template_chain=args.custom_template_chain,
-        target_id=args.target_id,
-        af3_json=af3_json,
-        output_json=args.output_json,
-        to_file=True,
-    )
-
-    output_json = (
-        args.input_json.replace(".json", "_mmseqs.json")
-        if args.output_json is None
-        else args.output_json
+if args.output_json and len(args.input_json) != len(args.output_json):
+    msg = "If output_json is specified, the number of output json files must \
+match the number of input json files"
+    raise ValueError(msg)
+
+if len(args.input_json) > 1 and args.custom_template:
+    msg = "Multiple input json files found. This is not supported with custom \
+template. Please run custom template separately for each input json file"
+    raise ValueError(msg)
+
+if not args.output_json:
+    args.output_json = [None] * len(args.input_json)
+
+for i, json_file in enumerate(args.input_json):
+    with open(json_file, "r") as f:
+        af3_json = json.load(f)
+
+    if args.mmseqs2:
+        af3_json = add_msa_to_json(
+            input_json=json_file,
+            templates=args.templates,
+            num_templates=args.num_templates,
+            custom_template=args.custom_template,
+            custom_template_chain=args.custom_template_chain,
+            target_id=args.target_id,
+            af3_json=af3_json,
+            output_json=args.output_json[i],
+            to_file=True,
+        )
+
+        run_json = (
+            json_file.replace(".json", "_mmseqs.json")
+            if args.output_json[i] is None
+            else args.output_json[i]
+        )
+    else:
+        run_json = json_file
+
+    run_alphafold3(
+        input_json=run_json,
+        output_dir=args.output_dir,
+        model_params=args.model_params,
+        database_dir=args.database_dir,
+    )
-else:
-    output_json = args.input_json
-run_alphafold3(
-    input_json=output_json,
-    output_dir=args.output_dir,
-    model_params=args.model_params,
-    database_dir=args.database_dir,
-)
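
The per-file choice of which JSON is handed to `run_alphafold3` can be isolated as a small function. This is a sketch that mirrors the branch logic in the hunk above; `select_run_json` is a hypothetical name and the file names are placeholders.

```python
def select_run_json(json_file, output_json, use_mmseqs2):
    """Return the path AlphaFold3 should be run on (hypothetical helper mirroring the loop above)."""
    if not use_mmseqs2:
        # No MSA step: run on the input file unchanged.
        return json_file
    if output_json is not None:
        # The user named an explicit output for this input.
        return output_json
    # Default naming used in the diff: swap ".json" for "_mmseqs.json".
    return json_file.replace(".json", "_mmseqs.json")


inputs = ["a.json", "b.json"]
outputs = [None, "custom_b.json"]  # None means "use the default name"
run_jsons = [select_run_json(i, o, True) for i, o in zip(inputs, outputs)]
print(run_jsons)
```

Note that `str.replace` substitutes every occurrence of `.json`, so a path such as `runs.json/a.json` would be altered in both places; this is an observation about the sketch rather than a reported bug.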