missing SPRINGDB/70negpos.db_cs219.ffdata file #3

@frihaka

Hi,

I am trying to run PEPPI on a local Linux machine, using 3 protein sequences as a test run.

I have installed/compiled from source PSIPRED (psipred.4.02.tar.gz) and BLAST (blast-2.2.9-amd64-linux.tar.gz), as recommended.
I have also installed/compiled from source the latest hh-suite (https://github.com/soedinglab/hh-suite) and its dependencies, along with its latest pdb70 database (https://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/).

I downloaded PEPPI and compiled it with install.sh, using the following configuration:

Are you on a slurm HPC system? (WARNING: PEPPI will run slowly without HPC parallelization) [y/n] n

Full path of where you wish to install PEPPI: /home/hae/anaconda3/envs/peppi

Full path to your HHsuite installation: /home/hae/bin/HHSuite3/hh-suite/build

Full path to the database used for hhblits: /home/hae/bin/HHSuite3/pdbDB/pdb70

Full path of your python interpreter: /home/hae/anaconda3/envs/peppi/bin/python

What is your C++ compiler? g++

What is your fortran compiler? gfortran

The working directory - where the main pipeline script is launched - has the following tree:

├── A.fasta
├── B.fasta
├── LICENSE
├── PEPPI1.pl
├── PEPPIcontainer
│   ├── PEPPIconda.yml
│   └── PEPPIcontainer.def
├── README.md
├── bin
│   ├── CTNN
│   ├── CTmod
│   ├── CTpred.py
│   ├── NWalign
│   ├── PEPPI2temp.pl
│   ├── PEPPI3temp.py
│   ├── PRISMmod
│   ├── SEQmod
│   ├── SPRINGNEGmod
│   ├── SPRINGmod
│   ├── STRINGmod
│   ├── TMalign
│   ├── blastp
│   ├── charge_inp.dat
│   ├── compileRes.sh
│   ├── compiled_source
│   ├── dcomplex
│   ├── dimMap
│   ├── fort.21_alla
│   ├── getHashcode.py
│   ├── install.sh
│   ├── makeHHR.pl
│   ├── model_multiD
│   ├── multiwrapper.pl
│   ├── oldcomplex
│   ├── runSetWrapper.pl
│   ├── seqSearch.pl
│   ├── trainCT.py
│   └── trainDists.py
├── cmd.sh
├── install.sh
├── lib
│   ├── CTtrainvecs.txt
│   ├── SEQ
│   ├── SPRINGDB
│   ├── STRING
│   └── trainNB.txt
└── test
    ├── A.fasta
    ├── B.fasta
    ├── LR.csv
    ├── PEPPI2.pl
    ├── PEPPI3.py
    ├── PPI
    ├── allres.txt
    ├── mono
    ├── protcodeA.csv
    └── protcodeB.csv

The original PEPPI download/install does not include a SPRINGDB/70negpos.db_cs219.ffdata file. This is all there is:

├── 70CDHITstruct.txt
├── 70negpos.db
├── 70negpos.mono
├── 70negs.txt
├── monomers
└── monomers.aliases
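
The listing can be compared against the files an HH-suite 3 database normally consists of. The following is a minimal sketch, not part of PEPPI: it assumes the usual HH-suite 3 layout of six files per database prefix (`_a3m`, `_hhm` and `_cs219`, each as `.ffdata` plus `.ffindex`), and the function name `missing_db_files` is made up for illustration.

```python
from pathlib import Path

# Assumption: the standard HH-suite 3 layout, i.e. three ffindex
# databases (a3m, hhm, cs219) sharing one prefix.
HHSUITE3_SUFFIXES = [
    "_a3m.ffdata", "_a3m.ffindex",
    "_hhm.ffdata", "_hhm.ffindex",
    "_cs219.ffdata", "_cs219.ffindex",
]


def missing_db_files(prefix):
    """Return the HH-suite 3 database files that are absent for `prefix`."""
    prefix = str(prefix)
    return [prefix + suffix
            for suffix in HHSUITE3_SUFFIXES
            if not Path(prefix + suffix).is_file()]


if __name__ == "__main__":
    # Path relative to the PEPPI installation directory.
    for path in missing_db_files("lib/SPRINGDB/70negpos.db"):
        print("missing:", path)
```

Given the directory listing above, this should report all six files for `lib/SPRINGDB/70negpos.db`. The same check can be pointed at the pdb70 prefix passed to hhblits to confirm that database is complete.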

I have checked that the scripts bin/makeHHR.pl and bin/seqSearch.pl have the correct local paths.

When I launch the main script:

PEPPI1.pl

the pipeline seems to run fine at the beginning (the hh-suite tools kick in as expected) and produces the following output directory:

├── PEPPI
│   ├── A.fasta
│   ├── B.fasta
│   ├── PEPPI2.pl
│   ├── mono
│   ├── protcodeA.csv
│   └── protcodeB.csv

But the pipeline then fails to find "SPRINGDB/70negpos.db_cs219.ffdata" and so never writes the final results file:

prot1
HHR
- 12:32:53.429 INFO: Search results will be written to /tmp/hae/makeHHR_prot1_464127/prot1.hhr

- 12:32:53.456 INFO: Searching 92111 column state sequences.

- 12:32:53.501 INFO: /tmp/hae/makeHHR_prot1_464127/prot1.fasta is in A2M, A3M or FASTA format

- 12:32:53.501 INFO: Iteration 1

- 12:32:53.666 INFO: Prefiltering database

- 12:32:54.078 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 4748

- 12:32:54.126 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2755

- 12:32:54.126 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2755

- 12:32:54.126 INFO: Scoring 2755 HMMs using HMM-HMM Viterbi alignment

- 12:32:54.230 INFO: Alternative alignment: 0

- 12:32:55.312 INFO: 2000 alignments done

- 12:32:55.831 INFO: 2755 alignments done

- 12:32:55.833 INFO: Alternative alignment: 1

- 12:32:57.405 INFO: 2650 alignments done

- 12:32:57.410 INFO: Alternative alignment: 2

- 12:32:57.914 INFO: 467 alignments done

- 12:32:57.914 INFO: Alternative alignment: 3

- 12:32:58.132 INFO: 76 alignments done

- 12:32:58.924 INFO: Premerge done

- 12:32:58.924 INFO: Realigning 500 HMM-HMM alignments using Maximum Accuracy algorithm

- 12:34:56.226 INFO: 1284 sequences belonging to 1284 database HMMs found with an E-value < 0.001

- 12:34:56.226 INFO: Number of effective sequences of resulting query HMM: Neff = 11.2888

- 12:34:56.239 INFO: Iteration 2

- 12:34:56.239 INFO: Set premerge to 0! (premerge: 3 iteration: 2 hits.Size: 1281)

- 12:34:56.407 INFO: Prefiltering database

- 12:34:56.820 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 4871

- 12:34:56.863 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2556

- 12:34:56.863 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 1389

- 12:34:56.863 INFO: Scoring 1389 HMMs using HMM-HMM Viterbi alignment

- 12:34:56.950 INFO: Alternative alignment: 0

- 12:34:57.865 INFO: 1389 alignments done

- 12:34:57.868 INFO: Alternative alignment: 1

- 12:34:58.706 INFO: 1228 alignments done

- 12:34:58.708 INFO: Alternative alignment: 2

- 12:34:58.929 INFO: 79 alignments done

- 12:34:58.930 INFO: Alternative alignment: 3

- 12:34:59.108 INFO: 31 alignments done

- 12:34:59.156 INFO: Rescoring previously found HMMs with Viterbi algorithm

- 12:34:59.233 INFO: Alternative alignment: 0

- 12:34:59.744 INFO: 1167 alignments done

- 12:34:59.747 INFO: Alternative alignment: 1

- 12:35:00.282 INFO: 1167 alignments done

- 12:35:00.285 INFO: Alternative alignment: 2

- 12:35:00.456 INFO: 196 alignments done

- 12:35:00.456 INFO: Alternative alignment: 3

- 12:35:00.500 INFO: 32 alignments done

- 12:35:00.571 INFO: Realigning 500 HMM-HMM alignments using Maximum Accuracy algorithm

- 12:35:02.405 INFO: 1284 sequences belonging to 1284 database HMMs found with an E-value < 0.001

- 12:35:02.405 INFO: Number of effective sequences of resulting query HMM: Neff = 11.2888

$ cp /tmp/hae/makeHHR_prot1_464127/prot1.a3m /tmp/2YVboks0ez/9UFyIA7YHI.1.in.a3m
Filtering alignment to diversity 7 ...
$ hhfilter -v 1 -neff 7 -i /tmp/2YVboks0ez/9UFyIA7YHI.in.a3m -o /tmp/2YVboks0ez/9UFyIA7YHI.in.a3m
$ /home/hae/bin/HHSuite3/hh-suite/build/scripts/reformat.pl -v 1 -r -noss a3m psi /tmp/2YVboks0ez/9UFyIA7YHI.in.a3m /tmp/2YVboks0ez/9UFyIA7YHI.in.psi
Predicting secondary structure with PSIPRED ... $ /home/hae/bin/HHSuite3/BLAST//blastpgp -b 1 -j 1 -h 0.001 -d /home/hae/bin/HHSuite3/hh-suite/build/data/do_not_delete -i /tmp/2YVboks0ez/9UFyIA7YHI.sq -B /tmp/2YVboks0ez/9UFyIA7YHI.in.psi -C /tmp/2YVboks0ez/9UFyIA7YHI.chk 1> /tmp/2YVboks0ez/9UFyIA7YHI.blalog 2> /tmp/2YVboks0ez/9UFyIA7YHI.blalog
$ echo 9UFyIA7YHI.chk > /tmp/2YVboks0ez/9UFyIA7YHI.pn

$ echo 9UFyIA7YHI.sq  > /tmp/2YVboks0ez/9UFyIA7YHI.sn

$ /home/hae/bin/HHSuite3/BLAST//makemat -P /tmp/2YVboks0ez/9UFyIA7YHI
$ /home/hae/bin/HHSuite3/PSIPRED/psipred/bin/psipred /tmp/2YVboks0ez/9UFyIA7YHI.mtx /home/hae/bin/HHSuite3/PSIPRED/psipred/data/weights.dat /home/hae/bin/HHSuite3/PSIPRED/psipred/data/weights.dat2 /home/hae/bin/HHSuite3/PSIPRED/psipred/data/weights.dat3 > /tmp/2YVboks0ez/9UFyIA7YHI.ss
$ /home/hae/bin/HHSuite3/PSIPRED/psipred/bin/psipass2 /home/hae/bin/HHSuite3/PSIPRED/psipred/data/weights_p2.dat 1 0.98 1.09 /tmp/2YVboks0ez/9UFyIA7YHI.ss2 /tmp/2YVboks0ez/9UFyIA7YHI.ss > /tmp/2YVboks0ez/9UFyIA7YHI.horiz
done 
- 12:35:03.826 INFO: /tmp/hae/makeHHR_prot1_464127/prot1.a3m is in A2M, A3M or FASTA format

- 12:35:03.847 WARNING: MSA prot1 looks too diverse (Neff=12.227>11). Better check it with an alignment viewer for non-homologous segments. Also consider building the MSA with hhblits using the - option to limit MSA diversity.

- 12:35:03.853 INFO: Search results will be written to /tmp/hae/makeHHR_prot1_464127/prot1.hhr

- 12:35:03.853 ERROR: In /home/hae/bin/HHSuite3/hh-suite/src/ffindexdatabase.cpp:11: FFindexDatabase:

- 12:35:03.853 ERROR: 	could not open file '/home/hae/anaconda3/envs/peppi/PEPPI/lib/SPRINGDB/70negpos.db_cs219.ffdata'

benchmark: 0
Target: prot1
Query         prot1
Match_columns 234
No_of_seqs    552 out of 4235
Neff          11.2888
Searched_HMMs 2908
Date          Thu Jun 29 12:35:02 2023
Command       /home/hae/bin/HHSuite3/hh-suite/build/bin/hhblits -i /tmp/hae/makeHHR_prot1_464127/prot1.fasta -oa3m /tmp/hae/makeHHR_prot1_464127/prot1.a3m -d /home/hae/bin/HHSuite3/pdbDB/pdb70 -n 2 -e 0.001
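
With output this long, the failing step is easy to miss. As a rough sketch (the helper name `unopened_files` is hypothetical, and the message format is assumed from the two ERROR lines in the log above), the missing paths can be pulled out of a saved log like this:

```python
import re

# Captures the quoted path in a "could not open file '...'" message.
COULD_NOT_OPEN = re.compile(r"could not open file '([^']+)'")


def unopened_files(log_text):
    """Return distinct paths from ERROR 'could not open file' lines, in order."""
    paths = []
    for line in log_text.splitlines():
        if "ERROR" not in line:
            continue  # skip INFO/WARNING lines and command echoes
        match = COULD_NOT_OPEN.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = (
        "- 12:35:03.853 INFO: Search results will be written to "
        "/tmp/hae/makeHHR_prot1_464127/prot1.hhr\n"
        "- 12:35:03.853 ERROR: \tcould not open file "
        "'/home/hae/anaconda3/envs/peppi/PEPPI/lib/SPRINGDB/"
        "70negpos.db_cs219.ffdata'\n"
    )
    for path in unopened_files(sample):
        print(path)
```

On the two sample lines this prints only the `70negpos.db_cs219.ffdata` path, which is the one file the run could not open.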

How do I obtain the SPRINGDB/70negpos.db_cs219.ffdata file?

Thanks for your help in advance!
