The 3 Cheshire Steps
We assume that you have downloaded and installed the Cheshire package. If not, follow this link.
Data files and scripts for this example can be downloaded here.
To start working, Cheshire needs two files:
- the sequence file (1UZC.fasta), in FASTA format, containing the sequence of the protein;
- the chemical shift file (1UZC.cs), containing the chemical shift assignment of the backbone.
Optionally, a PDB file (1UZC.pdb) may be provided. This file is not used in the generation of the structures, but only to compute statistics.
Fasta file 1UZC.fasta
Chemical shift file 1UZC.cs
1 MET 0 0 0 0 0 0
2 LYS 0 0 0 0 0 0
3 MET 4.47 8.51 122.3 55.2 33.4 0
4 LYS 4.33 8.51 123 56 32.8 0
5 THR 4.17 7.98 115.2 62.1 69.3 0
6 LYS 4.38 8.28 124.2 55.2 35.8 0
7 ILE 4.49 8.28 120.1 59.7 38.5 0
8 PHE 5.28 8.94 122.3 55.8 42.6 0
9 ARG 5.08 9.59 122.7 54.9 32.7 0
10 VAL 5.01 9.66 129.1 60 34.1 0
...
Description: The format of the chemical shift file is straightforward. The first two columns are the sequence number and the three-letter amino acid code. The six numerical fields contain the chemical shifts of the HA, H, N, CA, CB and C atoms, in that order. A value of zero means that the chemical shift is missing.
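As a quick check of the format described above, the following sketch lists, for each residue, the atoms whose chemical shift is missing (value zero). It assumes one residue per line with eight whitespace-separated fields; the function name list_missing_shifts is hypothetical and not part of the Cheshire package.

```shell
# Hypothetical helper, not part of Cheshire.
# Assumes one record per line: number, residue name, then the
# HA, H, N, CA, CB and C shifts. A value of zero marks a missing shift.
list_missing_shifts() {
    awk '
        BEGIN { split("HA H N CA CB C", atom, " ") }
        NF == 8 {
            missing = ""
            # Fields 3..8 hold the six shifts; collect the names of the zero ones
            for (i = 3; i <= 8; i++)
                if ($i + 0 == 0) missing = missing " " atom[i - 2]
            if (missing != "") print $1, $2 ":" missing
        }
    ' "$1"
}
```

Calling list_missing_shifts 1UZC.cs prints one line per residue that has at least one missing shift, which makes it easy to see how complete the assignment is before starting a run.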
Note: The chemical shift file must contain the same number of amino acids as the sequence file.
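The length requirement in the note can be checked before running anything. The sketch below is only an illustration: it assumes that FASTA header lines start with ">" and that the chemical shift file holds one eight-field record per line. The function names are hypothetical and not part of Cheshire.

```shell
# Hypothetical helpers, not part of Cheshire.

# Count residues in a FASTA file: drop header lines (starting with '>')
# and count the remaining non-whitespace characters.
count_fasta_residues() {
    grep -v '^>' "$1" | tr -d '[:space:]' | wc -c | tr -d ' '
}

# Count records in a chemical shift file, assuming one residue per line
# with 8 whitespace-separated fields.
count_cs_records() {
    awk 'NF == 8 { n++ } END { print n + 0 }' "$1"
}

# Compare the two counts; return non-zero on a mismatch.
check_lengths() {
    nseq=$(count_fasta_residues "$1")
    ncs=$(count_cs_records "$2")
    if [ "$nseq" -eq "$ncs" ]; then
        echo "OK: $nseq residues in both files"
    else
        echo "MISMATCH: $nseq residues in $1, $ncs records in $2" >&2
        return 1
    fi
}
```

For this example the call would be check_lengths 1UZC.fasta 1UZC.cs.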
To bootstrap Cheshire, create a directory called 1UZC and copy the files 1UZC.pdb, 1UZC.fasta and 1UZC.cs to it.
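The directory setup can be scripted as in the sketch below. It assumes the input files sit in the current directory and are named after the protein; setup_workdir is a hypothetical helper, not a Cheshire command. The PDB file is copied only if present, since it is optional.

```shell
# Hypothetical helper, not part of Cheshire.
# Usage: setup_workdir 1UZC
setup_workdir() {
    pro=$1
    mkdir -p "$pro" || return 1
    # The sequence and chemical shift files are required
    for ext in fasta cs; do
        if [ ! -f "$pro.$ext" ]; then
            echo "missing required file: $pro.$ext" >&2
            return 1
        fi
        cp "$pro.$ext" "$pro/" || return 1
    done
    # The PDB file is optional (used only to compute statistics)
    if [ -f "$pro.pdb" ]; then
        cp "$pro.pdb" "$pro/" || return 1
    fi
}
```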
Cheshire requires several scripts. These can be generated using the template script mkcheshire_tmpl.z:
almost -f mkcheshire_tmpl.z -- -fasta 1UZC.fasta -cs 1UZC.cs -pdb 1UZC.pdb \
    -pro 1UZC -dir 1UZC_OUT -nmod 1000 -seed 3085931
This command will generate the following scripts:
The first script that you have to run is cheshire.z.
almost -f cheshire.z
The script cheshire.z creates the fragment libraries required for the generation of the low-resolution structures. It usually takes between 10 and 15 minutes to complete.
Once the fragment libraries are computed, you can start generating the low-resolution models with:
almost -f generate.z &
Each run of the generate.z script will compute 1000 models. Several instances of the script can run in parallel. You can start 4 of them on a multi-core computer with the following shell command:
for i in 1 2 3 4; do nohup almost -f generate.z > generate$i".log" & done
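The loop above can be generalised into a small function that starts a chosen number of instances, writes one log file per instance, and waits until all of them have finished. This is a sketch under stated assumptions: run_parallel and its arguments are hypothetical, and unlike the loop above it blocks until the jobs are done and also captures standard error in the log.

```shell
# Hypothetical helper, not part of Cheshire.
# Usage: run_parallel N LOG_PREFIX COMMAND [ARGS...]
run_parallel() {
    n=$1
    prefix=$2
    shift 2
    i=1
    while [ "$i" -le "$n" ]; do
        # One background instance per iteration, each with its own log file
        nohup "$@" < /dev/null > "$prefix$i.log" 2>&1 &
        i=$((i + 1))
    done
    # Block until every background instance has exited
    wait
}

# Example: four instances of generate.z, logging to generate1.log ... generate4.log
# run_parallel 4 generate almost -f generate.z
```

Waiting at the end is a design choice that makes the function easy to chain with the next step; drop the wait call to keep the fire-and-forget behaviour of the original loop.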
The models computed by the generate.z script are saved in the file rose.ang in a compressed format. To extract the models as PDB, use the command:
almost -f ang2pdb.z
This extracts the 1000 best-scoring models to a sub-directory called ... ANG
At this point the models have no side chains. To add them and select the most promising models for refinement, run:
almost -f select.z
Now, you are almost done. Run the script refine.z to refine the models. You can run several instances of the script in parallel with:
for i in 1 2 3 4; do nohup almost -f refine.z 1000 HREF 50 > /dev/null & done
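For a single unattended run, the steps of this tutorial can be chained in order, stopping at the first failure. The sketch below runs one instance of each script sequentially and reuses the refine.z arguments shown above. The ALMOST variable and the run_pipeline function are hypothetical conveniences, and the sketch assumes that almost returns a non-zero exit status when a script fails.

```shell
# Hypothetical wrapper, not part of Cheshire.
# ALMOST can be overridden to point at a specific executable.
ALMOST=${ALMOST:-almost}

run_pipeline() {
    # Fragment libraries, model generation, extraction, selection
    for step in cheshire.z generate.z ang2pdb.z select.z; do
        echo "== $step"
        "$ALMOST" -f "$step" || { echo "step $step failed" >&2; return 1; }
    done
    # Refinement, with the same arguments as in the command above
    echo "== refine.z"
    "$ALMOST" -f refine.z 1000 HREF 50 || { echo "step refine.z failed" >&2; return 1; }
}
```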