The ModFOLD Server Help Page
Input Sequence
Please carefully paste in the full amino acid sequence of your target protein in single-letter format, e.g.
MSFIEKMIGSLNDKREWKAMEARAKALPKEYHHAYKAIQKYMWTSGGPTDWQDTKRIFGG ILDLFEEGAAEGKKVTDLTGEDVAAFCDELMKDTKTWMDKYRTKLNDSIGRD
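Before pasting, it can help to check the sequence for stray whitespace or unexpected characters. The following is a minimal sketch and not part of the server: the function names are hypothetical, and it assumes that only the 20 standard single-letter codes are wanted (the server may treat letters such as X, B, Z or U differently).

```python
import re

# The 20 standard amino acids in single-letter code.
# Assumption: anything else is worth flagging before submission.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Remove all whitespace (spaces, line breaks) and upper-case the sequence."""
    return re.sub(r"\s+", "", raw).upper()


def unexpected_letters(sequence):
    """Return (position, letter) pairs, counted from 1, for non-standard characters."""
    return [
        (position, letter)
        for position, letter in enumerate(sequence, start=1)
        if letter not in STANDARD_RESIDUES
    ]


raw = """MSFIEKMIGSLNDKREWKAMEARAKALPKEYHHAYKAIQKYMWTSGGPTDWQDTKRIFGG
ILDLFEEGAAEGKKVTDLTGEDVAAFCDELMKDTKTWMDKYRTKLNDSIGRD"""
sequence = clean_sequence(raw)
print(len(sequence), unexpected_letters(sequence))
```

An empty list from unexpected_letters means every character is one of the 20 standard codes.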
Select program
So, which method should you use? If in doubt, use ModFOLD v 2.0. This method attempts to combine the results from ModFOLD and ModFOLDclust, using a simplified interface for job submission.
You can only use ModFOLDclust if you have multiple models for your target sequence. ModFOLDclust works best if you have several models, if possible built from alternative target-template alignments, using several different methods.
Clustering methods such as ModFOLDclust are currently the most accurate methods, but the ModFOLDclust results will not be as reliable if you only have a few models built using the same target-template alignment.
ModFOLD v1.1 will be the most useful if you want a consistent score and a p-value for a single model, but in general it is preferable to compare multiple models.
Select a program depending on your requirements:
Program | Speed* | Multiple Models | Single Model | Consistent score and p-value | Accuracy | Local Quality Scoring (Residue error) |
ModFOLD v 2.0 | Slow | Yes | Yes | No | Comparable to ModFOLDclust, but also works on single models | Yes |
ModFOLD v 1.1 | Fast | Yes | Yes | Yes | More accurate than ModFOLDclust if very few models are available | No |
ModFOLDclust v 1.1 | Slow | Yes | No | No | Most accurate, but only if you have many models (20+) from different sources | Yes |
* The CPU time for the ModFOLD v 1.1 method increases linearly with the number of models. However, the CPU time for the ModFOLDclust v 1.1 and ModFOLD v 2.0 methods increases quadratically with the number of models. Depending on the load on the server, the ModFOLD method can process about 1000 models in about 10-15 min for an average sized protein (~200 residues), whereas the ModFOLDclust method could only process about 100 in the same time.
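The scaling described in the footnote can be turned into a rough estimate of run time. This is only a back-of-the-envelope sketch: it assumes a reference time of 12.5 min (the midpoint of "about 10-15 min"), no fixed overhead, an average sized protein and an unloaded server. The function names are hypothetical, and real run times will differ.

```python
# Assumption: midpoint of the "about 10-15 min" quoted in the footnote.
REFERENCE_MINUTES = 12.5


def modfold_minutes(n_models):
    """Linear scaling, calibrated on about 1000 models in the reference time."""
    return REFERENCE_MINUTES * n_models / 1000


def modfoldclust_minutes(n_models):
    """Quadratic scaling, calibrated on about 100 models in the reference time."""
    return REFERENCE_MINUTES * (n_models / 100) ** 2


for n in (50, 100, 500):
    print(n, round(modfold_minutes(n), 2), round(modfoldclust_minutes(n), 2))
```

Doubling the number of models doubles the estimate for the linear method but quadruples it for the quadratic methods.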
N.B. DISOclust - If you are using the ModFOLDclust v 1.1 or ModFOLD v 2.0 methods, then you also have the option of receiving disorder prediction results. Click here for further explanation.
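The guidance above can be summarised as a small decision function. This is one possible reading of the table rather than an official rule: the function names and parameters are hypothetical, and the threshold of 20 models is taken from the "many models (20+)" entry in the Accuracy column.

```python
def suggest_program(n_models, need_p_value=False, diverse_sources=False):
    """Suggest a program name from the table, given what the user has and needs."""
    if n_models < 1:
        raise ValueError("at least one model is required")
    if need_p_value:
        # Only ModFOLD v 1.1 provides a consistent score and p-value.
        return "ModFOLD v 1.1"
    if n_models >= 20 and diverse_sources:
        # Clustering is most accurate with many models from different sources.
        return "ModFOLDclust v 1.1"
    # The "if in doubt" default; it also works on single models.
    return "ModFOLD v 2.0"


def offers_disorder_prediction(program):
    """DISOclust results are optional with the two clustering-based methods."""
    return program in ("ModFOLDclust v 1.1", "ModFOLD v 2.0")


choice = suggest_program(n_models=1)
print(choice, offers_disorder_prediction(choice))
```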
Upload models
You may either upload a single PDB file (for quality assessment of a single model) or multiple models in the form of a gzipped tar file. The gzipped tar file should contain a directory of models for your target sequence as separate PDB files. This file should be similar in format to the tarballs of 3D coordinates on the CASP website, which can be found here.
An example file containing models for the above sequence can be downloaded here.
Steps to produce the appropriate file:
Linux/MacOS/Irix/Solaris/other Unix users
1. Tar up the directory containing your PDB files e.g. type the following at the command line:
tar cvf my_models.tar my_models/
2. Gzip the tar file e.g.
gzip my_models.tar
3. Upload the gzipped tar file (e.g. my_models.tar.gz) to the ModFOLD server
Windows users
In Windows you can use a free application such as 7-zip to tar and gzip your models.
1. Download, install and run 7-zip
2. Select the directory (folder) of model files to add to the .tar file, click "Add", select the "tar" option as the "Archive format:" and save the file as something memorable e.g. my_models.tar
3. Select the tar file, click "Add" and then select the "GZip" option as the "Archive format:" - the file should then be saved as my_models.tar.gz
4. Upload the gzipped tar file (e.g. my_models.tar.gz) to the ModFOLD server
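As a platform-independent alternative to the steps above, the same gzipped tar file can be produced with the Python standard library. This is a minimal sketch: the function name make_model_tarball is hypothetical, and it assumes the directory contains only the PDB files you want to submit.

```python
import tarfile
from pathlib import Path


def make_model_tarball(model_dir, output=""):
    """Tar and gzip a directory of models in one step; return the archive path."""
    directory = Path(model_dir).resolve()
    if not directory.is_dir():
        raise NotADirectoryError(str(model_dir))
    # Default output: my_models/ becomes my_models.tar.gz next to the directory.
    out_path = Path(output) if output else directory.with_name(directory.name + ".tar.gz")
    with tarfile.open(out_path, "w:gz") as archive:
        # arcname keeps only the directory name inside the archive,
        # not the full path on your machine.
        archive.add(directory, arcname=directory.name)
    return out_path
```

For example, make_model_tarball("my_models") writes my_models.tar.gz alongside the my_models directory, ready for upload.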
Formatting for model files
Please ensure that each PDB file contains the coordinates for one model only. Please do not upload a single PDB file containing the coordinates for multiple alternative NMR models. The coordinates for multiple models should be uploaded as a tarred and gzipped directory of separate files.
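If you do have a single PDB file holding several alternative models, it can be split into separate files before tarring. The sketch below assumes that the models are delimited by standard MODEL and ENDMDL records; the function name split_models is hypothetical.

```python
def split_models(pdb_text):
    """Return one PDB-format string per MODEL ... ENDMDL block.

    An empty list means no MODEL records were found, i.e. the file
    already appears to hold a single model.
    """
    models = []
    current = []
    inside = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "MODEL":
            inside = True
            current = []
        elif record == "ENDMDL":
            if inside:
                models.append("\n".join(current + ["END"]) + "\n")
            inside = False
        elif inside and record in ("ATOM", "HETATM", "TER"):
            current.append(line)
    return models
```

Each returned string can then be written to its own file in the directory of models.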
The server will attempt to automatically renumber the ATOM records in each model in order to match the residue positions in the sequence i.e. the coordinates for the first residue in the sequence will be renumbered "1" in each model file (if they aren't already), the coordinates for the second residue in the sequence will be numbered "2", and so on.
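To illustrate what renumbering means, the sketch below renumbers the residues in ATOM records consecutively from 1, using the standard fixed columns of the PDB format. It is a simplification and not the server's own procedure: it assumes that the model starts at the first residue of the sequence and has no missing residues, whereas the server matches residues to positions in the sequence. The function name is hypothetical.

```python
def renumber_residues(pdb_lines, start=1):
    """Renumber residues in ATOM records consecutively, in order of appearance."""
    renumbered = []
    previous_key = None
    number = start - 1
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Columns 22-27: chain ID, residue number, insertion code.
            key = line[21:27]
            if key != previous_key:
                number += 1
                previous_key = key
            # Write the new number into columns 23-26 and blank the
            # insertion code in column 27.
            line = line[:22] + f"{number:>4}" + " " + line[27:]
        renumbered.append(line)
    return renumbered
```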
File limit
The ModFOLDclust method is CPU intensive, so it is currently limited to 1000 models per gzipped tar file. There is currently no limit on the number of files you can process using the ModFOLD method; however, you may be limited by your upload bandwidth when submitting files or by your download bandwidth when viewing HTML results.
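You can count the models in a gzipped tar file before uploading it. This minimal sketch assumes that every regular file in the archive is a model; the function names are hypothetical.

```python
import tarfile

# Limit stated above for the ModFOLDclust method.
MAX_CLUST_MODELS = 1000


def count_model_files(tarball_path):
    """Count the regular files (not directories) in a gzipped tar file."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        return sum(1 for member in archive.getmembers() if member.isfile())


def within_modfoldclust_limit(tarball_path):
    """True if the archive holds no more models than ModFOLDclust accepts."""
    return count_model_files(tarball_path) <= MAX_CLUST_MODELS
```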
E-mail Address
Enter your e-mail address here. Results will be returned as soon as they are available.
Short name for sequence
Use this field to assign a short memorable name to your prediction job, so that you can identify particular jobs in your mailbox. This is particularly important because ModFOLD will not necessarily return your results in the order you submitted them. The set of characters you can use for the filename is restricted to letters A-Z (either case), the numbers 0-9 and the following other characters: .~_-
The name you specify will be included in the subject line of the e-mail messages sent to you from the server.
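The character restriction can be checked with a regular expression before submitting. This is a minimal sketch with a hypothetical function name; it checks only the permitted characters, as any length limit the server applies is not stated here.

```python
import re

# Letters A-Z (either case), digits 0-9 and the characters . ~ _ -
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9.~_-]+")


def is_valid_job_name(name):
    """True if the name is non-empty and uses only the permitted characters."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_job_name("my_target-v1.2"), is_valid_job_name("my target"))
```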
Results
The results will be returned via e-mail. The e-mail contains a URL, which links to HTML formatted graphical results, and an attached file, which provides results in a machine readable format.
Machine readable attached files
There are two formats of machine readable output depending on the method used. The ModFOLD v 1.1 results are in CASP QA (QMODE1) format and the ModFOLDclust v 1.1 and ModFOLD v 2.0 results are in CASP QA (QMODE2) format.
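The attached file can be read with a few lines of code. The sketch below rests on assumptions about the CASP QA layout that are not spelled out on this page: that header lines begin with keywords such as PFRMAT, TARGET, MODEL, QMODE and END; that each record begins with a model name followed by a global score; and that in QMODE2 the score is followed by per-residue errors, with "X" marking residues without a prediction and long records possibly wrapping onto continuation lines. Check these against the CASP format description before relying on it. The function names are hypothetical.

```python
HEADER_KEYWORDS = {"PFRMAT", "TARGET", "AUTHOR", "REMARK", "METHOD", "MODEL", "QMODE", "END"}


def _is_value(token):
    """True for a number or the 'X' placeholder."""
    if token == "X":
        return True
    try:
        float(token)
    except ValueError:
        return False
    return True


def _to_values(tokens):
    """Convert tokens to floats, using None for 'X'."""
    return [None if token == "X" else float(token) for token in tokens]


def parse_casp_qa(text):
    """Return {model_name: {"score": float, "residue_errors": [...]}}."""
    results = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] in HEADER_KEYWORDS:
            continue
        if _is_value(tokens[0]):
            # Assumed continuation line: more per-residue values.
            if current is not None:
                results[current]["residue_errors"].extend(_to_values(tokens))
            continue
        if len(tokens) < 2:
            continue
        current = tokens[0]
        results[current] = {
            "score": float(tokens[1]),
            "residue_errors": _to_values(tokens[2:]),
        }
    return results
```

For QMODE1 files the residue_errors list is simply empty for every model.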
ModFOLD v 1.1 web pages and graphical results
The HTML results pages are also different for each method. The ModFOLD v 1.1 results page consists of a table sorted by decreasing predicted model quality score (click here for an example). The consistency of the ModFOLD score allows us to calculate a p-value which represents the probability that the model is incorrect. That is, for a given predicted model quality score, the p-value is the proportion of models with that score that do not share any similarity with the native structure (TM-score < 0.2). Each model is also assigned a colour coded confidence level depending on the p-value:
Cut-off | Confidence | Description |
p < 0.01 | HIGH | Less than a 1/100 chance that the model is incorrect. |
p < 0.05 | MEDIUM | Less than a 1/20 chance that the model is incorrect. |
p < 0.1 | LOW | Less than a 1/10 chance that the model is incorrect. |
p > 0.1 | RANDOM | Likely to be a random model. |
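The cut-offs in the table translate directly into a lookup, which may be useful when processing the machine readable results. The function name is hypothetical. The table does not say how a p-value of exactly 0.1 is classed; this sketch assumes RANDOM.

```python
def confidence_level(p_value):
    """Map a ModFOLD p-value to the confidence level in the table above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "HIGH"
    if p_value < 0.05:
        return "MEDIUM"
    if p_value < 0.1:
        return "LOW"
    # Assumption: p = 0.1 exactly is grouped with p > 0.1.
    return "RANDOM"


for p in (0.005, 0.03, 0.08, 0.5):
    print(p, confidence_level(p))
```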
ModFOLDclust v 1.1 web pages and graphical results
The ModFOLDclust v 1.1 results table is ranked according to decreasing clustering score (click here for an example). Each row includes a small JPEG image of a plot depicting the residue error versus the residue number. Each small image links to a page that displays a larger view of the plot and contains a further link to download a PostScript version.
Each row in the results table also displays a small 3D cartoon view of the model which is colour coded with the residue error according to the RasMol temperature colouring scheme. Each small image also links to a page that shows a larger image of the 3D view and contains a link to download a PDB file of the model with residue accuracy predictions (Angstroms) in the B-factor column.
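The per-residue predictions can be read back from the downloaded PDB file. This minimal sketch assumes standard fixed-column PDB format (B-factor in columns 61-66) and that every atom in a residue carries the same value, so that reading the CA atom is sufficient. The function name is hypothetical.

```python
def residue_errors_from_pdb(pdb_text):
    """Return {residue number: predicted error in Angstroms} from the B-factor column."""
    errors = {}
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16; use one CA atom per residue.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])  # columns 23-26
            errors[residue_number] = float(line[60:66])  # columns 61-66
    return errors
```

Residues with large values are those predicted to deviate most from the native structure.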
ModFOLD v 2.0 web pages and graphical results
These are similar in format to the ModFOLDclust v 1.1 results pages.