The IntFOLD server help page

This page contains simple guidelines for using the new version of the IntFOLD server, sample input data that may be downloaded and submitted, and examples of output from the server.

Guidelines for using the server

The only required input for using the IntFOLD server is the amino acid sequence for your target protein.

If you wish, you may also provide your email address, which will only be used to provide you with a link to the results when your predictions are completed. However, if you do not provide your email address then you should bookmark the results page so you can view the results when they are available.

Required - Sequence Data


In the text box labelled "Input sequence of protein target" please carefully paste in the full amino acid sequence for your target protein in single letter format. An example sequence (CASP15 target T1106s2) is shown below:

Sample sequence:
MNITLTKRQQEFLLLNGWLQLQCGHAERACILLDALLTLNPEHLAGRRCRLVALLNNNQG
ERAEKEAQWLISHDPLQAGNWLCLSRAQQLNGDLDKARHAYQHYLELKDHNESP
	
It is important that you provide the full sequence that corresponds to the sequence of residue coordinates in any model files that you might optionally provide. If your model does not contain numbering that corresponds directly to the order of residues in the sequence file then the server will attempt to renumber the residues in the model files accordingly. However, if there are residues in any model file that are not contained in the provided sequence then the prediction for that model will not complete.
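Before pasting a sequence, it can help to check it locally. The following is a minimal sketch, not part of the server: the helper names are hypothetical, and it assumes that only the 20 standard single-letter amino acid codes are expected (the server's own validation rules are not described on this page).

```python
# Assumption: only the 20 standard one-letter amino acid codes are valid.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(pasted):
    """Join a pasted, possibly multi-line sequence and upper-case it."""
    return "".join(pasted.split()).upper()


def invalid_residues(sequence):
    """Return the sorted list of characters that are not standard codes."""
    return sorted(set(sequence) - STANDARD_AMINO_ACIDS)


sample = """
MNITLTKRQQEFLLLNGWLQLQCGHAERACILLDALLTLNPEHLAGRRCRLVALLNNNQG
ERAEKEAQWLISHDPLQAGNWLCLSRAQQLNGDLDKARHAYQHYLELKDHNESP
"""
sequence = clean_sequence(sample)
problems = invalid_residues(sequence)
if problems:
    print("Unexpected characters:", "".join(problems))
else:
    print("Sequence looks like plain single-letter amino acid codes.")
```

An empty list from `invalid_residues` only means the characters look plausible; it does not check that the sequence matches the residues in any model file you provide.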

Optional - Short name for sequence


If you wish, you may assign a short, memorable name to your prediction job. This is useful for identifying particular jobs in your inbox, particularly because the IntFOLD server will not necessarily return your results in the order you submitted them. The name is restricted to the letters A-Z (either case), the digits 0-9 and the following other characters: .~_-

The name you specify will be included in the subject line of the e-mail messages sent to you from the server.
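The character restriction on job names can be checked locally with a regular expression. This is a minimal sketch with a hypothetical function name; it assumes an empty name is not allowed and does not enforce any length limit, since none is stated here.

```python
import re

# Letters A-Z (either case), digits 0-9, and the characters . ~ _ -
# Assumption: the name must contain at least one character.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9.~_-]+")


def is_valid_job_name(name):
    """Return True if every character in the name is permitted."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_job_name("T1106s2_run-1"))  # allowed characters only
print(is_valid_job_name("my job #2"))      # contains a space and '#'
```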

Optional - E-mail address


If you wish, you may provide your e-mail address. You will be sent a link to the graphical results and machine readable results when your predictions are completed.

Privacy Notice: Processing of personal data will be in accordance with the GDPR and University of Reading (UoR) Data Protection Policy. Users' IP addresses will be temporarily stored in the queuing system and then used to generate anonymous usage statistics. Optionally, users may provide an email address, so that they can be notified when their job completes; this will be deleted when no longer required. Personal data will be accessible only by UoR staff managing the server, and will not be stored for longer than is necessary for the provision of the service. Your results will be available via a unique URL, which will not be posted publicly. Your sequences and structures will not be used for any other purposes and will be deleted after the expiry date (21-28 days).

Output from the server


The IntFOLD server produces a results table containing numerical and graphical prediction results. The raw machine-readable prediction data is also provided in CASP format.

Examples of output:
  1. IntFOLD results for CASP15 target T1106s2
  2. IntFOLD results for CASP15 target T1114s2
  3. IntFOLD results for CASP15 target T1124
  4. IntFOLD results for CASP15 target H1135 chain A

Description of output:
  1. Top 5 3D models - The results table is ranked according to decreasing global model quality score. The global model quality scores range between 0 and 1. In general scores less than 0.2 indicate there may be incorrectly modelled domains and scores greater than 0.4 generally indicate more complete and confident models, which are highly similar to the native structure. If the global model quality scores are low, then the per-residue scores can give you an idea of specific domains or regions in your protein that might be correctly modelled (in this case you may wish to divide up your sequence into sub domains and resubmit).

    The consistency of the global scores allows us to calculate a p-value, which represents the probability that each model is incorrect. That is, for a given predicted model quality score, the p-value represents the proportion of incorrect models (TM-score < 0.2) with that score. Each model is also assigned a colour-coded confidence level depending on the p-value:

    P-value cut-off | Confidence | Description
    p < 0.001       | CERT       | Less than 1/1000 incorrect models will have higher scores.
    p < 0.01        | HIGH       | Less than 1/100 incorrect models will have higher scores.
    p < 0.05        | MEDIUM     | Less than 1/20 incorrect models will have higher scores.
    p < 0.1         | LOW        | Less than 1/10 incorrect models will have higher scores.
    p > 0.1         | UNCERT     | More than 1/10 incorrect models will have higher scores.

    The confidence scores should be considered in conjunction with the local model quality (per-residue scores) and the coverage of the target protein by the template(s). The per-residue scores indicate the predicted distance (in Angstroms) between the CA atom of the residue in the model and the CA atom of the equivalent residue in the native structure.

    Thumbnail images of plots depicting the per-residue error versus residue number are included in each row of the results table. Each thumbnail links to a page that displays a larger view of the plot and contains a further link to download a PostScript version. Each row in the table also displays a thumbnail of the 3D cartoon view of the model, which is colour coded by residue error according to the RasMol temperature colouring scheme.

    Under each small image are two buttons. The button named "View model in 3D and download" links to a page that shows a larger image of the 3D view and contains a link to download a PDB file of the model with residue accuracy predictions (Angstroms) in the B-factor column. The model is also loaded into JSmol viewers for convenient interactive visualisation of per-residue errors and coverage of the target protein by the template(s). The button "Refine model using ReFOLD" submits this model to the ReFOLD service for refinement.

  2. Disorder prediction - The image shows a plot of the probability of disorder (on the y axis) for each numbered amino acid in the sequence (on the x axis). The disorder/order probability threshold is shown as a dashed line on the plot. Residues above the threshold may be considered mostly disordered and those below mostly ordered; however, this threshold serves only as a guide. A PostScript version of the plot may be downloaded by clicking on the image.

  3. Domain boundary prediction - The image shows the top predicted 3D model coloured to indicate predicted domains; a change in colour indicates a likely domain boundary. Clicking on the image will take you to a page where you can download a PDB file of the top model with the domain number for each residue provided in the B-factor column. The model is also loaded into a JSmol viewer, providing a convenient interactive view of the predicted domains.

  4. Binding site prediction - The image shows the top predicted 3D model annotated to indicate putative binding site residues. The cartoon view of the model is shown in green and the binding site residues are shown as blue sticks with labelled residues. The view is zoomed and centred on the first (N-term) binding residue. Clicking on the image will take you to a page where you can download a PDB file of the top model with all identified ligands superposed in their likely positions relative to the model.
    Below the download link, a list of the binding residues is provided along with the most likely (most numerous) ligand, the ligand identified nearest to the centre of the predicted binding pocket, and a list of the likely interacting ligands with the number of each that were identified in related template structures. The JSmol view provides numerous options for viewing the ligand binding site prediction. In the default view, for clarity, the binding site residues are shown as sticks and the labels and ligands are switched off.

  5. Full model quality assessment results - A full summary of the quality assessment results is shown in this table for all generated models and any additional models submitted by the user. The table is formatted similarly to the table showing the top 5 models.
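
The p-value cut-offs and the B-factor convention described above can be illustrated with a short script for working with downloaded results. This is a minimal sketch under stated assumptions, not the server's own code: both function names are hypothetical, a p-value of exactly 0.1 is treated as UNCERT (the table does not say which level applies), and the model is assumed to be a single-chain file in standard fixed-width PDB format.

```python
def confidence_level(p_value):
    """Map a p-value to the confidence label from the cut-off table."""
    if p_value < 0.001:
        return "CERT"
    if p_value < 0.01:
        return "HIGH"
    if p_value < 0.05:
        return "MEDIUM"
    if p_value < 0.1:
        return "LOW"
    # The table lists p > 0.1; treating p == 0.1 as UNCERT is an assumption.
    return "UNCERT"


def per_residue_errors(pdb_text):
    """Return {residue number: B-factor value} for CA atoms in PDB text.

    In the downloaded models the B-factor column holds the predicted
    per-residue error in Angstroms. Assumes a single chain, so residue
    numbers are unique.
    """
    errors = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-based): atom name 13-16,
        # residue number 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            errors[int(line[22:26])] = float(line[60:66])
    return errors


print(confidence_level(0.03))
```

To use `per_residue_errors`, read a downloaded model file into a string and pass it in; the returned dictionary can then be filtered, for example to list residues whose predicted error exceeds a threshold of your choosing.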

Fair usage policy


You are only permitted to have 1 job running at a time for each IP address, so please wait until your previous job completes before submitting further data. If you already have a job running then you will be notified and your uploaded files will be deleted. Once your job has completed your IP address will be unlocked and you will be able to submit new data.

If you have any questions or if you wish to submit numerous sequences or batch jobs then please contact l.j.mcguffin@reading.ac.uk, with a short description of your project.

Note: template based models from our older legacy methods use Modeller. If you intend to use any of the lower quality template-based models for commercial reasons, then you should obtain a Modeller access key and submit here.
