Journal Club (March 24, 2008)

Pei J, Kim BH, Grishin NV. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 2008 Apr;36(7):2295-300.
Summary:
PROMALS3D generates accurate multiple sequence alignments using both sequence and structure information. PROMALS3D stands for "PROfile Multiple Alignment with predicted Local Structures and 3D constraints" and is available via web server: http://prodata.swmed.edu/promals3d
Background:
The earlier program PROMALS incorporates:
PROMALS3D is PROMALS enhanced to use 3D structures. Other multiple sequence alignment programs have been enhanced with the ability to use structures, although the details differ:
There are other web servers for these programs (at EBI, for example), but usually with fewer options exposed.
Method:


Most of the steps were already in PROMALS. Shaded boxes indicate the 3D part.
Evaluation:
Table 1. Tests on SABmark database

                            SABmark-twi (209/10667)    SABmark-sup (425/19092)
                            (# mult alignments/# pairwise ref alignments)
Method                      Q-score        GDT-TS      Q-score        GDT-TS
                            (max ≈ 0.71)               (max ≈ 0.87)

PROMALS3D (D + S)           0.602          0.264       0.805          0.417
PROMALS3D (F + S)           0.555          0.220       0.779          0.390
PROMALS3D (T + S)           0.540          0.249       0.766          0.412
PROMALS3D (D + F + S)       0.611          0.256       0.812          0.414
PROMALS3D (D + T + S)       0.603          0.264       0.805          0.421
PROMALS3D (F + T + S)       0.595          0.251       0.800          0.413
PROMALS3D (D + F + T + S)   0.616          0.260       0.812          0.420
3DCoffee (D + S)            0.574          0.252       0.802          0.421
3DCoffee (SAP + S)          0.553          0.222       0.786          0.390
Expresso webserver          0.508          0.206
PROMALS3D (D/2 + S)         0.475          0.198       0.716          0.364
3DCoffee (D/2 + S)          0.261          0.100       0.573          0.294
3DCoffee (D/2 + SAP)        0.255          0.095       0.572          0.289
--------- sequence-only methods ---------
PROMALS                     0.393          0.154       0.665          0.336
SPEM                        0.326          0.124       0.628          0.318
MUMMALS                     0.196          0.081       0.522          0.278
ProbCons                    0.166          0.058       0.485          0.246
MAFFT-linsi                 0.184          0.070       0.510          0.264
MUSCLE                      0.136          0.056       0.433          0.233
T-Coffee                    0.134          0.048       0.429          0.223
ClustalW                    0.127          0.057       0.390          0.221
--------- structure-only methods ---------
MUSTANG                     0.550          0.230       0.779          0.404
PROMALS3D (D)               0.594          0.252       0.802          0.415

PROMALS and SPEM also use predicted secondary structure
The best scores in a column are bold
D: using DaliLite structural constraints
F: using FAST structural constraints
T: using TM-align structural constraints
S: using sequence information
SAP: using SAP structural alignments
D/2: using DaliLite alignments for half of the sequences
SAP/2: using SAP alignments for half of the sequences
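
For readers unfamiliar with the Q-score reported in the tables: it is commonly defined as the fraction of residue pairs in the reference alignment that are also aligned in the test alignment. The sketch below illustrates that definition for a single pair of sequences. The function names and the toy sequences are made up for illustration, and this is not the evaluation code used in the paper.

```python
def aligned_pairs(row_a, row_b):
    """Return the set of (i, j) residue-index pairs aligned in two gapped rows."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices (gaps not counted)
    for a, b in zip(row_a, row_b):
        a_is_res = a != "-"
        b_is_res = b != "-"
        if a_is_res and b_is_res:
            pairs.add((i, j))
        i += a_is_res
        j += b_is_res
    return pairs


def q_score(test_a, test_b, ref_a, ref_b):
    """Fraction of reference-aligned residue pairs reproduced by the test alignment."""
    ref = aligned_pairs(ref_a, ref_b)
    if not ref:
        raise ValueError("reference alignment has no aligned residue pairs")
    test = aligned_pairs(test_a, test_b)
    return len(ref & test) / len(ref)


# Toy example (hypothetical sequences): the test alignment misplaces one gap.
ref_a, ref_b = "ACDEFG", "ACD-FG"
test_a, test_b = "ACDEFG", "AC-DFG"
print(q_score(test_a, test_b, ref_a, ref_b))  # 4 of 5 reference pairs recovered -> 0.8
```

For a multiple alignment, the same calculation would be applied to each pair of rows that has a reference alignment and the results averaged.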

Do distantly related structures help? (are their alignments to the representatives' profiles sufficiently correct?)

Table 2. Tests on PREFAB database (Q-score results)

Method                      Set 1        Set 2        Set 3        Set 4        All
                            (0.121/420)  (0.185/421)  (0.248/420)  (0.527/421)  (0.270/1682)

PROMALS3D (D + S)           0.817        0.879        0.921        0.954        0.893
PROMALS3D (F + S)           0.745        0.850        0.896        0.947        0.859
PROMALS3D (T + S)           0.766        0.856        0.902        0.950        0.869
PROMALS3D (D + F + S)       0.818        0.886        0.919        0.952        0.894
PROMALS3D (D + T + S)       0.834        0.884        0.922        0.953        0.898
PROMALS3D (F + T + S)       0.794        0.875        0.909        0.952        0.883
PROMALS3D (D + F + T + S)   0.836        0.894        0.917        0.956        0.900
--------- sequence-only methods ---------
PROMALS                     0.570        0.771        0.875        0.946        0.790
SPEM                        0.536        0.756        0.865        0.940        0.774
MUMMALS                     0.457        0.693        0.834        0.939        0.731
ProbCons                    0.428        0.672        0.826        0.936        0.716
MAFFT-linsi                 0.443        0.681        0.826        0.938        0.722
MUSCLE                      0.372        0.631        0.787        0.930        0.680
ClustalW                    0.299        0.536        0.726        0.906        0.617

The 1682 PREFAB alignments are divided into four roughly equal-sized sets according to sequence identity in the reference multiple alignment. The average sequence identity and number of alignments are given in parentheses with the set names.
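
As a quick sanity check on Table 2: if the "All" column is the mean over all 1682 alignments (an assumption, since the notes do not say how it was computed), it should equal the size-weighted mean of the per-set values, up to rounding of the three-decimal inputs. The variable names below are only for illustration.

```python
def weighted_mean(values, weights):
    """Mean of values, each weighted by the number of alignments in its set."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


set_sizes = [420, 421, 420, 421]              # alignments per set (Table 2 header)
set_identity = [0.121, 0.185, 0.248, 0.527]   # average sequence identity per set
promals3d_ds = [0.817, 0.879, 0.921, 0.954]   # PROMALS3D (D + S) Q-scores per set

print(sum(set_sizes))                                   # 1682
print(f"{weighted_mean(set_identity, set_sizes):.3f}")  # 0.270, as in the "All" header
print(f"{weighted_mean(promals3d_ds, set_sizes):.3f}")  # 0.893, as in the "All" column
```

Other rows can differ from the tabulated "All" value in the last digit, because the per-set inputs are themselves rounded.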

More about the server:

03/11/08: entered PDB IDs 2mnr, 4enl, 1nu5. The run finished about 10 minutes later, and I downloaded an aligned FASTA file; the full results will be kept at the PROMALS3D site for a month. This was not really a test of functionality or speed, but of usability and of whether the server actually works (you might be surprised how many don't...). Pairwise sequence identities in the resulting alignment are 13.6%, 14.3%, and 19.6%, and the alignment is correct based on the resulting superposition in Chimera:
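
As an aside, pairwise identities like those quoted above can be recomputed from an aligned FASTA file with a few lines of Python. This is a minimal sketch: the sequences are a made-up toy alignment rather than the server output, and the denominator (columns where both sequences have a residue) is an assumption. Other conventions, such as dividing by the shorter sequence length, give different percentages.

```python
from itertools import combinations

GAP_CHARS = "-."


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def percent_identity(row_a, row_b):
    """Percent identical residues over columns where neither row has a gap."""
    compared = identical = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        identical += a == b
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment standing in for a downloaded aligned FASTA file.
toy_fasta = """\
>seqA
ACDE-FG
>seqB
ACNEQFG
>seqC
-CDEQF-
"""

records = read_fasta(toy_fasta)
for (name_a, row_a), (name_b, row_b) in combinations(records.items(), 2):
    print(f"{name_a} vs {name_b}: {percent_identity(row_a, row_b):.1f}%")
```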