SRSVCYSPTASE Small round structured virus (C37) cysteine protease family signature
Type of fingerprint: COMPOUND with 9 elements
Links:
PRINTS; PR00705 PAPAIN; PR00704 CALPAIN; PR00966 NIAPOTYPTASE
PRINTS; PR00703 ADVENDOPTASE; PR00797 STREPTOPAIN; PR00707 UBCTHYDRLASE
PRINTS; PR00776 HEMOGLOBNASE; PR00706 PYROGLUPTASE; PR00864 PREPILPTASE
PRINTS; PR00916 2CENDOPTASE
INTERPRO; IPR001665
Creation date 29-JUN-1998; UPDATE 07-JUN-1999
1. RAWLINGS, N.D. AND BARRETT, A.J.
Families of cysteine peptidases.
METHODS ENZYMOL. 244 461-486 (1994).
2. BARRETT, A.J. AND RAWLINGS, N.D.
Families and clans of cysteine peptidases
PERSPECTIVES DRUG DISCOVERY DESIGN 6 1-11 (1996).
3. RAWLINGS, N.D. AND BARRETT, A.J.
Family C37 - Clan PA - Processing peptidase
http://www.bi.bbsrc.ac.uk/merops/famcards/c37.htm
4. FEDERHEN, S., HOTTON, C., LEIPE, D. AND SOUSSOV, V.
Calicivirus - NCBI Taxonomy Browser
http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=11975&lvl=3
5. LIU, B., CLARKE, I.N. AND LAMBDEN, P.R.
Polyprotein processing in Southampton Virus: identification of 3C-like
protease cleavage sites by in vitro mutagenesis.
J.VIROL. 70(4) 2605-2610 (1996).
6. KOONIN, E.V. AND GORBALENYA, A.E.
An insect picornavirus may have genome organization similar to that of
caliciviruses.
FEBS LETT. 297 81-86 (1992).
Cysteine protease activity depends on a catalytic dyad of cysteine and
histidine, the order and spacing of these residues varying between the known
families. Nearly half of all cysteine proteases are found exclusively
in viruses [1]. Cysteine protease families have been grouped into five
clans (designated CA, CB, CC, CD and CE) on the basis of structural and
functional similarity. Families C1, C2 and C10, which belong to the CA clan,
have a Cys/His catalytic dyad, and are loosely termed papain-like. Families
in the CB clan have a His/Cys dyad, and contain RNA-virus enzymes
distantly related to chymotrypsin. Enzymes in clan CC are also from RNA
viruses, but have a papain-like Cys/His active site. The remaining two
clans, CD and CE, contain only one family each [2]. Some families have not
yet been assigned to a clan.
Two additional clans (PA and PB) have been identified; these contain a
mixture of serine, cysteine and threonine proteases. Clan PA proteases have
a serine or cysteine nucleophile as part of the ordered catalytic triad
His, Asp, Ser (or Cys). Clan PB proteases have a serine, cysteine or
threonine active residue at the N-terminus of the mature protease [3].
Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis
[4]. The calicivirus genome contains two open reading frames, ORF1 and ORF2.
ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine
protease and RNA polymerase activity [5]. The regions of the polyprotein in
which these activities lie are similar to proteins produced by the
picornaviruses [6]. ORF2 encodes a structural capsid protein. Two different
families of caliciviruses can be distinguished on the basis of sequence
similarity, namely the Norwalk-like viruses or small round structured
viruses (SRSVs), and those classed as non-SRSVs.
Calicivirus proteases from the SRSV group, which are members of the PA
protease clan, constitute family C37 of the cysteine proteases (proteases
from non-SRSVs belong to the C24 family). As mentioned above, the protease
activity resides within a polyprotein. The enzyme cleaves the polyprotein
at sites N-terminal to itself, liberating the polyprotein helicase.
SRSVCYSPTASE is a 9-element fingerprint that provides a signature for the
cysteine protease (C37) of small round structured caliciviruses. The
fingerprint was derived from an initial alignment of 2 sequences. The motifs
were drawn from conserved regions spanning the full length of the
polyprotein protease, focusing on those regions that characterise members of
the C37 family but distinguish them from the C24 proteases. Motif 3 contains
the active site His residue and motif 4 includes the catalytic Asp (the Cys
residue is located between motifs 7 and 8). Two iterations on OWL30.1 were
required to reach convergence, at which point a true set comprising 4
sequences was identified.
An update on SPTR37_9f identified a true set of 3 sequences.
SUMMARY INFORMATION
3 codes involving 9 elements
0 codes involving 8 elements
0 codes involving 7 elements
0 codes involving 6 elements
0 codes involving 5 elements
0 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
COMPOSITE FINGERPRINT INDEX
9| 3 3 3 3 3 3 3 3 3
8| 0 0 0 0 0 0 0 0 0
7| 0 0 0 0 0 0 0 0 0
6| 0 0 0 0 0 0 0 0 0
5| 0 0 0 0 0 0 0 0 0
4| 0 0 0 0 0 0 0 0 0
3| 0 0 0 0 0 0 0 0 0
2| 0 0 0 0 0 0 0 0 0
--+----------------------------------------------
| 1 2 3 4 5 6 7 8 9
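The summary counts and the composite index above can be reproduced from a simple table of which motifs each sequence matches. The following is a minimal sketch, assuming each row k of the index counts, per motif, the sequences that match exactly k elements and include that motif. The `matches` dictionary and the function names are hypothetical and are not part of the PRINTS software.

```python
from collections import Counter

N_MOTIFS = 9

# Motifs (1-based) matched by each true positive. Per this entry, all three
# sequences match all nine elements. The dictionary layout is an assumption.
matches = {
    "POLN_SOUV3": set(range(1, N_MOTIFS + 1)),
    "Q83883": set(range(1, N_MOTIFS + 1)),
    "POLN_LORDV": set(range(1, N_MOTIFS + 1)),
}


def summary(matches, n_motifs):
    """Number of sequences matching exactly k elements, for k = n..2."""
    counts = Counter(len(motifs) for motifs in matches.values())
    return {k: counts.get(k, 0) for k in range(n_motifs, 1, -1)}


def composite_index(matches, n_motifs):
    """Row k, column m: sequences matching k elements that include motif m."""
    index = {k: [0] * n_motifs for k in range(n_motifs, 1, -1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in index:
            for m in motifs:
                index[k][m - 1] += 1
    return index


index = composite_index(matches, N_MOTIFS)
for k, row in index.items():
    print(f"{k}| " + " ".join(str(c) for c in row))
```

With the three true positives listed in this entry, the row for 9 elements holds 3 in every column and all other rows are zero.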
True positives..
POLN_SOUV3 Q83883 POLN_LORDV
PROTEIN TITLES
POLN_SOUV3 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
Q83883 NONSTRUCTURAL POLYPROTEIN - NORWALK VIRUS.
POLN_LORDV NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
SCAN HISTORY
OWL30_2 1 20 NSINGLE
SPTR37_9f 2 4 NSINGLE
INITIAL MOTIF SETS
SRSVCYSPTASE1 Length of motif = 20 Motif number = 1
Small round structured virus protease motif I - 1
PCODE ST INT
WADDDRSVDYNEKLDFEAPP POLN_LORDV 992 992
WADDEREVDYNEKISFEAPP POLN_SOUV3 1083 1083
SRSVCYSPTASE2 Length of motif = 20 Motif number = 2
Small round structured virus protease motif II - 1
PCODE ST INT
WSRIVNFGSGWGFWVSPSLF POLN_LORDV 1014 2
WSRVTKFGSGWGFWVSPTVF POLN_SOUV3 1105 2
SRSVCYSPTASE3 Length of motif = 22 Motif number = 3
Small round structured virus protease motif III - 1
PCODE ST INT
ITSTHVIPQGAQEFFGVPVKQI POLN_LORDV 1034 0
ITTTHVIPTSAKEFFGEPLTSI POLN_SOUV3 1125 0
SRSVCYSPTASE4 Length of motif = 23 Motif number = 4
Small round structured virus protease motif IV - 1
PCODE ST INT
IHKSGEFCRLRFPKPIRTDVTGM POLN_LORDV 1057 1
IHRAGEFTLFRFSKKIRPDLTGM POLN_SOUV3 1148 1
SRSVCYSPTASE5 Length of motif = 22 Motif number = 5
Small round structured virus protease motif V - 1
PCODE ST INT
LEEGAPEGTVVTLLIKRSTGEL POLN_LORDV 1081 1
LEEGCPEGTVCSVLIKRDSGEL POLN_SOUV3 1172 1
SRSVCYSPTASE6 Length of motif = 19 Motif number = 6
Small round structured virus protease motif VI - 1
PCODE ST INT
PLAARMGTHATMKIQGRTV POLN_LORDV 1104 1
PLAVRMGAIASMRIQGRLV POLN_SOUV3 1195 1
SRSVCYSPTASE7 Length of motif = 18 Motif number = 7
Small round structured virus protease motif VII - 1
PCODE ST INT
GQMGMLLTGSNAKSMDLG POLN_LORDV 1124 1
GQSGMLLTGANAKGMDLG POLN_SOUV3 1215 1
SRSVCYSPTASE8 Length of motif = 20 Motif number = 8
Small round structured virus protease motif VIII - 1
PCODE ST INT
YKRENDYVVIGVHTAAARGG POLN_LORDV 1153 11
YKRANDWVVCGVHAAATKSG POLN_SOUV3 1244 11
SRSVCYSPTASE9 Length of motif = 20 Motif number = 9
Small round structured virus protease motif IX - 1
PCODE ST INT
NTVICATQGSEGEATLEGGD POLN_LORDV 1173 0
NTVVCAVQASEGETTLEGGD POLN_SOUV3 1264 0
FINAL MOTIF SETS
SRSVCYSPTASE1 Length of motif = 20 Motif number = 1
Small round structured virus protease motif I - 2
PCODE ST INT
WADDEREVDYNEKISFEAPP POLN_SOUV3 1083 1083
WADDDREVDYNEKINFEAPP Q83883 1084 1084
WADDDRSVDYNEKLDFEAPP POLN_LORDV 992 992
SRSVCYSPTASE2 Length of motif = 20 Motif number = 2
Small round structured virus protease motif II - 2
PCODE ST INT
WSRVTKFGSGWGFWVSPTVF POLN_SOUV3 1105 2
WSRVTKFGSGWGFWVSPTVF Q83883 1106 2
WSRIVNFGSGWGFWVSPSLF POLN_LORDV 1014 2
SRSVCYSPTASE3 Length of motif = 22 Motif number = 3
Small round structured virus protease motif III - 2
PCODE ST INT
ITTTHVIPTSAKEFFGEPLTSI POLN_SOUV3 1125 0
ITTTHVVPTGVKEFFGEPLSSI Q83883 1126 0
ITSTHVIPQGAQEFFGVPVKQI POLN_LORDV 1034 0
SRSVCYSPTASE4 Length of motif = 23 Motif number = 4
Small round structured virus protease motif IV - 2
PCODE ST INT
IHRAGEFTLFRFSKKIRPDLTGM POLN_SOUV3 1148 1
IHQAGEFTQFRFSKKMRPDLTGM Q83883 1149 1
IHKSGEFCRLRFPKPIRTDVTGM POLN_LORDV 1057 1
SRSVCYSPTASE5 Length of motif = 22 Motif number = 5
Small round structured virus protease motif V - 2
PCODE ST INT
LEEGCPEGTVCSVLIKRDSGEL POLN_SOUV3 1172 1
LEEGCPEGTVCSVLIKRDSGEL Q83883 1173 1
LEEGAPEGTVVTLLIKRSTGEL POLN_LORDV 1081 1
SRSVCYSPTASE6 Length of motif = 19 Motif number = 6
Small round structured virus protease motif VI - 2
PCODE ST INT
PLAVRMGAIASMRIQGRLV POLN_SOUV3 1195 1
PLAVRMGAIASMRIQGRLV Q83883 1196 1
PLAARMGTHATMKIQGRTV POLN_LORDV 1104 1
SRSVCYSPTASE7 Length of motif = 18 Motif number = 7
Small round structured virus protease motif VII - 2
PCODE ST INT
GQSGMLLTGANAKGMDLG POLN_SOUV3 1215 1
GQSGMLLTGANAKGMDLG Q83883 1216 1
GQMGMLLTGSNAKSMDLG POLN_LORDV 1124 1
SRSVCYSPTASE8 Length of motif = 20 Motif number = 8
Small round structured virus protease motif VIII - 2
PCODE ST INT
YKRANDWVVCGVHAAATKSG POLN_SOUV3 1244 11
HKRGNDWVVCGVHAAATKSG Q83883 1245 11
YKRENDYVVIGVHTAAARGG POLN_LORDV 1153 11
SRSVCYSPTASE9 Length of motif = 20 Motif number = 9
Small round structured virus protease motif IX - 2
PCODE ST INT
NTVVCAVQASEGETTLEGGD POLN_SOUV3 1264 0
NTVVCAVQAGEGETALEGGD Q83883 1265 0
NTVICATQGSEGEATLEGGD POLN_LORDV 1173 0
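The entry states that motif 3 contains the active site His and motif 4 the catalytic Asp. As a rough illustration, the sketch below lists the alignment columns of the final motifs 3 and 4 that are identical in all three sequences and reports where a conserved His or Asp occurs. The function name `conserved_columns` is hypothetical. Conservation alone does not show which residue is catalytic; that assignment comes from the entry text, not from this code.

```python
# Final motif sets 3 and 4, in the order POLN_SOUV3, Q83883, POLN_LORDV.
MOTIF3 = [
    "ITTTHVIPTSAKEFFGEPLTSI",
    "ITTTHVVPTGVKEFFGEPLSSI",
    "ITSTHVIPQGAQEFFGVPVKQI",
]
MOTIF4 = [
    "IHRAGEFTLFRFSKKIRPDLTGM",
    "IHQAGEFTQFRFSKKMRPDLTGM",
    "IHKSGEFCRLRFPKPIRTDVTGM",
]


def conserved_columns(seqs):
    """Return (0-based column, residue) for columns identical in all seqs."""
    assert len({len(s) for s in seqs}) == 1, "motif rows must be equal length"
    return [
        (i, column[0])
        for i, column in enumerate(zip(*seqs))
        if len(set(column)) == 1
    ]


his_in_motif3 = [i for i, res in conserved_columns(MOTIF3) if res == "H"]
asp_in_motif4 = [i for i, res in conserved_columns(MOTIF4) if res == "D"]

print("Conserved His columns in motif 3:", his_in_motif3)
print("Conserved Asp columns in motif 4:", asp_in_motif4)
```

Adding a column index to the ST value of a sequence gives the residue position in that polyprotein, for example 1125 + 4 = 1129 for the conserved His of motif 3 in POLN_SOUV3.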