2CENDOPTASE 2C endopeptidase (C24) cysteine protease family signature
Type of fingerprint: COMPOUND with 4 elements
Links:
PRINTS; PR00705 PAPAIN; PR00704 CALPAIN; PR00966 NIAPOTYPTASE
PRINTS; PR00703 ADVENDOPTASE; PR00797 STREPTOPAIN; PR00707 UBCTHYDRLASE
PRINTS; PR00776 HEMOGLOBNASE; PR00706 PYROGLUPTASE; PR00864 PREPILPTASE
PRINTS; PR00917 SRSVCYSPTASE
INTERPRO; IPR000317
Creation date 29-JUN-1998; UPDATE 06-JUN-1999
1. RAWLINGS, N.D. AND BARRETT, A.J.
Families of cysteine peptidases.
METHODS ENZYMOL. 244 461-486 (1994).
2. BARRETT, A.J. AND RAWLINGS, N.D.
Families and clans of cysteine peptidases
PERSPECTIVES DRUG DISCOVERY DESIGN 6 1-11 (1996).
3. RAWLINGS, N.D. AND BARRETT, A.J.
Family C24 - Clan PA - 3C endopeptidase
http://www.bi.bbsrc.ac.uk/merops/famcards/c24.htm
4. FEDERHEN, S., HOTTON, C., LEIPE, D. AND SOUSSOV, V.
Calicivirus - NCBI Taxonomy Browser
http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=11975&lvl=3
5. WIRBLICH, C., THIEL, H. AND MEYERS, G.
Genetic map of the calicivirus rabbit hemorrhagic disease virus as deduced
from in vitro translation studies.
J.VIROL. 70(11) 7974-7983 (1996).
Cysteine protease activity is dependent on a catalytic dyad of cysteine and
histidine, the order and spacing of these residues varying in the known
families. Nearly half of all cysteine proteases are found exclusively
in viruses [1]. Cysteine protease families have been grouped into five
clans (designated CA, CB, CC, CD and CE) on the basis of structural and
functional similarity. Families C1, C2 and C10, which belong to the CA clan,
have a Cys/His catalytic dyad, and are loosely termed papain-like. Families
in the CB clan have a His/Cys dyad, and contain RNA virus enzymes that are
distantly related to chymotrypsin. Enzymes in clan CC are also from RNA
viruses, but have a papain-like Cys/His active site. The remaining two
clans, CD and CE, contain only one family each [2]. Some families have not
yet been assigned to a clan.
Two additional clans (PA and PB) have been identified; these contain a
mixture of serine, cysteine and threonine proteases. Proteases in clan PA
have a catalytically active serine or cysteine nucleophile as part of the
ordered triad His, Asp, Ser (or Cys). Those in clan PB have a serine,
cysteine or threonine active residue at the N-terminus of the mature
protease [3].
Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis
[4]. The calicivirus genome contains two open reading frames, ORF1 and ORF2.
ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine
protease and RNA polymerase activity. The regions of the polyprotein in
which these activities lie are similar to proteins produced by the
picornaviruses. ORF2 encodes a structural protein [5]. Two different groups of
caliciviruses can be distinguished on the basis of sequence similarity,
namely those classified as small round structured viruses (SRSVs) and those
classed as non-SRSVs.
Calicivirus proteases from the non-SRSV group, which are members of the PA
protease clan, constitute family C24 of the cysteine proteases (proteases
from SRSVs belong to the C37 family). As mentioned above, the protease
activity resides within a polyprotein. The enzyme cleaves the polyprotein
at sites N-terminal to itself, liberating the polyprotein helicase.
2CENDOPTASE is a 4-element fingerprint that provides a signature for the
cysteine protease (C24) of non-SRSV caliciviruses. The fingerprint was
derived from an initial alignment of 4 sequences. The motifs were drawn
from conserved regions spanning the full length of the polyprotein protease,
focusing on those regions that characterise members of the C24 family but
distinguish them from the C37 proteases: motif 1 includes the active site
histidine residue, and motif 3 contains the catalytic cysteine. Two
iterations on OWL30.2 were required to reach convergence, at which point
a true set comprising 14 sequences was identified.
An update on SPTR37_9f identified a true set of 12 sequences.
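The conserved core of a motif can be read directly off the seed alignment.
The following is a minimal Python sketch, not the PRINTS scanning method
itself: it takes the four motif 3 sequences from the initial motif set and
marks every column that is identical in all of them. The function name
conserved_columns is a hypothetical helper introduced for illustration.

```python
# Motif 3 of the initial alignment (sequences copied from INITIAL MOTIF SETS).
MOTIF3 = {
    "POLN_FCVF9": "THPGDCGLPYID",
    "POLN_FCVC6": "THPGDCGLPYID",
    "POLN_RHDV": "TTHGDCGLPLYD",
    "POLN_MANCV": "TKKGDCGLPYFN",
}


def conserved_columns(sequences):
    """Return the residue where all sequences agree, '.' everywhere else.

    Hypothetical helper: a plain column-identity check, not a PRINTS tool.
    """
    seqs = list(sequences)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all motif sequences must have the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else "."
        for column in zip(*seqs)
    )


consensus = conserved_columns(MOTIF3.values())
print(consensus)  # T..GDCGLP...
```

The invariant GDCGLP stretch holds the only cysteine in motif 3, which is
the catalytic cysteine referred to above.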
SUMMARY INFORMATION
12 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
COMPOSITE FINGERPRINT INDEX
 4|  12  12  12  12
 3|   0   0   0   0
 2|   0   0   0   0
--+----------------
  |   1   2   3   4
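One way to read the index: the entry in row k, column j counts the sequences
that match exactly k elements of the fingerprint, one of which is motif j.
The Python sketch below rebuilds the table under that reading. The matches
mapping and the function composite_index are hypothetical names, and the
input simply encodes the summary figure that all 12 true positives match
all 4 elements.

```python
from collections import Counter

N_MOTIFS = 4
TRUE_POSITIVES = [
    "Q89273", "Q86119", "Q86117", "POLN_RHDV",
    "Q86114", "POLN_FCVF9", "Q96725", "Q66913",
    "POLN_FCVC6", "Q66914", "O92368", "POLN_MANCV",
]

# Assumption for illustration: every true positive matches motifs 1-4,
# as reported in SUMMARY INFORMATION (12 codes involving 4 elements).
matches = {code: {1, 2, 3, 4} for code in TRUE_POSITIVES}


def composite_index(matches, n_motifs):
    """Map 'number of elements matched' to per-motif sequence counts."""
    counts = Counter()
    for motifs in matches.values():
        for motif in motifs:
            counts[(len(motifs), motif)] += 1
    # Rows run from n_motifs down to 2, as in the index above.
    return {
        k: [counts[(k, j)] for j in range(1, n_motifs + 1)]
        for k in range(n_motifs, 1, -1)
    }


index = composite_index(matches, N_MOTIFS)
for k, row in index.items():
    print(f"{k}| " + " ".join(f"{n:3d}" for n in row))
```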
True positives..
Q89273 Q86119 Q86117 POLN_RHDV
Q86114 POLN_FCVF9 Q96725 Q66913
POLN_FCVC6 Q66914 O92368 POLN_MANCV
PROTEIN TITLES
Q89273 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q86119 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q86117 (SD) - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
POLN_RHDV NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
Q86114 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
POLN_FCVF9 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
Q96725 RNA - EUROPEAN BROWN HARE SYNDROME VIRUS.
Q66913 NON-STRUCTURAL PROTEINS - FELINE CALICIVIRUS.
POLN_FCVC6 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
Q66914 POLYPROTEIN - FELINE CALICIVIRUS.
O92368 NON-STRUCTURAL POLYPROTEIN - VESV-LIKE CALICIVIRUS.
POLN_MANCV GENOME POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (E
SCAN HISTORY
OWL30_2 2 50 NSINGLE
SPTR37_9f 2 13 NSINGLE
INITIAL MOTIF SETS
2CENDOPTASE1 Length of motif = 18 Motif number = 1
2C endopeptidase calicivirus protease motif I - 1
PCODE ST INT
GYCIHMGHGVYASVAHVV POLN_FCVF9 1095 1095
GYCVHMGHGVYASVAHVV POLN_FCVC6 1097 1097
GWMIHIGNGLYISNTHTA POLN_RHDV 1120 1120
GYGVHIGNGNVITVTHVA POLN_MANCV 997 997
2CENDOPTASE2 Length of motif = 17 Motif number = 2
2C endopeptidase calicivirus protease motif II - 1
PCODE ST INT
APFFSGKPTRDPWGSPV POLN_FCVF9 1145 32
APFFSGRPTRDPWGSPV POLN_FCVC6 1147 32
AQIAEGTPVCDWKKSPI POLN_RHDV 1165 27
GPFSQLPHMQIGSGSPV POLN_MANCV 1039 24
2CENDOPTASE3 Length of motif = 12 Motif number = 3
2C endopeptidase calicivirus protease motif III - 1
PCODE ST INT
THPGDCGLPYID POLN_FCVF9 1188 26
THPGDCGLPYID POLN_FCVC6 1190 26
TTHGDCGLPLYD POLN_RHDV 1207 25
TKKGDCGLPYFN POLN_MANCV 1092 36
2CENDOPTASE4 Length of motif = 11 Motif number = 4
2C endopeptidase calicivirus protease motif IV - 1
PCODE ST INT
DNGRVTGLHTG POLN_FCVF9 1200 0
DNGRVTGLHTG POLN_FCVC6 1202 0
SSGKIVAIHTG POLN_RHDV 1219 0
SNRQLVALHAG POLN_MANCV 1104 0
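In the tables above, the ST and INT columns appear to be related as follows:
for motif 1, INT equals ST, and for each later motif INT is the number of
residues between the end of the previous motif and the start of the current
one. The Python sketch below checks that reading against the four seed
sequences. It is an inference from the listed numbers, not a statement of
the PRINTS file specification, and intervals is a hypothetical helper.

```python
# Motif lengths as given in the motif headers (motifs 1 to 4).
MOTIF_LENGTHS = [18, 17, 12, 11]

# Start positions (ST) and intervals (INT) copied from INITIAL MOTIF SETS.
SEED = {
    "POLN_FCVF9": {"st": [1095, 1145, 1188, 1200], "int": [1095, 32, 26, 0]},
    "POLN_FCVC6": {"st": [1097, 1147, 1190, 1202], "int": [1097, 32, 26, 0]},
    "POLN_RHDV": {"st": [1120, 1165, 1207, 1219], "int": [1120, 27, 25, 0]},
    "POLN_MANCV": {"st": [997, 1039, 1092, 1104], "int": [997, 24, 36, 0]},
}


def intervals(starts, lengths):
    """Recompute INT from ST: gap between consecutive motifs.

    Assumed rule (hypothetical): INT of motif 1 is its ST; for motif i > 1,
    INT = ST_i - (ST_(i-1) + length_(i-1)).
    """
    result = [starts[0]]
    for i in range(1, len(starts)):
        previous_end = starts[i - 1] + lengths[i - 1]
        result.append(starts[i] - previous_end)
    return result


for code, columns in SEED.items():
    recomputed = intervals(columns["st"], MOTIF_LENGTHS)
    print(code, recomputed, recomputed == columns["int"])
```

Under this reading, an INT of 0 for motif 4 means that it begins immediately
after motif 3 ends.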
FINAL MOTIF SETS
2CENDOPTASE1 Length of motif = 18 Motif number = 1
2C endopeptidase calicivirus protease motif I - 2
PCODE ST INT
GWMIHIGNGLYISNTHTA POLN_RHDV 1120 1120
GWMIHIGNGLYISNTHTA Q86117 1120 1120
GWMIHIGNGLYISNTHTA Q86119 1120 1120
GWMIHIGNGLYISNTHTA Q89273 1120 1120
GRMIHIGNGLYISNTHTA Q86114 1120 1120
GYCIHMGHGVYASVAHVV POLN_FCVF9 1095 1095
GWMIHIGNGMYLSNTHTA Q96725 1113 1113
GYCVHMGHGVYASVAHVV Q66913 1095 1095
GYCVHMGHGVYASVAHVV POLN_FCVC6 1097 1097
GYCVHMGHGVYATVAHVA Q66914 1095 1095
GYAIHIGHGVYISLKHVV O92368 1208 1208
GYGVHIGNGNVITVTHVA POLN_MANCV 997 997
2CENDOPTASE2 Length of motif = 17 Motif number = 2
2C endopeptidase calicivirus protease motif II - 2
PCODE ST INT
AQIAEGTPVCDWKKSPI POLN_RHDV 1165 27
AQIAEGTPVCDWKKSPI Q86117 1165 27
AQIAEGTPVCDWKKSPI Q86119 1165 27
AQIAEGTPVCDWKKSPI Q89273 1165 27
AQIAEGTPVCDWKKSPI Q86114 1165 27
APFFSGKPTRDPWGSPV POLN_FCVF9 1145 32
AQIAEGTPVRDWKRASI Q96725 1158 27
APFFSGKPTRDPWGSPV Q66913 1145 32
APFFSGRPTRDPWGSPV POLN_FCVC6 1147 32
APFFPGKPTRDPWGSPV Q66914 1145 32
VPVGTSKPIKDPWGNPV O92368 1258 32
GPFSQLPHMQIGSGSPV POLN_MANCV 1039 24
2CENDOPTASE3 Length of motif = 12 Motif number = 3
2C endopeptidase calicivirus protease motif III - 2
PCODE ST INT
TTHGDCGLPLYD POLN_RHDV 1207 25
TTHGDCGLPLYD Q86117 1207 25
TTHGDCGLPLYD Q86119 1207 25
TTHGDCGLPLYD Q89273 1207 25
TTHGDCGLPLYD Q86114 1207 25
THPGDCGLPYID POLN_FCVF9 1188 26
TTHGDCGLPLFD Q96725 1200 25
THPGDCGLPYID Q66913 1188 26
THPGDCGLPYID POLN_FCVC6 1190 26
THPGDCGLPYID Q66914 1188 26
TRQGDCGLPYVD O92368 1301 26
TKKGDCGLPYFN POLN_MANCV 1092 36
2CENDOPTASE4 Length of motif = 11 Motif number = 4
2C endopeptidase calicivirus protease motif IV - 2
PCODE ST INT
SSGKIVAIHTG POLN_RHDV 1219 0
SSGKIVAIHTG Q86117 1219 0
SSGKIVAIHTG Q86119 1219 0
SSGKIVAIHTG Q89273 1219 0
SSGKIVAIHTG Q86114 1219 0
DNGRVTGLHTG POLN_FCVF9 1200 0
EAGKVVAIHTG Q96725 1212 0
DNGRVTGLHTG Q66913 1200 0
DNGRVTGLHTG POLN_FCVC6 1202 0
DNGRVTGLHTG Q66914 1200 0
DHGVVVGLHAG O92368 1313 0
SNRQLVALHAG POLN_MANCV 1104 0