SRSVCYSPTASE     Small round structured virus (C37) cysteine protease family signature
 Type of fingerprint: COMPOUND with 9  elements
Links:
   PRINTS; PR00705 PAPAIN; PR00704 CALPAIN; PR00966 NIAPOTYPTASE
   PRINTS; PR00703 ADVENDOPTASE; PR00797 STREPTOPAIN; PR00707 UBCTHYDRLASE
   PRINTS; PR00776 HEMOGLOBNASE; PR00706 PYROGLUPTASE; PR00864 PREPILPTASE
   PRINTS; PR00916 2CENDOPTASE
   INTERPRO; IPR001665

 Creation date 29-JUN-1998; UPDATE 07-JUN-1999

   1. RAWLINGS, N.D. AND BARRETT, A.J.
   Families of cysteine peptidases.
   METHODS ENZYMOL. 244 461-486 (1994).

   2. BARRETT, A.J. AND RAWLINGS, N.D.
   Families and clans of cysteine peptidases.
   PERSPECTIVES DRUG DISCOVERY DESIGN 6 1-11 (1996).

   3. RAWLINGS, N.D. AND BARRETT, A.J.
   Family C37 - Clan PA - Processing peptidase
   http://www.bi.bbsrc.ac.uk/merops/famcards/c37.htm

   4. FEDERHEN, S., HOTTON, C., LEIPE, D. AND SOUSSOV, V.
   Calicivirus - NCBI Taxonomy Browser
   http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=11975&lvl=3

   5. LIU, B., CLARKE, I.N. AND LAMBDEN, P.R.
   Polyprotein processing in Southampton Virus: identification of 3C-like
   protease cleavage sites by in vitro mutagenesis.
   J.VIROL. 70(4) 2605-2610 (1996).

   6. KOONIN, E.V. AND GORBALENYA, A.E.
   An insect picornavirus may have genome organization similar to that of 
   caliciviruses.
   FEBS LETT. 297 81-86 (1992).

   Cysteine protease activity is dependent on an active dyad of cysteine and
   histidine, the order and spacing of these residues varying in the known 
   families. Nearly half of all cysteine proteases are found exclusively
   in viruses [1]. Cysteine protease families have been grouped into five 
   clans (designated CA, CB, CC, CD and CE) on the basis of structural and
   functional similarity. Families C1, C2 and C10, which belong to the CA clan,
   have a Cys/His catalytic dyad, and are loosely termed papain-like. Families
   in the CB clan have a His/Cys dyad, and contain enzymes, from RNA viruses,
   that are distantly related to chymotrypsin. Enzymes in clan CC are also
   from RNA viruses, but have a papain-like Cys/His active site. The remaining
   two clans, CD and CE, contain only one family each [2]. Some families have
   not yet been assigned to a clan.
  
   Two additional clans, PA and PB, have been identified; these contain a
   mixture of serine, cysteine and threonine proteases. Proteases in clan PA
   have a catalytically active serine or cysteine nucleophile as part of the
   ordered triad His, Asp, Ser (or Cys). Those in clan PB have a serine,
   cysteine or threonine active residue at the N-terminus of the mature
   protease [3].
  
   Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis
   [4]. The calicivirus genome contains two open reading frames, ORF1 and ORF2.
   ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine
   protease and RNA polymerase activity [5]. The regions of the polyprotein in
   which these activities lie are similar to proteins produced by the
   picornaviruses [6]. ORF2 encodes a structural capsid protein. Two different
   families of caliciviruses can be distinguished on the basis of sequence
   similarity, namely the Norwalk-like viruses or small round structured 
   viruses (SRSVs), and those classed as non-SRSVs. 
  
   Calicivirus proteases from the SRSV group, which are members of the PA
   protease clan, constitute family C37 of the cysteine proteases (proteases 
   from non-SRSVs belong to the C24 family). As mentioned above, the protease 
   activity resides within a polyprotein. The enzyme cleaves the polyprotein 
   at sites N-terminal to itself, liberating the polyprotein helicase. 
  
   SRSVCYSPTASE is a 9-element fingerprint that provides a signature for the 
   cysteine protease (C37) of small round structured caliciviruses. The 
   fingerprint was derived from an initial alignment of 2 sequences: the motifs
   were drawn from conserved regions spanning the full length of the
   polyprotein protease, focusing on those regions that characterise members
   of the C37 family but distinguish them from the C24 proteases. Motif 3
   encodes the active site His residue, and motif 4 includes the catalytic Asp
   (the Cys residue is located between motifs 7 and 8). Two iterations on
   OWL30.1 were required to reach convergence, at which point a true set
   comprising 4 sequences was identified.
  
   An update on SPTR37_9f identified a true set of 3 sequences.
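
   The description above places the active site His in motif 3 and the
   catalytic Asp in motif 4. As a rough illustration, the sketch below lists
   the columns of those two motifs in which all three final-set sequences
   carry the named residue. The helper conserved_columns is a hypothetical
   name introduced here, not part of any PRINTS tool, and an invariant column
   is only a candidate: conservation alone does not show that a residue is
   catalytic.

```python
# Motif 3 and motif 4 rows copied from the final motif sets of this entry
# (row order: POLN_SOUV3, Q83883, POLN_LORDV).
MOTIF3 = [
    "ITTTHVIPTSAKEFFGEPLTSI",
    "ITTTHVVPTGVKEFFGEPLSSI",
    "ITSTHVIPQGAQEFFGVPVKQI",
]
MOTIF4 = [
    "IHRAGEFTLFRFSKKIRPDLTGM",
    "IHQAGEFTQFRFSKKMRPDLTGM",
    "IHKSGEFCRLRFPKPIRTDVTGM",
]


def conserved_columns(rows, residue):
    """Return the 1-based columns in which every row carries `residue`.

    Hypothetical helper: assumes the rows are ungapped and of equal length,
    as the motif rows of a fingerprint are.
    """
    return [i + 1 for i, column in enumerate(zip(*rows))
            if set(column) == {residue}]


print(conserved_columns(MOTIF3, "H"))  # invariant His columns in motif 3
print(conserved_columns(MOTIF4, "D"))  # invariant Asp columns in motif 4
```

   With the rows above, each call reports a single invariant column, which is
   consistent with one conserved His in motif 3 and one conserved Asp in
   motif 4.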

  SUMMARY INFORMATION
      3 codes involving  9 elements
      0 codes involving  8 elements
      0 codes involving  7 elements
      0 codes involving  6 elements
      0 codes involving  5 elements
      0 codes involving  4 elements
      0 codes involving  3 elements
      0 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    9|   3    3    3    3    3    3    3    3    3  
    8|   0    0    0    0    0    0    0    0    0  
    7|   0    0    0    0    0    0    0    0    0  
    6|   0    0    0    0    0    0    0    0    0  
    5|   0    0    0    0    0    0    0    0    0  
    4|   0    0    0    0    0    0    0    0    0  
    3|   0    0    0    0    0    0    0    0    0  
    2|   0    0    0    0    0    0    0    0    0  
   --+----------------------------------------------
     |   1    2    3    4    5    6    7    8    9  
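
   The two tables above can be read as simple tallies. The sketch below
   assumes that each row n of the composite index counts, per motif, the
   sequences that match exactly n elements, and that the summary counts the
   sequences per n. That reading and the names summary and composite_index
   are assumptions made for illustration; the per-sequence match sets are
   taken from the final motif sets, in which all three sequences appear in
   all nine motifs.

```python
from collections import Counter

N_MOTIFS = 9

# Assumed input: for each true-positive code, the motif numbers it matches.
matches = {
    "POLN_SOUV3": set(range(1, N_MOTIFS + 1)),
    "Q83883": set(range(1, N_MOTIFS + 1)),
    "POLN_LORDV": set(range(1, N_MOTIFS + 1)),
}


def summary(matches):
    """Number of codes matching exactly n elements, for n = 9 down to 2."""
    counts = Counter(len(hit) for hit in matches.values())
    return {n: counts.get(n, 0) for n in range(N_MOTIFS, 1, -1)}


def composite_index(matches):
    """Row n, column m: codes matching exactly n elements that include motif m."""
    table = {n: [0] * N_MOTIFS for n in range(N_MOTIFS, 1, -1)}
    for hit in matches.values():
        n = len(hit)
        if n in table:
            for motif in hit:
                table[n][motif - 1] += 1
    return table


for n, row in composite_index(matches).items():
    print(f"{n:5d}|" + "".join(f"{value:5d}" for value in row))
```

   Under these assumptions the 9-element row holds 3 in every column and all
   other rows are zero, as in the index above.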

True positives..
 POLN_SOUV3     Q83883         POLN_LORDV     


  PROTEIN TITLES
   POLN_SOUV3       NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM
   Q83883           NONSTRUCTURAL POLYPROTEIN - NORWALK VIRUS.
   POLN_LORDV       NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYM

  SCAN HISTORY
   OWL30_2      1   20   NSINGLE
   SPTR37_9f    2   4    NSINGLE

  INITIAL MOTIF SETS

SRSVCYSPTASE1  Length of motif = 20  Motif number = 1
Small round structured virus protease motif I - 1
                            PCODE          ST   INT
   WADDDRSVDYNEKLDFEAPP     POLN_LORDV    992   992
   WADDEREVDYNEKISFEAPP     POLN_SOUV3   1083  1083

SRSVCYSPTASE2  Length of motif = 20  Motif number = 2
Small round structured virus protease motif II - 1
                            PCODE          ST   INT
   WSRIVNFGSGWGFWVSPSLF     POLN_LORDV   1014     2
   WSRVTKFGSGWGFWVSPTVF     POLN_SOUV3   1105     2

SRSVCYSPTASE3  Length of motif = 22  Motif number = 3
Small round structured virus protease motif III - 1
                            PCODE          ST   INT
   ITSTHVIPQGAQEFFGVPVKQI   POLN_LORDV   1034     0
   ITTTHVIPTSAKEFFGEPLTSI   POLN_SOUV3   1125     0

SRSVCYSPTASE4  Length of motif = 23  Motif number = 4
Small round structured virus protease motif IV - 1
                            PCODE          ST   INT
   IHKSGEFCRLRFPKPIRTDVTGM  POLN_LORDV   1057     1
   IHRAGEFTLFRFSKKIRPDLTGM  POLN_SOUV3   1148     1

SRSVCYSPTASE5  Length of motif = 22  Motif number = 5
Small round structured virus protease motif V - 1
                            PCODE          ST   INT
   LEEGAPEGTVVTLLIKRSTGEL   POLN_LORDV   1081     1
   LEEGCPEGTVCSVLIKRDSGEL   POLN_SOUV3   1172     1

SRSVCYSPTASE6  Length of motif = 19  Motif number = 6
Small round structured virus protease motif VI - 1
                            PCODE          ST   INT
   PLAARMGTHATMKIQGRTV      POLN_LORDV   1104     1
   PLAVRMGAIASMRIQGRLV      POLN_SOUV3   1195     1

SRSVCYSPTASE7  Length of motif = 18  Motif number = 7
Small round structured virus protease motif VII - 1
                            PCODE          ST   INT
   GQMGMLLTGSNAKSMDLG       POLN_LORDV   1124     1
   GQSGMLLTGANAKGMDLG       POLN_SOUV3   1215     1

SRSVCYSPTASE8  Length of motif = 20  Motif number = 8
Small round structured virus protease motif VIII - 1
                            PCODE          ST   INT
   YKRENDYVVIGVHTAAARGG     POLN_LORDV   1153    11
   YKRANDWVVCGVHAAATKSG     POLN_SOUV3   1244    11

SRSVCYSPTASE9  Length of motif = 20  Motif number = 9
Small round structured virus protease motif IX - 1
                            PCODE          ST   INT
   NTVICATQGSEGEATLEGGD     POLN_LORDV   1173     0
   NTVVCAVQASEGETTLEGGD     POLN_SOUV3   1264     0

  FINAL MOTIF SETS

SRSVCYSPTASE1  Length of motif = 20  Motif number = 1
Small round structured virus protease motif I - 2
                            PCODE          ST   INT
   WADDEREVDYNEKISFEAPP     POLN_SOUV3   1083  1083
   WADDDREVDYNEKINFEAPP     Q83883       1084  1084
   WADDDRSVDYNEKLDFEAPP     POLN_LORDV    992   992

SRSVCYSPTASE2  Length of motif = 20  Motif number = 2
Small round structured virus protease motif II - 2
                            PCODE          ST   INT
   WSRVTKFGSGWGFWVSPTVF     POLN_SOUV3   1105     2
   WSRVTKFGSGWGFWVSPTVF     Q83883       1106     2
   WSRIVNFGSGWGFWVSPSLF     POLN_LORDV   1014     2

SRSVCYSPTASE3  Length of motif = 22  Motif number = 3
Small round structured virus protease motif III - 2
                            PCODE          ST   INT
   ITTTHVIPTSAKEFFGEPLTSI   POLN_SOUV3   1125     0
   ITTTHVVPTGVKEFFGEPLSSI   Q83883       1126     0
   ITSTHVIPQGAQEFFGVPVKQI   POLN_LORDV   1034     0

SRSVCYSPTASE4  Length of motif = 23  Motif number = 4
Small round structured virus protease motif IV - 2
                            PCODE          ST   INT
   IHRAGEFTLFRFSKKIRPDLTGM  POLN_SOUV3   1148     1
   IHQAGEFTQFRFSKKMRPDLTGM  Q83883       1149     1
   IHKSGEFCRLRFPKPIRTDVTGM  POLN_LORDV   1057     1

SRSVCYSPTASE5  Length of motif = 22  Motif number = 5
Small round structured virus protease motif V - 2
                            PCODE          ST   INT
   LEEGCPEGTVCSVLIKRDSGEL   POLN_SOUV3   1172     1
   LEEGCPEGTVCSVLIKRDSGEL   Q83883       1173     1
   LEEGAPEGTVVTLLIKRSTGEL   POLN_LORDV   1081     1

SRSVCYSPTASE6  Length of motif = 19  Motif number = 6
Small round structured virus protease motif VI - 2
                            PCODE          ST   INT
   PLAVRMGAIASMRIQGRLV      POLN_SOUV3   1195     1
   PLAVRMGAIASMRIQGRLV      Q83883       1196     1
   PLAARMGTHATMKIQGRTV      POLN_LORDV   1104     1

SRSVCYSPTASE7  Length of motif = 18  Motif number = 7
Small round structured virus protease motif VII - 2
                            PCODE          ST   INT
   GQSGMLLTGANAKGMDLG       POLN_SOUV3   1215     1
   GQSGMLLTGANAKGMDLG       Q83883       1216     1
   GQMGMLLTGSNAKSMDLG       POLN_LORDV   1124     1

SRSVCYSPTASE8  Length of motif = 20  Motif number = 8
Small round structured virus protease motif VIII - 2
                            PCODE          ST   INT
   YKRANDWVVCGVHAAATKSG     POLN_SOUV3   1244    11
   HKRGNDWVVCGVHAAATKSG     Q83883       1245    11
   YKRENDYVVIGVHTAAARGG     POLN_LORDV   1153    11

SRSVCYSPTASE9  Length of motif = 20  Motif number = 9
Small round structured virus protease motif IX - 2
                            PCODE          ST   INT
   NTVVCAVQASEGETTLEGGD     POLN_SOUV3   1264     0
   NTVVCAVQAGEGETALEGGD     Q83883       1265     0
   NTVICATQGSEGEATLEGGD     POLN_LORDV   1173     0
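
   The ST and INT columns of the motif sets can be cross-checked
   arithmetically. The sketch below assumes that ST is the start position of
   a motif in the sequence and that INT, from motif 2 onwards, is the number
   of residues between the end of the previous motif and the start of the
   current one; this reading is an assumption that the check itself tests. The
   values are the POLN_SOUV3 rows of the final motif sets, and inferred_gaps
   is a hypothetical helper name.

```python
# (motif length, ST, INT) for motifs 1-9 of POLN_SOUV3, copied from the
# final motif sets. For motif 1 the entry lists INT equal to ST.
SOUV3 = [
    (20, 1083, 1083),
    (20, 1105, 2),
    (22, 1125, 0),
    (23, 1148, 1),
    (22, 1172, 1),
    (19, 1195, 1),
    (18, 1215, 1),
    (20, 1244, 11),
    (20, 1264, 0),
]


def inferred_gaps(rows):
    """Residues between the end of each motif and the start of the next."""
    return [
        next_start - (start + length)
        for (length, start, _), (_, next_start, _) in zip(rows, rows[1:])
    ]


gaps = inferred_gaps(SOUV3)
reported = [interval for _, _, interval in SOUV3[1:]]

print(gaps)               # gaps derived from ST and motif length
print(gaps == reported)   # do they agree with the listed INT values?
```

   Under this reading the derived gaps agree with the listed INT values for
   motifs 2 to 9, and the only gap longer than two residues is the 11-residue
   stretch between motifs 7 and 8, the region in which the description places
   the Cys residue.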
