MAJORURINARY    Rodent urinary protein signature
 Type of fingerprint: COMPOUND with 7  elements
Links:
   PRINTS; PR00179 LIPOCALIN; PR00178 FATTYACIDBP
   INTERPRO; IPR002971
   PDB; 1MUP
   SCOP; 1MUP
   CATH; 1MUP

 Creation date 30-NOV-1999

   1. PERVAIZ, S. AND BREW, K.
   Homology of beta-lactoglobulin, serum retinol-binding protein and 
   protein HC.
   SCIENCE 228 335-337 (1985).

   2. FLOWER, D.R.
   The Lipocalin protein family: structure and function.
   BIOCHEM.J. 318 1-14 (1996).

   3. FLOWER, D.R., NORTH, A.C.T. AND ATTWOOD, T.K.
   Structural and sequence relationships in the lipocalins and related
   proteins.
   PROTEIN SCI. 2 753-761 (1993). 

   4. FLOWER, D.R.
   Multiple molecular recognition properties of the lipocalin protein family.
   J.MOL.REC. 8 185-195 (1995).

   5. BOCSKEI, Z., GROOM, C.R., FLOWER, D.R., WRIGHT, C.E., PHILLIPS, S.E.V., 
   CAVAGGIONI, A., FINDLAY, J.B.C. AND NORTH, A.C.T.
   Pheromone Binding to Two Rodent Urinary Proteins Revealed by X-Ray
   Crystallography.
   NATURE 360 186-190 (1992).

   The lipocalins are a diverse, interesting, yet poorly understood family of 
   proteins composed, in the main, of extracellular ligand-binding proteins
   displaying high specificity for small hydrophobic molecules [1,2]. Functions
   of these proteins include transport of nutrients, control of cell
   regulation, pheromone transport, cryptic colouration and the enzymatic
   synthesis of prostaglandins.
    
   The crystal structures of several lipocalins have been solved and show a 
   novel 8-stranded anti-parallel beta-barrel fold well conserved within the
   family. Sequence similarity within the family is at a much lower level and
   would seem to be restricted to conserved disulphides and 3 motifs, which
   form a juxtaposed cluster that may act as a common cell surface receptor
   site [2]. By contrast, at the more variable end of the fold are found an 
   internal ligand binding site and a putative surface for the formation of 
   macromolecular complexes [4]. The anti-parallel beta-barrel fold is also
   exploited by the fatty acid-binding proteins (which function similarly by
   binding small hydrophobic molecules), by avidin and the closely related
   metalloprotease inhibitors, and by triabin. Similarity at the sequence 
   level, however, is less obvious, being confined to a single short 
   N-terminal motif.
   
   The lipocalin family can be subdivided into kernel and outlier sets. The
   kernel lipocalins form the largest self-consistent group (see LIPOCALIN
   signature). The outlier lipocalins form several smaller distinct subgroups: 
   the OBPs, the von Ebner's gland proteins, alpha-1-acid glycoproteins, 
   tick histamine binding proteins and the nitrophorins.
  
   Rodent urinary proteins (mouse MUPs and rat alpha-2u globulins) are the  
   major protein components of rodent urine and transport pheromones [5].
   Rodent urine contains an unusually large amount of protein; this phenomenon 
   has been studied extensively in both rats and mice. The major site of MUP
   synthesis is the liver; the protein is secreted by the liver into serum, 
   where it circulates at relatively low levels before being rapidly filtered
   by the kidney and excreted. Expression of MUP mRNA is under different 
   developmental and hormonal control in different tissues. However, 
   constitutive expression of major urinary protein has been demonstrated in 
   the salivary and lachrymal glands.
  
   The sex-dependent expression of MUP (adult male mice secrete 5-20 times 
   as much MUP as do females) and its ability to bind a number of odorant 
   molecules are consistent with the suggestion that MUP acts as a pheromone
   transporter; the protein may be excreted into the urine carrying a bound
   pheromone, which is released as the urine dries and the protein denatures.
   This proposal is strongly supported by the work of Bacchini and colleagues,
   who have successfully purified MUP from mouse urine with bound ligands. They
   identified three components from the total ligand extracted from the 
   purified protein: the largest proportion (~70%) was 2-(s-butyl)thiazoline,
   with 2,3-dehydroexobrevicomin and 4-(ethyl)phenol comprising minor fractions
   of ~15% each. However, only ~40% of protein contained bound ligand. The
   first two of these compounds are known to have pheromone activity in male
   rat urine, eliciting many sexually related responses in female rats. A 
   recent report has shown that MUP, acting via the vomeronasal organ after 
   appropriate physical contact with male mouse urine in their environment, can
   accelerate the onset of puberty in female mice. Interestingly, this seems to
   be a function of the protein itself; MUP devoid of ligands, either by 
   extraction or competitive displacement, is still active, while an organic 
   extract containing these volatile ligands shows no activity. Moreover, a 
   peptide corresponding to the N-terminus of MUP is also active, suggesting 
   that MUP is not only a carrier of pheromones, but also a pheromone itself.
   The crystal structure of MUP has been solved [5] and shows the protein to
   be a member of the lipocalin family.
  
   Alpha-2u-globulin, a close homologue of MUP, accounts for 30-50% of total
   excreted protein in adult male rat urine. As its electrophoretic mobility
   is similar to that of serum alpha-2 globulin, it was named
   'alpha-2u-globulin',
   the subscript 'u' denoting its origin in urine. Alpha-2u-globulin is 
   secreted into the plasma by a number of tissues, where it circulates before
   filtration through the kidney; between 20 and 50% is reabsorbed by the
   proximal tubule of the nephron, the rest being excreted. Although the exact
   physiological role of alpha-2u-globulin is unclear, there is circumstantial
   evidence that it functions in pheromone transport. This is consistent with
   its observed binding properties, its close similarity with MUP and the known
   properties of male rat urine. Separately, acute exposure to many important
   industrial and environmental chemicals, including components of unleaded 
   petrol, causes a toxic syndrome, known as alpha-2u-globulin nephropathy, in
   the
   kidney of adult male rats. This syndrome is characterised by an excessive
   accumulation, in proximal-tubule epithelial cells, of lysosomal protein
   droplets composed of large amounts of alpha-2u-globulin, and the degeneration
   and necrosis of cells lining the proximal tubule. Chronic exposure leads to
   an escalating progression of symptoms, often resulting in kidney failure and
   death. 
   
   MAJORURINARY is a 7-element fingerprint that provides a signature for rodent
   urinary proteins. The fingerprint was derived from an initial alignment of
   4 sequences: the motifs were drawn from conserved regions spanning virtually
   the full alignment length - motif 1 covers the N-terminal peptide and 3(10)
   helix; motif 2, which includes the region encoded by PROSITE pattern
   LIPOCALIN (PS00213) and corresponds to the first LIPOCALIN fingerprint motif,
   spans the first beta-strand; motif 3 spans the distal region of the large
   loop L1 and strands 2 and 3; motif 4 covers strand 4 and the anterior region
   of strand 5; motif 5, which spans the C-terminus of strand 6 and strand 7,
   corresponds to the second motif of the LIPOCALIN fingerprint; motif 6, which
   spans strand 8 and the N-terminus of the main C-terminal alpha-helix, is 
   similar to the third LIPOCALIN motif; and motif 7 spans the C-terminal
   peptide and includes the short beta-strand 9. Two iterations on SPTR37_10f
   were required to reach convergence, at which point a true set comprising 11
   sequences was identified. A single partial match was also found, horse 
   allergen 1 protein, which matches motifs 2, 4, 6 and 7.

  SUMMARY INFORMATION
     11 codes involving  7 elements
      0 codes involving  6 elements
      0 codes involving  5 elements
      1 codes involving  4 elements
      0 codes involving  3 elements
      0 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    7|  11   11   11   11   11   11   11  
    6|   0    0    0    0    0    0    0  
    5|   0    0    0    0    0    0    0  
    4|   0    1    0    1    0    1    1  
    3|   0    0    0    0    0    0    0  
    2|   0    0    0    0    0    0    0  
   --+------------------------------------
     |   1    2    3    4    5    6    7  
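
The summary counts and the composite fingerprint index can be derived from a record of which motifs each sequence matches. The following is a minimal Python sketch, not the PRINTS software itself: the `matches` dictionary and the function names are illustrative assumptions, and the data are the 11 true positives (all 7 motifs) and the partial match ALL1_HORSE (motifs 2, 4, 6 and 7) reported in this entry.

```python
from collections import Counter

N_MOTIFS = 7

# Motifs matched by each sequence, as reported in this entry: the 11 true
# positives match all seven motifs; ALL1_HORSE matches motifs 2, 4, 6 and 7.
TRUE_POSITIVES = [
    "MUP1_MOUSE", "MUP6_MOUSE", "MUP2_MOUSE", "Q61921", "MUP5_MOUSE",
    "MUP_RAT", "Q63024", "Q63025", "MUP4_MOUSE", "Q63213", "MUPM_MOUSE",
]
matches = {code: set(range(1, N_MOTIFS + 1)) for code in TRUE_POSITIVES}
matches["ALL1_HORSE"] = {2, 4, 6, 7}


def summary(matches):
    """Count sequences by the number of motifs (elements) they match."""
    return Counter(len(motifs) for motifs in matches.values())


def composite_index(matches, n_motifs=N_MOTIFS):
    """Row k, column j: sequences matching exactly k motifs, motif j included."""
    index = {k: [0] * n_motifs for k in range(2, n_motifs + 1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in index:
            for j in motifs:
                index[k][j - 1] += 1
    return index


if __name__ == "__main__":
    counts = summary(matches)
    for k in range(N_MOTIFS, 1, -1):
        print(f"{counts.get(k, 0):3d} codes involving {k:2d} elements")
    for k, row in sorted(composite_index(matches).items(), reverse=True):
        print(f"{k}| " + " ".join(f"{n:4d}" for n in row))
```

Each row of the index is a per-motif tally restricted to sequences with the same total number of matched motifs, which is why the 4-element row has entries only under motifs 2, 4, 6 and 7.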

True positives..
 MUP1_MOUSE     MUP6_MOUSE     MUP2_MOUSE     Q61921         
 MUP5_MOUSE     MUP_RAT        Q63024         Q63025         
 MUP4_MOUSE     Q63213         MUPM_MOUSE     
Subfamily:  Codes involving 4 elements
 Subfamily True positives..
 ALL1_HORSE     


  PROTEIN TITLES
   MUP1_MOUSE       MAJOR URINARY PROTEIN 1 PRECURSOR (MUP 1) - MUS MUSCULUS (MO
   MUP6_MOUSE       MAJOR URINARY PROTEIN 6 PRECURSOR (MUP 6) (ALPHA-2U-GLOBULIN
   MUP2_MOUSE       MAJOR URINARY PROTEIN 2 PRECURSOR (MUP 2) - MUS MUSCULUS (MO
   Q61921           MAJOR URINARY PROTEIN - MUS MUSCULUS (MOUSE).
   MUP5_MOUSE       MAJOR URINARY PROTEIN 5 PRECURSOR (MUP 5) - MUS MUSCULUS (MO
   MUP_RAT          MAJOR URINARY PROTEIN PRECURSOR (MUP) (ALPHA-2U-GLOBULIN) (1
   Q63024           RAT ALPHA-2U-GLOBULIN (L TYPE) - RATTUS NORVEGICUS (RAT).
   Q63025           RAT ALPHA-2U-GLOBULIN (S TYPE) - RATTUS NORVEGICUS (RAT).
   MUP4_MOUSE       MAJOR URINARY PROTEIN 4 PRECURSOR (MUP 4) - MUS MUSCULUS (MO
   Q63213           ALPHA-2U GLOBULIN (RAT SALIVARY GLAND (ALPHA)2(MU) GLOBULIN,
   MUPM_MOUSE       MINOR MAJOR URINARY PROTEIN 15 PRECURSOR (NON-GROUP 1/GROUP 
 
   ALL1_HORSE       MAJOR ALLERGEN EQU C 1 PRECURSOR - EQUUS CABALLUS (HORSE).

  SCAN HISTORY
   SPTR37_10f     2    15    NSINGLE

  INITIAL MOTIF SETS

 MAJORURINARY1  Length of motif = 15  Motif number = 1
 Rodent urinary protein motif I - 1
                           PCODE         ST   INT
   HAEEASFERGNLDVD         Q63213        18    18
   HAEESSSMERNFNVE         MUPM_MOUSE    21    21
   HAEEASSTGRNFNVE         Q61921        15    15
   HAEEATSKGQNLNVE         MUP4_MOUSE    15    15

 MAJORURINARY2  Length of motif = 19  Motif number = 2
 Rodent urinary protein motif II - 1
                           PCODE         ST   INT
   NGDWFSIVVASDKREKIEE     Q63213        35     2
   SGYWFSIAEASYEREKIEE     MUPM_MOUSE    38     2
   NGEWHTIILAFDKREKIED     Q61921        32     2
   NGEWFSILLASDKREKIEE     MUP4_MOUSE    32     2

 MAJORURINARY3  Length of motif = 22  Motif number = 3
 Rodent urinary protein motif III - 1
                           PCODE         ST   INT
   IDVLENSLGFTFRIKENGVCTE  Q63213        64    10
   ITVLENSLVFKFHLIVNEECTE  MUPM_MOUSE    67    10
   IHVLENSLVLKFHTVRDEECSE  Q61921        61    10
   IHVLENSLAFKFHTVIDGECSE  MUP4_MOUSE    61    10

 MAJORURINARY4  Length of motif = 13  Motif number = 4
 Rodent urinary protein motif IV - 1
                           PCODE         ST   INT
   GEYFVEYDGENTF           Q63213        97    11
   GIYYMNYDGFNTF           MUPM_MOUSE   100    11
   GEYSVTYDGFNTF           Q61921        94    11
   GEYSVMYDGFNTF           MUP4_MOUSE    94    11

 MAJORURINARY5  Length of motif = 16  Motif number = 5
 Rodent urinary protein motif V - 1
                           PCODE         ST   INT
   ILKTDYDNYVMFHLVN        Q63213       111     1
   ILKTDYDNYIMIHLIN        MUPM_MOUSE   114     1
   IPKTDYDNFLMAHLIN        Q61921       108     1
   ILKTDYDNYIMFHLIN        MUP4_MOUSE   108     1

 MAJORURINARY6  Length of motif = 22  Motif number = 6
 Rodent urinary protein motif VI - 1
                           PCODE         ST   INT
   TFQLMELYGRTKDLSSDIKEKF  Q63213       132     5
   TFQLMELYGREPDLSLDIKEKF  MUPM_MOUSE   135     5
   TFQLMGLYGREPDLSSDIKERF  Q61921       129     5
   TFQLMELYGRKADLNSDIKEKF  MUP4_MOUSE   129     5

 MAJORURINARY7  Length of motif = 18  Motif number = 7
 Rodent urinary protein motif VII - 1
                           PCODE         ST   INT
   HGITRDNIIDLTKTDRCL      Q63213       160     6
   HGIIRENIIDLTNVNRCL      MUPM_MOUSE   163     6
   HGILRENIIDLSNANRCL      Q61921       157     6
   HGIIKENIIDLTKTNRCL      MUP4_MOUSE   157     6

  FINAL MOTIF SETS

 MAJORURINARY1  Length of motif = 15  Motif number = 1
 Rodent urinary protein motif I - 2
                           PCODE         ST   INT
   HAEEASSTGRNFNVE         MUP1_MOUSE    17    17
   HAEEASSTGRNFNVE         MUP6_MOUSE    17    17
   HAEEASSTGRNFNVE         MUP2_MOUSE    17    17
   HAEEASSTGRNFNVE         Q61921        15    15
   HAEEASSERQNFNVE         MUP5_MOUSE    17    17
   HAEEASSTRGNLDVA         MUP_RAT       18    18
   HAEEASSTRGNLDVD         Q63024        18    18
   HAEEASSTRGNLDVD         Q63025        18    18
   HAEEATSKGQNLNVE         MUP4_MOUSE    15    15
   HAEEASFERGNLDVD         Q63213        18    18
   HAEESSSMERNFNVE         MUPM_MOUSE    21    21

 MAJORURINARY2  Length of motif = 19  Motif number = 2
 Rodent urinary protein motif II - 2
                           PCODE         ST   INT
   NGEWHTIILASDKREKIED     MUP1_MOUSE    34     2
   NGEWHTIILASDKREKIED     MUP6_MOUSE    34     2
   NGEWHTIILASDKREKIED     MUP2_MOUSE    34     2
   NGEWHTIILAFDKREKIED     Q61921        32     2
   NGKWFSILLASDKREKIEE     MUP5_MOUSE    34     2
   NGDWFSIVVASNKREKIEE     MUP_RAT       35     2
   NGDWFSIVVASDKREKIEE     Q63024        35     2
   NGDWFSIVVASDKREKIEE     Q63025        35     2
   NGEWFSILLASDKREKIEE     MUP4_MOUSE    32     2
   NGDWFSIVVASDKREKIEE     Q63213        35     2
   SGYWFSIAEASYEREKIEE     MUPM_MOUSE    38     2

 MAJORURINARY3  Length of motif = 22  Motif number = 3
 Rodent urinary protein motif III - 2
                           PCODE         ST   INT
   IHVLENSLVLKFHTVRDEECSE  MUP1_MOUSE    63    10
   IHVLENSLVLKFHTVRDEECSE  MUP6_MOUSE    63    10
   IHVLEKSLVLKFHTVRDEECSE  MUP2_MOUSE    63    10
   IHVLENSLVLKFHTVRDEECSE  Q61921        61    10
   IDVLENSLAFKFHTVIDEECTE  MUP5_MOUSE    63    10
   IDVLENSLGFKFRIKENGECRE  MUP_RAT       64    10
   IDVLENSLGFKFRIKENGECRE  Q63024        64    10
   IDVLENSLGFKFRIKENGECRE  Q63025        64    10
   IHVLENSLAFKFHTVIDGECSE  MUP4_MOUSE    61    10
   IDVLENSLGFTFRIKENGVCTE  Q63213        64    10
   ITVLENSLVFKFHLIVNEECTE  MUPM_MOUSE    67    10

 MAJORURINARY4  Length of motif = 13  Motif number = 4
 Rodent urinary protein motif IV - 2
                           PCODE         ST   INT
   GEYSVTYDGFNTF           MUP1_MOUSE    96    11
   GEYSVTYDGFNTF           MUP6_MOUSE    96    11
   GEYSVTYDGFNTF           MUP2_MOUSE    96    11
   GEYSVTYDGFNTF           Q61921        94    11
   GEYSVTYDGFNTF           MUP5_MOUSE    96    11
   GEYFVEYDGGNTF           MUP_RAT       97    11
   GEYFVEYDGGNTF           Q63024        97    11
   GEYFVEYDGGNTF           Q63025        97    11
   GEYSVMYDGFNTF           MUP4_MOUSE    94    11
   GEYFVEYDGENTF           Q63213        97    11
   GIYYMNYDGFNTF           MUPM_MOUSE   100    11

 MAJORURINARY5  Length of motif = 16  Motif number = 5
 Rodent urinary protein motif V - 2
                           PCODE         ST   INT
   IPKTDYDNFLMAHLIN        MUP1_MOUSE   110     1
   IPKTDYDNFLMAHLIN        MUP6_MOUSE   110     1
   IPKTDYDNFLMAHLIN        MUP2_MOUSE   110     1
   IPKTDYDNFLMAHLIN        Q61921       108     1
   ILKTDYDNYIMFHLIN        MUP5_MOUSE   110     1
   ILKTDYDRYVMFHLIN        MUP_RAT      111     1
   ILKTDYDRYVMFHLIN        Q63024       111     1
   ILKTDYDRYVMFHLIN        Q63025       111     1
   ILKTDYDNYIMFHLIN        MUP4_MOUSE   108     1
   ILKTDYDNYVMFHLVN        Q63213       111     1
   ILKTDYDNYIMIHLIN        MUPM_MOUSE   114     1

 MAJORURINARY6  Length of motif = 22  Motif number = 6
 Rodent urinary protein motif VI - 2
                           PCODE         ST   INT
   TFQLMGLYGREPDLSSDIKERF  MUP1_MOUSE   131     5
   TFQLMGLYGREPDLMSDIKERF  MUP6_MOUSE   131     5
   TFQLMGLYGREPDLSSDIKERF  MUP2_MOUSE   131     5
   TFQLMGLYGREPDLSSDIKERF  Q61921       129     5
   NFQLMELFGREPDLSSDIKEKF  MUP5_MOUSE   131     5
   TFQLMVLYGRTKDLSSDIKEKF  MUP_RAT      132     5
   TFQAMVLYGRTKDLSSDIKEKF  Q63024       132     5
   TFQAMVLYGRTKDLSSDIKEKF  Q63025       132     5
   TFQLMELYGRKADLNSDIKEKF  MUP4_MOUSE   129     5
   TFQLMELYGRTKDLSSDIKEKF  Q63213       132     5
   TFQLMELYGREPDLSLDIKEKF  MUPM_MOUSE   135     5

 MAJORURINARY7  Length of motif = 18  Motif number = 7
 Rodent urinary protein motif VII - 2
                           PCODE         ST   INT
   HGILRENIIDLSNANRCL      MUP1_MOUSE   159     6
   HGILRENIIDLSNANRCL      MUP6_MOUSE   159     6
   HGILRENIIDLSNANRCL      MUP2_MOUSE   159     6
   HGILRENIIDLSNANRCL      Q61921       157     6
   HGIVRENIIDLSNANRCL      MUP5_MOUSE   159     6
   HGITRDNIIDLTKTDRCL      MUP_RAT      160     6
   HGITRDNIIDLTKTDHCL      Q63024       160     6
   HGITRDNIIDLTKTDHCL      Q63025       160     6
   HGIIKENIIDLTKTNRCL      MUP4_MOUSE   157     6
   HGITRDNIIDLTKTDRCL      Q63213       160     6
   HGIIRENIIDLTNVNRCL      MUPM_MOUSE   163     6
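
The ST and INT columns of the motif sets can be cross-checked against the motif lengths. The sketch below assumes that ST is the start position of a motif within the sequence and that INT is the number of residues between the end of the previous motif and the start of the current one (with INT equal to ST for motif 1). This reading is inferred from the numbers in the entry rather than stated in it, and the function and variable names are illustrative.

```python
# Motif lengths for motifs 1..7, as given in the motif set headers.
MOTIF_LENGTHS = [15, 19, 22, 13, 16, 22, 18]

# (ST, INT) pairs for motifs 1..7, copied from the final motif sets
# for two of the true positives.
ROWS = {
    "MUP1_MOUSE": [(17, 17), (34, 2), (63, 10), (96, 11),
                   (110, 1), (131, 5), (159, 6)],
    "Q63213": [(18, 18), (35, 2), (64, 10), (97, 11),
               (111, 1), (132, 5), (160, 6)],
}


def expected_intervals(starts, lengths=MOTIF_LENGTHS):
    """Recompute INT from ST and motif lengths under the assumed definition."""
    intervals = [starts[0]]  # assumption: the first interval equals the first start
    for i in range(1, len(starts)):
        previous_end = starts[i - 1] + lengths[i - 1]
        intervals.append(starts[i] - previous_end)
    return intervals


def is_consistent(pairs, lengths=MOTIF_LENGTHS):
    """True if the listed INT values agree with those recomputed from ST."""
    starts = [st for st, _ in pairs]
    listed = [interval for _, interval in pairs]
    return expected_intervals(starts, lengths) == listed


if __name__ == "__main__":
    for code, pairs in ROWS.items():
        status = "consistent" if is_consistent(pairs) else "inconsistent"
        print(f"{code}: {status}")
```

Because INT depends only on differences between positions, the check does not depend on whether ST is counted from 0 or from 1.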
