HTHARSR          Bacterial regulatory protein ArsR family signature
 Type of fingerprint: COMPOUND with 4  elements
Links:
   PRINTS; PR00032 HTHARAC; PR00033 HTHASNC; PR00034 HTHCRP; PR00035 HTHGNTR
   PRINTS; PR00036 HTHLACI; PR00037 HTHLACR; PR00038 HTHLUXR; PR00039 HTHLYSR
   PRINTS; PR00598 HTHMARR; PR00040 HTHMERR; PR00455 HTHTETR; PR00030 HTHCRO
   PRINTS; PR00031 HTHREPRESSR
   INTERPRO; IPR001845
   PROSITE; PS00846 HTH_ARSR_FAMILY
   PFAM; PF01022 HTH_ARSR_family

 Creation date 24-AUG-1997; UPDATE 27-JUN-1999

   1. MORBY, A.P., TURNER, J.S., HUCKLE, J.W. AND ROBINSON, N.J.
   smtB is a metal-dependent repressor of the cyanobacterial metallothionein
   gene smtA - identification of a Zn-inhibited DNA-protein complex.
   NUCLEIC ACIDS RES. 21 921-925 (1993).

   2. BAIROCH, A.
   A possible mechanism for metal-ion induced DNA-protein dissociation in
   a family of prokaryotic transcriptional regulators.
   NUCLEIC ACIDS RES. 21 2515-2515 (1993).

   Bacterial transcription regulatory proteins that bind DNA via a helix-turn-
   helix (HTH) motif can be grouped into families on the basis of sequence 
   similarities. One such group, termed arsR, includes several proteins that
   appear to dissociate from DNA in the presence of metal ions: arsR, which
   functions as a transcriptional repressor of an arsenic resistance operon;
   smtB from Synechococcus, which acts as a transcriptional repressor of the
   smtA gene that codes for a metallothionein; cadC, a protein required for
   cadmium-resistance; and hypothetical protein yqcJ from Bacillus subtilis.
  
   The HTH motif is thought to be located in the central part of these
   proteins [1]. The motif is characterised by a number of well-conserved
   residues: at its N-terminal extremity is a cysteine residue; a second Cys
   is found in arsR and cadC, but not in smtB; and at the C-terminus lie one
   or two histidines. These residues may be involved in metal-binding (Zn in
   smtB; metal-oxyanions such as arsenite, antimonite and arsenate for arsR;
   and cadmium for cadC) [2]. It is believed that binding of a metal ion could
   induce a conformational change that would prevent the protein from binding
   DNA [2]. 
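
   The cysteine pattern described above can be checked against the motif II
   segments listed in this entry. The following is a minimal sketch, assuming
   that the two conserved cysteines correspond to positions 3 and 5 of
   motif II (an inference from the listed segments, not a statement from the
   references); the helper name cys_pair is hypothetical.

```python
# Motif II segments copied from the final motif set of this entry.
MOTIF_II = {
    "ARSR_ECOLI": "ELCVCDLCTALD",
    "CADC_STAAU": "ELCVCDIANILG",
    "SMTB_SYNP7": "ELCVGDLAQAIG",
}


def cys_pair(segment):
    """Report whether Cys occupies motif positions 3 and 5 (1-based).

    Assumption: these two positions are the conserved cysteines
    discussed in the text.
    """
    return segment[2] == "C", segment[4] == "C"


for code, segment in MOTIF_II.items():
    first, second = cys_pair(segment)
    print(f"{code:<12}{segment}  first Cys: {first}  second Cys: {second}")
```

   Under that assumption, arsR and cadC carry both cysteines, whereas smtB
   from Synechococcus (SMTB_SYNP7) has glycine at the second position.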
  
   HTHARSR is a 4-element fingerprint that provides a signature for the
   bacterial regulatory protein arsR family. The fingerprint was derived from
   an initial alignment of 13 sequences: the motifs were drawn from short
   conserved regions spanning the full alignment length - motifs 2 and 3
   span the region encoded by PROSITE pattern HTH_ARSR_FAMILY (PS00846), which
   includes the complete HTH motif. Two iterations on OWL29.4 were required
   to reach convergence, at which point a true set comprising 21 sequences
   was identified. Several partial matches were also found, all of which
   appear to be related DNA-binding proteins that contain an HTH motif.
  
   An update on SPTR37_9f identified a true set of 30 sequences, and 27
   partial matches.

  SUMMARY INFORMATION
     30 codes involving  4 elements
     13 codes involving  3 elements
     14 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    4|  30   30   30   30  
    3|  13    0   13   13  
    2|   9    2   10    7  
   --+---------------------
     |   1    2    3    4  
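
   The index can be cross-checked against the summary counts. This is a
   minimal sketch, assuming that each row of the index records, per motif,
   how many of the sequences matching that number of elements include that
   motif; the function name index_is_consistent is hypothetical.

```python
# Values copied from SUMMARY INFORMATION and COMPOSITE FINGERPRINT INDEX.
codes_per_level = {4: 30, 3: 13, 2: 14}
composite_index = {
    4: [30, 30, 30, 30],
    3: [13, 0, 13, 13],
    2: [9, 2, 10, 7],
}


def index_is_consistent(codes, index):
    """Check that every row sums to k * (number of codes matching k elements).

    Assumption: a sequence matching k elements is counted once in each of
    the k motif columns it matches.
    """
    return all(sum(index[k]) == k * codes[k] for k in codes)


print(index_is_consistent(codes_per_level, composite_index))

# Partial matches are the codes matching fewer than all 4 elements.
partial_matches = codes_per_level[3] + codes_per_level[2]
print(partial_matches)
```

   Read this way, the row for 3 elements shows that none of the 13
   three-element matches include motif 2.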

True positives..
 ARSR_ECOLI     ARR2_ECOLI     ARR1_ECOLI     CADC_LISMO     
 P74986         CADF_STAAU     O68020         O26985         
 P94887         O67394         CADC_STAAU     SMTB_SYNY3     
 P73808         O27823         SMTB_SYNP7     CADC_BACFI     
 P71941         Q58721         ARSR_STAXY     ARSR_STAAU     
 O52029         P96677         HLYU_VIBCH     YQCJ_BACSU     
 O69711         Q53040         O05840         P95774         
 O53838         O57801         
Subfamily:  Codes involving 3 elements
 Subfamily True positives..
 O31844         O32242         O53773         P71939         
 NOLR_RHIME     O54057         MERR_STRLI     O85142         
 O53478         O31480         YW25_MYCTU     O53626         
 O28144         
Subfamily:  Codes involving 2 elements
 Subfamily True positives..
 O50591         O53921         O34464         O08446         
 O52026         O28576         O58828         P77295         
 O59372         O28998         Q52517         O28971         
 P96683         YF53_METJA     


  PROTEIN TITLES
   ARSR_ECOLI       ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
   ARR2_ECOLI       ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
   ARR1_ECOLI       ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
   CADC_LISMO       CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - LISTERIA MONOCYTOG
   P74986           ARSENITE INDUCIBLE REPRESSOR - YERSINIA ENTEROCOLITICA.
   CADF_STAAU       CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN HOMOLOG - STAPHYLOCO
   O68020           ARSR - PSEUDOMONAS AERUGINOSA.
   O26985           TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPH
   P94887           CADMIUM RESISTANCE REGULATORY PROTEIN - LACTOCOCCUS LACTIS.
   O67394           TRANSCRIPTIONAL REGULATOR (ARSR FAMILY) - AQUIFEX AEOLICUS.
   CADC_STAAU       CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - STAPHYLOCOCCUS AUR
   SMTB_SYNY3       TRANSCRIPTIONAL REPRESSOR SMTB HOMOLOG - SYNECHOCYSTIS SP. (
   P73808           ARSENICAL RESISTANCE OPERON REPRESSOR - SYNECHOCYSTIS SP. (S
   O27823           TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPH
   SMTB_SYNP7       TRANSCRIPTIONAL REPRESSOR SMTB - SYNECHOCOCCUS SP. (STRAIN P
   CADC_BACFI       CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - BACILLUS FIRMUS.
   P71941           PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.12 - MYCOBACTERIUM 
   Q58721           HYPOTHETICAL PROTEIN MJ1325 - METHANOCOCCUS JANNASCHII.
   ARSR_STAXY       ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS XYLOS
   ARSR_STAAU       ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS AUREU
   O52029           ARSR PROTEIN - HALOBACTERIUM SP.
   P96677           YDET PROTEIN - BACILLUS SUBTILIS.
   HLYU_VIBCH       TRANSCRIPTIONAL ACTIVATOR HLYU - VIBRIO CHOLERAE.
   YQCJ_BACSU       HYPOTHETICAL 12.3 KD PROTEIN IN CWLA-CISA INTERGENIC REGION 
   O69711           PUTATIVE REGULATORY PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   Q53040           NITRILE HYDRATASE REGULATOR 2 - RHODOCOCCUS RHODOCHROUS.
   O05840           HYPOTHETICAL 14.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   P95774           CADX - STAPHYLOCOCCUS LUGDUNENSIS.
   O53838           PUTATIVE TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULO
   O57801           137AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
 
   O31844           YOZA PROTEIN - BACILLUS SUBTILIS.
   O32242           YVBA PROTEIN - BACILLUS SUBTILIS.
   O53773           HYPOTHETICAL 46.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   P71939           PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.10C - MYCOBACTERIUM
   NOLR_RHIME       NODULATION PROTEIN NOLR - RHIZOBIUM MELILOTI.
   O54057           BV. VICIAE NOLR GENE - RHIZOBIUM LEGUMINOSARUM.
   MERR_STRLI       PROBABLE MERCURY RESISTANCE OPERON REPRESSOR - STREPTOMYCES 
   O85142           REPRESSOR PROTEIN - STAPHYLOCOCCUS AUREUS.
   O53478           PUTATIVE REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
   O31480           YCZG PROTEIN - BACILLUS SUBTILIS.
   YW25_MYCTU       HYPOTHETICAL TRANSCRIPTIONAL REGULATOR CY39.25 - MYCOBACTERI
   O53626           TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
   O28144           TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLO
 
   O50591           ARSR - ACIDIPHILIUM MULTIVORUM.
   O53921           HYPOTHETICAL 23.8 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   O34464           YCEK - BACILLUS SUBTILIS.
   O08446           HYPOTHETICAL 24.1 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   O52026           SIMILAR TO BACILLUS SUBTILIS GP:GI-1881340 - HALOBACTERIUM S
   O28576           CONSERVED HYPOTHETICAL PROTEIN - ARCHAEOGLOBUS FULGIDUS.
   O58828           147AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
   P77295           FROM BASES 2786072 TO 2796651 (SECTION 241 OF 400) OF THE CO
   O59372           253AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
   O28998           TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLO
   Q52517           HYPOTHETICAL 12.1 KD PROTEIN - STREPTOMYCES COELICOLOR.
   O28971           HYPOTHETICAL 27.1 KD PROTEIN - ARCHAEOGLOBUS FULGIDUS.
   P96683           YDFF PROTEIN - BACILLUS SUBTILIS.
   YF53_METJA       HYPOTHETICAL PROTEIN MJ1553 - METHANOCOCCUS JANNASCHII.

  SCAN HISTORY
   OWL29_4      2   100   NSINGLE
   SPTR37_9f    3   150   NSINGLE

  INITIAL MOTIF SETS

HTHARSR1  Length of motif = 16  Motif number = 1
Bacterial regulatory protein ArsR motif I - 1
                     PCODE         ST   INT
   FKILSDETRLGIVLLL  ARR2_ECOLI    10    10
   FKILADETRLGIVLLL  ARSR_ECOLI    10    10
   FKNLSDETRLGIVLLL  ARR1_ECOLI    10    10
   FKILSDETRVKIVYAL  LISCADTNP1    35    35
   LKVLSDPSRLEILDLL  ARSR_STAXY    10    10
   LKILSDSSRLEILDLL  ARSR_STAAU    10    10
   FKALADQKRLEIMYEL  YQCJ_BACSU    16    16
   FKAFGDPTRLMILKLL  D64465        11    11
   FKALSDDTRVKIAYVL  CADF_STAAU    35    35
   FQALSDPIRLQVLTLL  S74901        13    13
   FAVLADPNRLRLLSLL  SMTB_SYNP7    40    40
   FDALADPVRRAILTVL  D67028         7     7
   LRALAAPVRIAIVLQL  MTCY2721      52    52

HTHARSR2  Length of motif = 12  Motif number = 2
Bacterial regulatory protein ArsR motif II - 1
                     PCODE         ST   INT
   ELCVCDLCTALE      ARR2_ECOLI    30     4
   ELCVCDLCTALD      ARSR_ECOLI    30     4
   ELCVCDLCMALD      ARR1_ECOLI    30     4
   ELCVCDLANIVE      LISCADTNP1    55     4
   ELCACDLLEHFQ      ARSR_STAXY    29     3
   ELCACDLLEHFQ      ARSR_STAAU    29     3
   KTCVCDLTEIFE      YQCJ_BACSU    36     4
   SMCVCKIIDELK      D64465        31     4
   ELCVCDVANIIE      CADF_STAAU    55     4
   EQCVCDLCDQLN      S74901        32     3
   ELCVGDLAQAIG      SMTB_SYNP7    59     3
   ECSVNELVDQID      D67028        27     4
   QRCVHELVDALH      MTCY2721      71     3

HTHARSR3  Length of motif = 16  Motif number = 3
Bacterial regulatory protein ArsR motif III - 1
                     PCODE         ST   INT
   SQPKTSRHLAMLRESG  ARR2_ECOLI    43     1
   SQPKISRHLALLRESG  ARSR_ECOLI    43     1
   SQPKISRHLAMLRESG  ARR1_ECOLI    43     1
   TVAATSHHLRFLKKQG  LISCADTNP1    68     1
   SQPTLSHHMKSLVDNE  ARSR_STAXY    42     1
   SQPTLSHHMKSLVDNE  ARSR_STAAU    42     1
   TQSKLSYHLKILLDAN  YQCJ_BACSU    49     1
   PQPTISHHLNILKKAG  D64465        44     1
   STATASHHLRLLKNLG  CADF_STAAU    68     1
   SQSKLSFHLKRLRDAE  S74901        45     1
   SESAVSHQLRSLRNLR  SMTB_SYNP7    72     1
   GRTGVSNHLRILRHAG  D67028        41     2
   PQPLVSQHLKILKAAG  MTCY2721      84     1

HTHARSR4  Length of motif = 16  Motif number = 4
Bacterial regulatory protein ArsR motif IV - 1
                     PCODE         ST   INT
   GLLLDRKQGKWVHYRL  ARR2_ECOLI    58    -1
   GLLLDRKQGKWVHYRL  ARSR_ECOLI    58    -1
   GILLDRKQGKWVHYRL  ARR1_ECOLI    58    -1
   GIANYRKDGKLVYYSL  LISCADTNP1    83    -1
   ELVTTRKNGNKHMYQL  ARSR_STAXY    57    -1
   ELVTTRKDGNKHWYQL  ARSR_STAAU    57    -1
   NLITKETKGTWSYYDL  YQCJ_BACSU    64    -1
   GIVKARKEGTWNFYYI  D64465        59    -1
   GIAKYRKEGKLVYYSL  CADF_STAAU    83    -1
   ELVHTRQDGRWIYYRL  S74901        60    -1
   RLVSYRKQGRHVYYQL  SMTB_SYNP7    87    -1
   GLVTERKAGRFRFYSI  D67028        56    -1
   GVVTGERSGREVLYRL  MTCY2721      99    -1

  FINAL MOTIF SETS

HTHARSR1  Length of motif = 16  Motif number = 1
Bacterial regulatory protein ArsR motif I - 3
                     PCODE         ST   INT
   FKILADETRLGIVLLL  ARSR_ECOLI    10    10
   FKILSDETRLGIVLLL  ARR2_ECOLI    10    10
   FKNLSDETRLGIVLLL  ARR1_ECOLI    10    10
   FKILSDETRVKIVYAL  CADC_LISMO    35    35
   FKILSDETRLAIVMLL  P74986         8     8
   FKALSDDTRVKIAYVL  CADF_STAAU    35    35
   FKCLADETRVRATLLI  O68020         8     8
   LKALADPTRLLIIYLL  O26985        42    42
   FKILSDENRLKIVHAL  P94887        35    35
   FYALSEPKRLCMVKLL  O67394        10    10
   LKAIADENRAKITYAL  CADC_STAAU    36    36
   FSALADPSRLRLMSAL  SMTB_SYNY3    50    50
   FQALSDPIRLQVLTLL  P73808        13    13
   FKILSEPTRLKILMAL  O27823        36    36
   FAVLADPNRLRLLSLL  SMTB_SYNP7    40    40
   LKAIADENRAKITYAL  CADC_BACFI    36    36
   FKALADPVRLQLLSSV  P71941        35    35
   FKAFGDPTRLMILKLL  Q58721        11    11
   LKVLSDPSRLEILDLL  ARSR_STAXY    10    10
   LKILSDSSRLEILDLL  ARSR_STAAU    10    10
   LSALANETRYKIIRIL  O52029        49    49
   LKTLSDQTRLIMMRLF  P96677        12    12
   LKAMANERRLQILCML  HLYU_VIBCH    25    25
   FKALADQKRLEIMYEL  YQCJ_BACSU    16    16
   LQALATPSRLMILTQL  O69711        27    27
   FDALADPVRRAILTVL  Q53040         7     7
   LRALAAPVRIAIVLQL  O05840        52    52
   LEKICDEKKLKIILSL  P95774        36    36
   FRMLADATRVQVLWSL  O53838        22    22
   LKVVSNPIRYGIVKML  O57801        50    50

HTHARSR2  Length of motif = 12  Motif number = 2
Bacterial regulatory protein ArsR motif II - 3
                     PCODE         ST   INT
   ELCVCDLCTALD      ARSR_ECOLI    30     4
   ELCVCDLCTALE      ARR2_ECOLI    30     4
   ELCVCDLCMALD      ARR1_ECOLI    30     4
   ELCVCDLANIVE      CADC_LISMO    55     4
   EMCVCDLCGATS      P74986        28     4
   ELCVCDVANIIE      CADF_STAAU    55     4
   ELCVCELMCALA      O68020        28     4
   DLCVCEIMAALK      O26985        61     3
   ELCVCDIANIID      P94887        55     4
   ELCVCDFMRIFK      O67394        30     4
   ELCVCDIANILG      CADC_STAAU    56     4
   ELCVCDLAAAMK      SMTB_SYNY3    69     3
   EQCVCDLCDQLN      P73808        32     3
   SLCVCELASLLD      O27823        55     3
   ELCVGDLAQAIG      SMTB_SYNP7    59     3
   ESCVCDIANIIG      CADC_BACFI    56     4
   EACVCDISAGVE      P71941        57     6
   SMCVCKIIDELK      Q58721        31     4
   ELCACDLLEHFQ      ARSR_STAXY    29     3
   ELCACDLLEHFQ      ARSR_STAAU    29     3
   ELCVCEFSPLLD      O52029        70     5
   EYCVCQLVDMFE      P96677        31     3
   ELSVGELSSRLE      HLYU_VIBCH    44     3
   KTCVCDLTEIFE      YQCJ_BACSU    36     4
   PLPVTDLAEAIG      O69711        46     3
   ECSVNELVDQID      Q53040        27     4
   QRCVHELVDALH      O05840        71     3
   ELCVCDISLILK      P95774        56     4
   EMSVNELAEQVG      O53838        41     3
   WMCVCLIAKALD      O57801        69     3

HTHARSR3  Length of motif = 16  Motif number = 3
Bacterial regulatory protein ArsR motif III - 3
                     PCODE         ST   INT
   SQPKISRHLALLRESG  ARSR_ECOLI    43     1
   SQPKTSRHLAMLRESG  ARR2_ECOLI    43     1
   SQPKISRHLAMLRESG  ARR1_ECOLI    43     1
   TVAATSHHLRFLKKQG  CADC_LISMO    68     1
   SQPKISRHMAILREAE  P74986        41     1
   STATASHHLRLLKNLG  CADF_STAAU    68     1
   SQPKISRHLAQLRSAG  O68020        41     1
   PQPTISHHLNILRRAG  O26985        74     1
   SVATTSHHLNSLKKLG  P94887        68     1
   SQPKISFHLKVLREAG  O67394        43     1
   TIANASHHLRTLYKQG  CADC_STAAU    69     1
   SESAVSHQLRILRSQR  SMTB_SYNY3    82     1
   SQSKLSFHLKRLRDAE  P73808        45     1
   TQSAVSHQLRILRNAG  O27823        68     1
   SESAVSHQLRSLRNLR  SMTB_SYNP7    72     1
   TAANASHHLRTLHKQG  CADC_BACFI    69     1
   SQPTISHHLKVLRDAG  P71941        70     1
   PQPTISHHLNILKKAG  Q58721        44     1
   SQPTLSHHMKSLVDNE  ARSR_STAXY    42     1
   SQPTLSHHMKSLVDNE  ARSR_STAAU    42     1
   SDSAISHSLSQLTEAG  O52029        83     1
   SQPAISQHLRKLKNAG  P96677        44     1
   SQSALSQHLAWLRRDG  HLYU_VIBCH    57     1
   TQSKLSYHLKILLDAN  YQCJ_BACSU    49     1
   EQSAVSHQLRVLRNLG  O69711        59     1
   GRTGVSNHLRILRHAG  Q53040        41     2
   PQPLVSQHLKILKAAG  O05840        84     1
   SVASTSHHLRLLYKND  P95774        69     1
   PAPSVSQHLAKLRMAR  O53838        54     1
   DQTLVSHHIRILKEID  O57801        82     1

HTHARSR4  Length of motif = 16  Motif number = 4
Bacterial regulatory protein ArsR motif IV - 3
                     PCODE         ST   INT
   GLLLDRKQGKWVHYRL  ARSR_ECOLI    58    -1
   GLLLDRKQGKWVHYRL  ARR2_ECOLI    58    -1
   GILLDRKQGKWVHYRL  ARR1_ECOLI    58    -1
   GIANYRKDGKLVYYSL  CADC_LISMO    83    -1
   ELVLDRREGKWVHYRL  P74986        56    -1
   GIAKYRKEGKLVYYSL  CADF_STAAU    83    -1
   GLLLDRRQGQWVYYRL  O68020        56    -1
   GFLKAEKRGVWVHYSL  O26985        89    -1
   GVVDSHKDGKLVYYFI  P94887        83    -1
   GLVTSQKRGKWNYYRL  O67394        58    -1
   GVVNFRKEGKLALYSL  CADC_STAAU    84    -1
   RLVKYRRVGRNVYYSL  SMTB_SYNY3    97    -1
   ELVHTRQDGRWIYYRL  P73808        60    -1
   GMVDYERDGKMARYYL  O27823        83    -1
   RLVSYRKQGRHVYYQL  SMTB_SYNP7    87    -1
   GIVRYRKEGKLAFYSL  CADC_BACFI    84    -1
   GLLTSRRRASWVYYAV  P71941        85    -1
   GIVKARKEGTWNFYYI  Q58721        59    -1
   ELVTTRKNGNKHMYQL  ARSR_STAXY    57    -1
   ELVTTRKDGNKHWYQL  ARSR_STAAU    57    -1
   GLVTRRKDGKWRKYQT  O52029        98    -1
   GFVNEDRRGQWRYYSI  P96677        59    -1
   GLVNTRKEAQTVFYTL  HLYU_VIBCH    72    -1
   NLITKETKGTWSYYDL  YQCJ_BACSU    64    -1
   GLVVGDRAGRSIVYSL  O69711        74    -1
   GLVTERKAGRFRFYSI  Q53040        56    -1
   GVVTGERSGREVLYRL  O05840        99    -1
   DVLDFYKKGKMAYYFI  P95774        84    -1
   RLVRTRRDGTTIFYRL  O53838        69    -1
   DLLEEKREGKLRFYRV  O57801        97    -1
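
   The ST and INT columns of the motif sets appear to be related by simple
   arithmetic. The sketch below assumes that ST is the start position of the
   motif in the sequence and that INT is the gap between the end of the
   preceding motif and the start of the current one (for motif 1, the
   distance from the sequence start). This relationship is inferred from the
   values listed in this entry, not taken from a format specification, and
   the function name expected_intervals is hypothetical.

```python
# Motif lengths as stated in the motif set headers (motifs 1 to 4).
MOTIF_LENGTHS = [16, 12, 16, 16]

# (ST, INT) pairs per motif, copied from the final motif sets.
ROWS = {
    "ARSR_ECOLI": [(10, 10), (30, 4), (43, 1), (58, -1)],
    "SMTB_SYNP7": [(40, 40), (59, 3), (72, 1), (87, -1)],
    "P71941": [(35, 35), (57, 6), (70, 1), (85, -1)],
}


def expected_intervals(starts, lengths):
    """Recompute INT from ST values under the stated assumption.

    Motif 1: INT equals ST.
    Later motifs: INT = ST - (previous ST + previous motif length).
    """
    intervals = [starts[0]]
    for i in range(1, len(starts)):
        intervals.append(starts[i] - (starts[i - 1] + lengths[i - 1]))
    return intervals


for code, pairs in ROWS.items():
    starts = [st for st, _ in pairs]
    listed = [interval for _, interval in pairs]
    print(code, expected_intervals(starts, MOTIF_LENGTHS) == listed)
```

   On this reading, the INT of -1 listed for every motif 4 row means that
   motif 4 overlaps the final residue of motif 3.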