HTHARSR Bacterial regulatory protein ArsR family signature
Type of fingerprint: COMPOUND with 4 elements
Links:
PRINTS; PR00032 HTHARAC; PR00033 HTHASNC; PR00034 HTHCRP; PR00035 HTHGNTR
PRINTS; PR00036 HTHLACI; PR00037 HTHLACR; PR00038 HTHLUXR; PR00039 HTHLYSR
PRINTS; PR00598 HTHMARR; PR00040 HTHMERR; PR00455 HTHTETR; PR00030 HTHCRO
PRINTS; PR00031 HTHREPRESSR
INTERPRO; IPR001845
PROSITE; PS00846 HTH_ARSR_FAMILY
PFAM; PF01022 HTH_ARSR_family
Creation date 24-AUG-1997; UPDATE 27-JUN-1999
1. MORBY, A.P., TURNER, J.S., HUCKLE, J.W. AND ROBINSON, N.J.
smtB is a metal-dependent repressor of the cyanobacterial metallothionein
gene smtA - identification of a Zn-inhibited DNA-protein complex.
NUCLEIC ACIDS RES. 21 921-925 (1993).
2. BAIROCH, A.
A possible mechanism for metal-ion induced DNA-protein dissociation in
a family of prokaryotic transcriptional regulators.
NUCLEIC ACIDS RES. 21 2515-2515 (1993).
Bacterial transcription regulatory proteins that bind DNA via a helix-turn-
helix (HTH) motif can be grouped into families on the basis of sequence
similarities. One such group, termed arsR, includes several proteins that
appear to dissociate from DNA in the presence of metal ions: arsR, which
functions as a transcriptional repressor of an arsenic resistance operon;
smtB from Synechococcus, which acts as a transcriptional repressor of the
smtA gene that codes for a metallothionein; cadC, a protein required for
cadmium-resistance; and hypothetical protein yqcJ from Bacillus subtilis.
The HTH motif is thought to be located in the central part of these
proteins [1]. The motif is characterised by a number of well-conserved
residues: at its N-terminal extremity is a cysteine residue; a second Cys
is found in arsR and cadC, but not in smtB; and at the C-terminus lie one
or two histidines. These residues may be involved in metal-binding (Zn in
smtB; metal-oxyanions such as arsenite, antimonite and arsenate for arsR;
and cadmium for cadC) [2]. It is believed that binding of a metal ion could
induce a conformational change that would prevent the protein from binding
DNA [2].
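The Cys pattern described above can be checked against motif II sequences
listed in the final motif sets of this entry. The following is a minimal
sketch, not part of the PRINTS record: the function name and the 0-based
offsets 2 and 4 are assumptions read off the "CVC" core of the motif II
alignment, and only three representative sequences are included.

```python
# Motif II sequences copied from the FINAL MOTIF SETS of this entry.
MOTIF_II = {
    "ARSR_ECOLI": "ELCVCDLCTALD",
    "CADC_LISMO": "ELCVCDLANIVE",
    "SMTB_SYNP7": "ELCVGDLAQAIG",
}

# Assumed 0-based offsets of the two cysteines within motif II.
FIRST_CYS = 2
SECOND_CYS = 4


def cys_pair(motif, first=FIRST_CYS, second=SECOND_CYS):
    """Return (has_first_cys, has_second_cys) for one motif II string."""
    return motif[first] == "C", motif[second] == "C"


if __name__ == "__main__":
    for code, motif in MOTIF_II.items():
        print(code, cys_pair(motif))
```

Under these assumptions, the arsR and cadC representatives carry both
cysteines, while the smtB representative carries only the first one.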
HTHARSR is a 4-element fingerprint that provides a signature for the
bacterial regulatory protein arsR family. The fingerprint was derived from
an initial alignment of 13 sequences: the motifs were drawn from short
conserved regions spanning the full alignment length - motifs 2 and 3
span the region encoded by PROSITE pattern HTH_ARSR_FAMILY (PS00846), which
includes the complete HTH motif. Two iterations on OWL29.4 were required
to reach convergence, at which point a true set comprising 21 sequences
was identified. Several partial matches were also found, all of which
appear to be related DNA-binding proteins that contain an HTH motif.
An update on SPTR37_9f identified a true set of 30 sequences, and 27
partial matches.
SUMMARY INFORMATION
30 codes involving 4 elements
13 codes involving 3 elements
14 codes involving 2 elements
COMPOSITE FINGERPRINT INDEX
   4|  30   30   30   30
   3|  13    0   13   13
   2|   9    2   10    7
  --+---------------------
    |   1    2    3    4
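The index can be cross-checked against the summary counts. The sketch below
assumes one reading of the table: each row gives, for sequences matching
that number of elements, how many of them matched each of motifs 1 to 4.
On that reading, every row should sum to (number of elements) x (number of
codes). The function names are illustrative and not part of PRINTS.

```python
# Counts copied from SUMMARY INFORMATION: elements matched -> number of codes.
SUMMARY = {4: 30, 3: 13, 2: 14}

# Rows copied from COMPOSITE FINGERPRINT INDEX: per-motif match counts.
INDEX = {
    4: [30, 30, 30, 30],
    3: [13, 0, 13, 13],
    2: [9, 2, 10, 7],
}


def row_is_consistent(n_elements, per_motif_counts, n_codes):
    """Each code matching n elements contributes exactly n motif hits."""
    return sum(per_motif_counts) == n_elements * n_codes


def check_index(summary, index):
    """Return {n_elements: bool} for every row of the index."""
    return {n: row_is_consistent(n, index[n], summary[n]) for n in summary}


if __name__ == "__main__":
    print(check_index(SUMMARY, INDEX))
    # Partial matches are the codes matching fewer than 4 elements.
    print(SUMMARY[3] + SUMMARY[2])
```

The row for 3 elements also shows that none of the 13 three-element matches
includes motif 2.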
True positives..
ARSR_ECOLI ARR2_ECOLI ARR1_ECOLI CADC_LISMO
P74986 CADF_STAAU O68020 O26985
P94887 O67394 CADC_STAAU SMTB_SYNY3
P73808 O27823 SMTB_SYNP7 CADC_BACFI
P71941 Q58721 ARSR_STAXY ARSR_STAAU
O52029 P96677 HLYU_VIBCH YQCJ_BACSU
O69711 Q53040 O05840 P95774
O53838 O57801
Subfamily: Codes involving 3 elements
Subfamily True positives..
O31844 O32242 O53773 P71939
NOLR_RHIME O54057 MERR_STRLI O85142
O53478 O31480 YW25_MYCTU O53626
O28144
Subfamily: Codes involving 2 elements
Subfamily True positives..
O50591 O53921 O34464 O08446
O52026 O28576 O58828 P77295
O59372 O28998 Q52517 O28971
P96683 YF53_METJA
PROTEIN TITLES
ARSR_ECOLI ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
ARR2_ECOLI ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
ARR1_ECOLI ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
CADC_LISMO CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - LISTERIA MONOCYTOG
P74986 ARSENITE INDUCIBLE REPRESSOR - YERSINIA ENTEROCOLITICA.
CADF_STAAU CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN HOMOLOG - STAPHYLOCO
O68020 ARSR - PSEUDOMONAS AERUGINOSA.
O26985 TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPH
P94887 CADMIUM RESISTANCE REGULATORY PROTEIN - LACTOCOCCUS LACTIS.
O67394 TRANSCRIPTIONAL REGULATOR (ARSR FAMILY) - AQUIFEX AEOLICUS.
CADC_STAAU CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - STAPHYLOCOCCUS AUR
SMTB_SYNY3 TRANSCRIPTIONAL REPRESSOR SMTB HOMOLOG - SYNECHOCYSTIS SP. (
P73808 ARSENICAL RESISTANCE OPERON REPRESSOR - SYNECHOCYSTIS SP. (S
O27823 TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPH
SMTB_SYNP7 TRANSCRIPTIONAL REPRESSOR SMTB - SYNECHOCOCCUS SP. (STRAIN P
CADC_BACFI CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - BACILLUS FIRMUS.
P71941 PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.12 - MYCOBACTERIUM
Q58721 HYPOTHETICAL PROTEIN MJ1325 - METHANOCOCCUS JANNASCHII.
ARSR_STAXY ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS XYLOS
ARSR_STAAU ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS AUREU
O52029 ARSR PROTEIN - HALOBACTERIUM SP.
P96677 YDET PROTEIN - BACILLUS SUBTILIS.
HLYU_VIBCH TRANSCRIPTIONAL ACTIVATOR HLYU - VIBRIO CHOLERAE.
YQCJ_BACSU HYPOTHETICAL 12.3 KD PROTEIN IN CWLA-CISA INTERGENIC REGION
O69711 PUTATIVE REGULATORY PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
Q53040 NITRILE HYDRATASE REGULATOR 2 - RHODOCOCCUS RHODOCHROUS.
O05840 HYPOTHETICAL 14.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
P95774 CADX - STAPHYLOCOCCUS LUGDUNENSIS.
O53838 PUTATIVE TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULO
O57801 137AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
O31844 YOZA PROTEIN - BACILLUS SUBTILIS.
O32242 YVBA PROTEIN - BACILLUS SUBTILIS.
O53773 HYPOTHETICAL 46.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
P71939 PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.10C - MYCOBACTERIUM
NOLR_RHIME NODULATION PROTEIN NOLR - RHIZOBIUM MELILOTI.
O54057 BV. VICIAE NOLR GENE - RHIZOBIUM LEGUMINOSARUM.
MERR_STRLI PROBABLE MERCURY RESISTANCE OPERON REPRESSOR - STREPTOMYCES
O85142 REPRESSOR PROTEIN - STAPHYLOCOCCUS AUREUS.
O53478 PUTATIVE REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
O31480 YCZG PROTEIN - BACILLUS SUBTILIS.
YW25_MYCTU HYPOTHETICAL TRANSCRIPTIONAL REGULATOR CY39.25 - MYCOBACTERI
O53626 TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
O28144 TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLO
O50591 ARSR - ACIDIPHILIUM MULTIVORUM.
O53921 HYPOTHETICAL 23.8 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O34464 YCEK - BACILLUS SUBTILIS.
O08446 HYPOTHETICAL 24.1 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O52026 SIMILAR TO BACILLUS SUBTILIS GP:GI-1881340 - HALOBACTERIUM S
O28576 CONSERVED HYPOTHETICAL PROTEIN - ARCHAEOGLOBUS FULGIDUS.
O58828 147AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
P77295 FROM BASES 2786072 TO 2796651 (SECTION 241 OF 400) OF THE CO
O59372 253AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
O28998 TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLO
Q52517 HYPOTHETICAL 12.1 KD PROTEIN - STREPTOMYCES COELICOLOR.
O28971 HYPOTHETICAL 27.1 KD PROTEIN - ARCHAEOGLOBUS FULGIDUS.
P96683 YDFF PROTEIN - BACILLUS SUBTILIS.
YF53_METJA HYPOTHETICAL PROTEIN MJ1553 - METHANOCOCCUS JANNASCHII.
SCAN HISTORY
OWL29_4 2 100 NSINGLE
SPTR37_9f 3 150 NSINGLE
INITIAL MOTIF SETS
HTHARSR1 Length of motif = 16 Motif number = 1
Bacterial regulatory protein ArsR motif I - 1
PCODE ST INT
FKILSDETRLGIVLLL ARR2_ECOLI 10 10
FKILADETRLGIVLLL ARSR_ECOLI 10 10
FKNLSDETRLGIVLLL ARR1_ECOLI 10 10
FKILSDETRVKIVYAL LISCADTNP1 35 35
LKVLSDPSRLEILDLL ARSR_STAXY 10 10
LKILSDSSRLEILDLL ARSR_STAAU 10 10
FKALADQKRLEIMYEL YQCJ_BACSU 16 16
FKAFGDPTRLMILKLL D64465 11 11
FKALSDDTRVKIAYVL CADF_STAAU 35 35
FQALSDPIRLQVLTLL S74901 13 13
FAVLADPNRLRLLSLL SMTB_SYNP7 40 40
FDALADPVRRAILTVL D67028 7 7
LRALAAPVRIAIVLQL MTCY2721 52 52
HTHARSR2 Length of motif = 12 Motif number = 2
Bacterial regulatory protein ArsR motif II - 1
PCODE ST INT
ELCVCDLCTALE ARR2_ECOLI 30 4
ELCVCDLCTALD ARSR_ECOLI 30 4
ELCVCDLCMALD ARR1_ECOLI 30 4
ELCVCDLANIVE LISCADTNP1 55 4
ELCACDLLEHFQ ARSR_STAXY 29 3
ELCACDLLEHFQ ARSR_STAAU 29 3
KTCVCDLTEIFE YQCJ_BACSU 36 4
SMCVCKIIDELK D64465 31 4
ELCVCDVANIIE CADF_STAAU 55 4
EQCVCDLCDQLN S74901 32 3
ELCVGDLAQAIG SMTB_SYNP7 59 3
ECSVNELVDQID D67028 27 4
QRCVHELVDALH MTCY2721 71 3
HTHARSR3 Length of motif = 16 Motif number = 3
Bacterial regulatory protein ArsR motif III - 1
PCODE ST INT
SQPKTSRHLAMLRESG ARR2_ECOLI 43 1
SQPKISRHLALLRESG ARSR_ECOLI 43 1
SQPKISRHLAMLRESG ARR1_ECOLI 43 1
TVAATSHHLRFLKKQG LISCADTNP1 68 1
SQPTLSHHMKSLVDNE ARSR_STAXY 42 1
SQPTLSHHMKSLVDNE ARSR_STAAU 42 1
TQSKLSYHLKILLDAN YQCJ_BACSU 49 1
PQPTISHHLNILKKAG D64465 44 1
STATASHHLRLLKNLG CADF_STAAU 68 1
SQSKLSFHLKRLRDAE S74901 45 1
SESAVSHQLRSLRNLR SMTB_SYNP7 72 1
GRTGVSNHLRILRHAG D67028 41 2
PQPLVSQHLKILKAAG MTCY2721 84 1
HTHARSR4 Length of motif = 16 Motif number = 4
Bacterial regulatory protein ArsR motif IV - 1
PCODE ST INT
GLLLDRKQGKWVHYRL ARR2_ECOLI 58 -1
GLLLDRKQGKWVHYRL ARSR_ECOLI 58 -1
GILLDRKQGKWVHYRL ARR1_ECOLI 58 -1
GIANYRKDGKLVYYSL LISCADTNP1 83 -1
ELVTTRKNGNKHMYQL ARSR_STAXY 57 -1
ELVTTRKDGNKHWYQL ARSR_STAAU 57 -1
NLITKETKGTWSYYDL YQCJ_BACSU 64 -1
GIVKARKEGTWNFYYI D64465 59 -1
GIAKYRKEGKLVYYSL CADF_STAAU 83 -1
ELVHTRQDGRWIYYRL S74901 60 -1
RLVSYRKQGRHVYYQL SMTB_SYNP7 87 -1
GLVTERKAGRFRFYSI D67028 56 -1
GVVTGERSGREVLYRL MTCY2721 99 -1
FINAL MOTIF SETS
HTHARSR1 Length of motif = 16 Motif number = 1
Bacterial regulatory protein ArsR motif I - 3
PCODE ST INT
FKILADETRLGIVLLL ARSR_ECOLI 10 10
FKILSDETRLGIVLLL ARR2_ECOLI 10 10
FKNLSDETRLGIVLLL ARR1_ECOLI 10 10
FKILSDETRVKIVYAL CADC_LISMO 35 35
FKILSDETRLAIVMLL P74986 8 8
FKALSDDTRVKIAYVL CADF_STAAU 35 35
FKCLADETRVRATLLI O68020 8 8
LKALADPTRLLIIYLL O26985 42 42
FKILSDENRLKIVHAL P94887 35 35
FYALSEPKRLCMVKLL O67394 10 10
LKAIADENRAKITYAL CADC_STAAU 36 36
FSALADPSRLRLMSAL SMTB_SYNY3 50 50
FQALSDPIRLQVLTLL P73808 13 13
FKILSEPTRLKILMAL O27823 36 36
FAVLADPNRLRLLSLL SMTB_SYNP7 40 40
LKAIADENRAKITYAL CADC_BACFI 36 36
FKALADPVRLQLLSSV P71941 35 35
FKAFGDPTRLMILKLL Q58721 11 11
LKVLSDPSRLEILDLL ARSR_STAXY 10 10
LKILSDSSRLEILDLL ARSR_STAAU 10 10
LSALANETRYKIIRIL O52029 49 49
LKTLSDQTRLIMMRLF P96677 12 12
LKAMANERRLQILCML HLYU_VIBCH 25 25
FKALADQKRLEIMYEL YQCJ_BACSU 16 16
LQALATPSRLMILTQL O69711 27 27
FDALADPVRRAILTVL Q53040 7 7
LRALAAPVRIAIVLQL O05840 52 52
LEKICDEKKLKIILSL P95774 36 36
FRMLADATRVQVLWSL O53838 22 22
LKVVSNPIRYGIVKML O57801 50 50
HTHARSR2 Length of motif = 12 Motif number = 2
Bacterial regulatory protein ArsR motif II - 3
PCODE ST INT
ELCVCDLCTALD ARSR_ECOLI 30 4
ELCVCDLCTALE ARR2_ECOLI 30 4
ELCVCDLCMALD ARR1_ECOLI 30 4
ELCVCDLANIVE CADC_LISMO 55 4
EMCVCDLCGATS P74986 28 4
ELCVCDVANIIE CADF_STAAU 55 4
ELCVCELMCALA O68020 28 4
DLCVCEIMAALK O26985 61 3
ELCVCDIANIID P94887 55 4
ELCVCDFMRIFK O67394 30 4
ELCVCDIANILG CADC_STAAU 56 4
ELCVCDLAAAMK SMTB_SYNY3 69 3
EQCVCDLCDQLN P73808 32 3
SLCVCELASLLD O27823 55 3
ELCVGDLAQAIG SMTB_SYNP7 59 3
ESCVCDIANIIG CADC_BACFI 56 4
EACVCDISAGVE P71941 57 6
SMCVCKIIDELK Q58721 31 4
ELCACDLLEHFQ ARSR_STAXY 29 3
ELCACDLLEHFQ ARSR_STAAU 29 3
ELCVCEFSPLLD O52029 70 5
EYCVCQLVDMFE P96677 31 3
ELSVGELSSRLE HLYU_VIBCH 44 3
KTCVCDLTEIFE YQCJ_BACSU 36 4
PLPVTDLAEAIG O69711 46 3
ECSVNELVDQID Q53040 27 4
QRCVHELVDALH O05840 71 3
ELCVCDISLILK P95774 56 4
EMSVNELAEQVG O53838 41 3
WMCVCLIAKALD O57801 69 3
HTHARSR3 Length of motif = 16 Motif number = 3
Bacterial regulatory protein ArsR motif III - 3
PCODE ST INT
SQPKISRHLALLRESG ARSR_ECOLI 43 1
SQPKTSRHLAMLRESG ARR2_ECOLI 43 1
SQPKISRHLAMLRESG ARR1_ECOLI 43 1
TVAATSHHLRFLKKQG CADC_LISMO 68 1
SQPKISRHMAILREAE P74986 41 1
STATASHHLRLLKNLG CADF_STAAU 68 1
SQPKISRHLAQLRSAG O68020 41 1
PQPTISHHLNILRRAG O26985 74 1
SVATTSHHLNSLKKLG P94887 68 1
SQPKISFHLKVLREAG O67394 43 1
TIANASHHLRTLYKQG CADC_STAAU 69 1
SESAVSHQLRILRSQR SMTB_SYNY3 82 1
SQSKLSFHLKRLRDAE P73808 45 1
TQSAVSHQLRILRNAG O27823 68 1
SESAVSHQLRSLRNLR SMTB_SYNP7 72 1
TAANASHHLRTLHKQG CADC_BACFI 69 1
SQPTISHHLKVLRDAG P71941 70 1
PQPTISHHLNILKKAG Q58721 44 1
SQPTLSHHMKSLVDNE ARSR_STAXY 42 1
SQPTLSHHMKSLVDNE ARSR_STAAU 42 1
SDSAISHSLSQLTEAG O52029 83 1
SQPAISQHLRKLKNAG P96677 44 1
SQSALSQHLAWLRRDG HLYU_VIBCH 57 1
TQSKLSYHLKILLDAN YQCJ_BACSU 49 1
EQSAVSHQLRVLRNLG O69711 59 1
GRTGVSNHLRILRHAG Q53040 41 2
PQPLVSQHLKILKAAG O05840 84 1
SVASTSHHLRLLYKND P95774 69 1
PAPSVSQHLAKLRMAR O53838 54 1
DQTLVSHHIRILKEID O57801 82 1
HTHARSR4 Length of motif = 16 Motif number = 4
Bacterial regulatory protein ArsR motif IV - 3
PCODE ST INT
GLLLDRKQGKWVHYRL ARSR_ECOLI 58 -1
GLLLDRKQGKWVHYRL ARR2_ECOLI 58 -1
GILLDRKQGKWVHYRL ARR1_ECOLI 58 -1
GIANYRKDGKLVYYSL CADC_LISMO 83 -1
ELVLDRREGKWVHYRL P74986 56 -1
GIAKYRKEGKLVYYSL CADF_STAAU 83 -1
GLLLDRRQGQWVYYRL O68020 56 -1
GFLKAEKRGVWVHYSL O26985 89 -1
GVVDSHKDGKLVYYFI P94887 83 -1
GLVTSQKRGKWNYYRL O67394 58 -1
GVVNFRKEGKLALYSL CADC_STAAU 84 -1
RLVKYRRVGRNVYYSL SMTB_SYNY3 97 -1
ELVHTRQDGRWIYYRL P73808 60 -1
GMVDYERDGKMARYYL O27823 83 -1
RLVSYRKQGRHVYYQL SMTB_SYNP7 87 -1
GIVRYRKEGKLAFYSL CADC_BACFI 84 -1
GLLTSRRRASWVYYAV P71941 85 -1
GIVKARKEGTWNFYYI Q58721 59 -1
ELVTTRKNGNKHMYQL ARSR_STAXY 57 -1
ELVTTRKDGNKHWYQL ARSR_STAAU 57 -1
GLVTRRKDGKWRKYQT O52029 98 -1
GFVNEDRRGQWRYYSI P96677 59 -1
GLVNTRKEAQTVFYTL HLYU_VIBCH 72 -1
NLITKETKGTWSYYDL YQCJ_BACSU 64 -1
GLVVGDRAGRSIVYSL O69711 74 -1
GLVTERKAGRFRFYSI Q53040 56 -1
GVVTGERSGREVLYRL O05840 99 -1
DVLDFYKKGKMAYYFI P95774 84 -1
RLVRTRRDGTTIFYRL O53838 69 -1
DLLEEKREGKLRFYRV O57801 97 -1
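The ST (start) and INT (interval) columns in the motif sets above appear to
be related by simple arithmetic: for motif 1, INT equals ST, and for each
later motif, ST equals the previous motif's ST plus the previous motif's
length plus INT. An INT of -1 then means that motifs 3 and 4 overlap by one
residue. This relation is inferred from the values in this entry rather
than stated in it, and the sketch below simply tests it on three rows; the
function name is illustrative.

```python
# Motif lengths as given in the "Length of motif" headers of this entry.
MOTIF_LENGTHS = [16, 12, 16, 16]

# (ST, INT) pairs for motifs 1-4, copied from the FINAL MOTIF SETS rows.
ROWS = {
    "ARSR_ECOLI": [(10, 10), (30, 4), (43, 1), (58, -1)],
    "SMTB_SYNP7": [(40, 40), (59, 3), (72, 1), (87, -1)],
    "P71941": [(35, 35), (57, 6), (70, 1), (85, -1)],
}


def intervals_consistent(rows, lengths=MOTIF_LENGTHS):
    """Check the assumed relation ST[k] == ST[k-1] + len[k-1] + INT[k]."""
    first_start, first_interval = rows[0]
    if first_start != first_interval:
        return False
    for k in range(1, len(rows)):
        prev_start, _ = rows[k - 1]
        start, interval = rows[k]
        if start != prev_start + lengths[k - 1] + interval:
            return False
    return True


if __name__ == "__main__":
    for code, rows in ROWS.items():
        print(code, intervals_consistent(rows))
```

For ARSR_ECOLI, for example, motif 3 starts at 43 and is 16 residues long,
so with INT = -1 motif 4 starts at 43 + 16 - 1 = 58; the final G of motif 3
is the first G of motif 4.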