Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004


Riboswitches: the oldest regulatory system?

Mikhail Gelfand

December 2004


Riboflavin biosynthesis pathway

[Figure: pathway diagram. Recoverable labels are listed below.]

Genes (label pairs as printed on the slide) and enzymes:
• ribA / ribA: GTP cyclohydrolase II
• ribA / ribB: 3,4-DHBP synthase
• ribD / ribG: pyrimidine deaminase
• ribD / ribG: pyrimidine reductase
• ribH / ribH: riboflavin synthase, β-chain
• ribE / ribB: riboflavin synthase, α-chain
• ypaA

Metabolites:
• GTP (from the purine biosynthesis pathway)
• 2,5-diamino-6-hydroxy-4-(5'-phosphoribosylamino)pyrimidine
• 5-amino-6-(5'-phosphoribosylamino)uracil
• 5-amino-6-(5'-phosphoribitylamino)uracil
• ribulose-5-phosphate (from the pentose-phosphate pathway)
• 3,4-dihydroxy-2-butanone-4-phosphate
• 6,7-dimethyl-8-ribityllumazine
• riboflavin


5’ UTR regions of riboflavin genes from various bacteria 1 2 2’ 3 Add. 3’ Variable 4 4’ 5 5’ 1’ =========> ==> <== ===> -><- <=== -> <- ====> <==== ==> <== <========= BS TTGTATCTTCGGGG-CAGGGTGGAAATCCCGACCGGCGGT 21 AGCCCGTGAC-- 8 4 8 -----TGGATTCAGTTTAA-GCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAT BQ AGCATCCTTCGGGG-TCGGGTGAAATTCCCAACCGGCGGT 19 AGTCCGTGAC-- 8 5 8 -----TGGATCTAGTGAAACTCTAGGGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATATG BE TGCATCCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGATCCGGTGCGATTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATGCC HD TTTATCCTTCGGGG-CTGGGTGGAAATCCCGACCGGCGGT 19 AGTCCGTGAC-- 10 4 10 ----–TGGACCTGGTGAAAATCCGGGACCGACAGTGAA-AGTCTGGAT-GGGAGAAGGAAACG Bam TGTATCCTTCGGGG-CTGGGTGAAAATCCCGACCGGCGGT 23 AGCCCGTGAC-- 8 4 8 ----–TGGATTCAGTGAAAAGCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAG CA GATGTTCTTCAGGG-ATGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCAA--- 3 4 3 ------AGATCCGGTTAAACTCCGGGGCCGACAGTTAA-AGTCTGGAT-GAAAGAAGAAATAG DF CTTAATCTTCGGGG-TAGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCG---- 7 6 7 --------ATTTGGTTAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GGAAGAAGATATTT SA TAATTCTTTCGGGG-CAGGGTGAAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGTTAA-AGTCTGGAT-GGGAGAAAGAATGT LLX ATAAATCTTCAGGG-CAGGGTGTAATTCCCTACCGGCGGT 2 AGCCCGCGA--- 4 4 4 -----ATGATTCGGTGAAACTCCGAGGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAATA PN AACTATCTTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGA--- 3 4 3 -----ATGATTTGGTGAAATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAAAA TM AAACGCTCTCGGGG-CAGGGTGGAATTCCCGACCGGCGGT 3 AGCCCGCGAG-- 5 4 5 ----–TTGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAGAGCGTGA DR GACCTCTTTCGGGG-CGGGGCGAAATTCCCCACCGGCGGT 15 AGCCCGCGAA-- 8 12 9 ----–CCGATGCCGCGCAACTCGGCAGCCGACGGTCAC-AGTCCGGAC-GAAAGAAGGAGGAG TQ CACCTCCTTCGGGG-CGGGGTGGAAGTCCCCACCGGCGGT 3 AGCCCGCGAA-- 5 4 5 -----CCGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAAGGAGGGC AO AATAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGCGGT 2 AGTCCGCGA--- 7 7 7 -----AGGAACCGGTGAGATTCCGGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGATGAAA DU 
TTTAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGTGGT 2 AGTCCGCGA--- 13 4 12 -----AGGAACTAGTGAAATTCTAGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGAGCAGA CAU GAAGACCTTCGGGG-CAAGGTGAAATTCCTGATCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGACCCGGTGTGATTCCGGGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTCGGC FN TAAAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 5 4 5 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GGGAGAAGAATTAG TFU ACGCGTGCTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TGGAACCGGTGAAACTCCGGTACCGACGGTGAA-AGTCCGGAT-GGGAGGTAGTACGTG SX -AGCGCACTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TTGACCAGGTGAAATTCCTGGACCGACGGTTAA-AGTCCGGAT-GGGAGGCAGTGCGCG BU GTGCGTCTTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 30 AGCCCGCGAGCG 137 GTCAGCAGATCTGGTGAGAAGCCAGAGCCGACGGTTAG-AGTCCGGAT-GGAAGAAGATGTGC BPS GTGCGTCTTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCTGGTCCGATGCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGATGTGC REU TTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 31 AGCCCGCGAGCG 7 5 7 GTCAGCAGATCTGGTGAGAGGCCAGGGCCGACGGTTAA-AGTCCGGAT-GAAAGAAGATGGGC RSO GTACGTCTTCAGGG-CGGGGTGGAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 11 3 11 GTCAGCAGATCCGGTGAGATGCCGGGGCCGACGGTCAG-AGTCCGGAT-GGAAGAAGATGTGC EC GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 8 4 8 GACAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAG-AGTCCGGAT-GGGAGAGAGTAACG TY GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 67 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGGGTAACG KP GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 20 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGAGTAACG HI TCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGAGCG 26 9 30 GTCAGCAGATTTGGTGAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAAAGAGAATAAAA VK GCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 14 AGCCCACGAGCG 11 9 11 GTCAGCAGATTTGGTGAGAATCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAGAATAAGC VC CAATATTCTCAGGG-CGGGGCGAAATTCCCCACCGGTGGT 13 AGCCCACGAGCG 5 4 5 GTCAGCAGATCTGGTGAGAAGCCAGGGCCGACGGTTAC-AGTCCGGAT-GAGAGAGAATGACA YP GCTTATTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 
40 AGCCCGCGAGCG 16 6 16 GTCAGCAGACCCGGTGTAATTCCGGGGCCGACGGTTAT-AGTCCGGAT-GGGAGAGAGTAACG AB GCGCATTCTCAGGG-CAGGGTGAAAGTCCCTACCGGTGGT 25 AGCCCACGAGCG 16 4 27 GTCAGCAGATTTGGTGCGAATCCAAAGCCGACAGTGAC-AGTCTGGAT-GAAAGAGAATAAAA BP GTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 18 AGCCCGCGAGCG 10 4 10 GTCAGCAGACCTGGTGAGATGCCAGGGCCGACGGTCAT-AGTCCGGAT-GAGAGAAGATGTGC AC ACATCGCTTCAGGG-CGGGGCGTAATTCCCCACCGGCGGT 16 AGCCCGCGAGCA 10 3 11 ---CGCAGATCTGGTGTAAATCCAGAGCCGACGGT-AT-AGTCCGGAT-GAAAGAAGACGACG Spu AACAATTCTCAGGG-CGGGGTGAAACTCCCCACCGGCGGT 34 AGCCCGCGAGCG 6 6 6 GTCAGCAGATCTGGTG 52 TCCAGAGCCGACGGT 31 AGTCCGGAT-GGAAGAGAATGTAA PP GTCGGTCTTCAGGG-CGGGGTGTAAGTCCCCACCGGCGGT 13 AGCCCGCGAGCG 7 3 7 GTCAGCAGATCTGGTGCAACTCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGGCGTCA AU GGTTGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 7 9 7 GTCAGCAGATCCGGTGAGAGGCCGGAGCCGACGGT-AT-AGTCCGGAT-GGAAGAGGACAAGG PU AAACGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 19 AGCCCGCGAGCG 19 4 18 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAC-AGTCCGGATGAAGAGAGAACGGGA PY TAACGTTCTCAGGG-CGGGGTGCAACTCCCCACCGGCGGT 19 AGCCCGCGAGCG 15 4 16 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAT-AGTCCGGATGAAGAGAGAGCGGGA PA TAACGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 19 AGCCCGCGAGCG 14 4 13 GTCAGCAGACCCGGTGCGATTCCGGGGCCGACGGTCAT-AGTCCGGATAAAGAGAGAACGGGA MLO TAAAGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 16 AGCCCGCGAGCG 8 5 8 GTCAGCAGATCCGGTGTGATTCCGGAGCCGACGGTTAG-AGTCCGGAT-GAAAGAGGACGAAA SM AAGCGTTCTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 34 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTCGAATTCCGGAGCCGACGGTTAT-AGTCCGGAT-GGAAGAGAGCAAGC BME GCTTGTTCTCGGGG-CGGGGTGAAACTCCCCACCGGCGGT 17 AGCCCGCGAGCG 10 15 10 GTCAGCAGATCCGGTGAGATGCCGGAGCCGACGGTTAA-AGTCCGGAT-GGAAGAGAGCGAAT BS ATCAATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 18 AGCCCGCGA--- 5 4 5 -----AGGATTCGGTGAGATTCCGGAGCCGACAGT-AC-AGTCTGGAT-GGGAGAAGATGGAG BQ GTCTATCTTCGGGG-CAGGGTGAAAATCCCGACCGGCGGT 27 AGCCCGCGA—-- 3 5 3 -----AGGATTTGGTGTGATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG BE ATTCATCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 
-----AGGATCCGGTGCGAGTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGAAG CA AATGATCTTCAGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCGAG-- 3 4 3 ----TATGATCCGGTTTGATTCCGGAGCCGACAGT-AA-AGTCTGGAT-GAAAGAAGATATAT DF GAAGATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCG---- 6 4 6 -------GATTTGGTGAGATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAGAGAAGATATTT EF GTTCGTCTTCAGGGGCAGGGTGTAATTCCCGACCGGTGGT 3 AGTCCACGAC-- 5 3 5 ----ATTGAATTGGTGTAATTCCAATACCGACAGT-AT-AGTCTGGAT—-AAAGAAGATAGGG LLX AAATATCTTCAGGG-CACCGTGTAATTCGGGACCGGCGGT 21 ACTCCGCGAT-- 4 4 4 ----–TTGAAGCAGTGAGAATCTGCTAGCGACAGT-AA-AGTCTGGAT-GGAAGAAGATGAAC LO GTTCATCTTCGGGG-CAGGGTGCAATTCCCGACCGGTGGT 3 AGTCCACGAT-- 3 10 3 ----TTGACTCTGGTGTAATTCCAGGACCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGTTG PN AAGAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGCGGT 125 AGTCCGTG---- 3 4 3 -------GATGTGGTGAGATTCCACAACCGACAGT-AT-AGTCTGGAT-GGGAGAAGACGAAA ST AAGTGTCTTCAGGG-CAGGGTGTGATTCCCGACCGGCGGT 14 AGTCCGCG---- 3 4 3 -------GATGTGGTGTAACTCCACAACCGACAGT-AT-AGTCTGGAT-GAGAGAAGACCGGG MN AAGTGTCTTCAGGG-CAGGGTGAGATTCCCGACCGGCGGT 104 AGTCCGCG---- 3 4 3 -------GATGTGGTGAAATTCCACAACCGACAGT-AA-AGTCTGGAT-GGGAGAAGACTGAG SA ATTCATCTTCGGGG-TCGGGTGTAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG AMI TCACAGTTTCAGGG-CGGGGTGCAATTCCCCACTGGCGGT 14 AGCCCGCGC--- 5 5 5 ------TGATCTGGTGCAAATCCAGAGCCAACGGT-AT-AGTCCGGAT-GGAAGAAACGGAGC DHA ACGAACCTTCGAGG-TAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCAAC-- 11 4 11 --CGACTGACTTGGTGAGACTCCAAGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTACAA FN AATAATCTTCGGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 4 6 4 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GAGAGAAGAAAAGA GLU ---TGTTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 28 AGCCCGCGAGCG 10 4 10 GTCAGCAGATCCGGTTAAATTCCGGAGCCGACGGTCAT-AGTCCGGAT-GCAAGAGAACC---


Conserved secondary structure of the RFN-element

[Figure: consensus secondary structure of the RFN element, drawn 5' to 3', showing helices 1 to 5, a variable stem-loop and an additional stem-loop.]

Capitals: invariant (absolutely conserved) positions.

Lower case letters: strongly conserved positions.

Dashes and stars: obligatory and facultative base pairs

Degenerate positions: R = A or G; Y = C or U; K = G or U; B = not A; V = not U; N = any nucleotide; X = any nucleotide or deletion
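The degenerate-code legend maps directly onto a pattern search. The sketch below is an illustration only, not the method used in the talk: it translates a consensus into a regular expression and tries it on the B12-box consensus quoted later in the deck. The function name, the test sequence and the decision to treat lower-case (strongly conserved) symbols as strictly as capitals are all assumptions made for the example.

```python
import re

# Degenerate codes from the slide legend, written for a DNA alphabet (T for U).
# D and M are standard IUPAC codes that this legend does not list.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "V": "[ACG]", "N": "[ACGT]",
    "X": "[ACGT]?",  # any nucleotide or a deletion
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a degenerate consensus into a compiled regular expression.

    Simplification (assumption): lower-case symbols are treated exactly
    like capitals, so no mismatches are tolerated anywhere.
    """
    return re.compile("".join(CODES[symbol] for symbol in consensus.upper()))


# B12-box consensus as quoted on the B12 slide of this deck.
b12_box = consensus_to_regex("rAGYCMGgAgaCCkGCcd")

# Hypothetical test sequence: two flanking T's around one B12-box instance.
hit = b12_box.search("TTAAGTCAGGAGACCGGCCATT")
print(hit.start(), hit.group())
```

A real search would more likely score lower-case positions softly (for example with a position weight matrix) than demand an exact regex match.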


Attenuation of transcription

Marked regions: the RFN element, antiterminator, terminator.

Bam GACAAAAAAATATTGATTGTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT
BS GGACAAATGAATAAAGATTGTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------CTAAAGCCCCGAATTTTTTA--TAAATTCGGGGCTTTTTTGACGGTAAA
BQ CTATAATTTGAGCAAACAGCATCCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGGATAT 250 -----------CCAAACCCCAAGGATATTAAA--ATCCTTGGGGTTTTTTGTTTTTTTT
BE ACATAACGATATAGTGATGCATCCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGC 155 ------------TGAGCCCCCGGGGACAT--------CCCGGGGGTTTCATTTTTATTG
HD AAATTGAATAATTAATTTTTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGAAAC 148 -------------ATGCCCCGTGAGAACAAAA-----TCTCTGGGGCTTTTTTGCGCGC
CA TAATGGTAATTTAATAGGATGTTCTTCAGGGATGGGTG --- TCTGGATGAAAGAAGAAATA 34 -------------AATCTCCGAAGGATTACC----TTTCTTTGGAGATTTTTTTATTTG
DF TAAATATAAATTTAATACTTAATCTTCGGGGTAGGGTG --- TCTGGATGGAAGAAGATATT 63 ------------TAAACCCTGAGTTAATT--------CTCAGGGTTTTTTGTTTAAAAA
LLX ACTTTAGCTACAATTGAATAAATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAT 127 ----------AAAAGACCCTGAAATTTT------ATTTTAGGGTCTTATTTTTTATTAG
PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 81 ----------TGTATGCCTTGAGTAGTCCCC---TATTCAAGGTATATTTTTTTGGAGG
PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 19 ------------CGTGCTCTGAAATGATTACTTGTCATTTCAGAGCATTTTTGTTAATC
TM AAAACTGAATACAAAAGAAACGCTCTCGGGGCAGGGTG --- TCCGGATGGGAGAGAGCGTG 13 -----------ATGGGACCCGAGA----------------GGGTCCCTTTTCTTTTACA
AO ATTTGCAACAATTTTTTAATAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGATGAA 33 --------TTTACAAGCCTTGAGATCGAAAG----ATTTCAAGGCTTTTTTCATCATTA
DU AATTTTTTTAATACTATTTTAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGAAGAG 47 --------TGCATAAGCCTTGAGATCTTAG----GATTTCAAGGCTTTTTCATTAGTTA
FN TAATCGAATATGTAAAATAAAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGAATTA 18 ----------ATATTGCTCAGACTTT------------GTTTGAGCATTTTTTTATTAA
SA TATAACAATTTCATATATAATTCTTTCGGGGCAGGGTG --- TCTGGATGGGAGAAAGAATG 74 ------TTTTCTCCTTGCATCTTAATT----------GATGTGAGGATTTTTGTTTATA
DHA ACTCTTTTTAGATGAATACGAACCTTCGAGGTAGGGTG --- TCCGGATGGGAGAAGGTACA 43 -----------GTTTATGCCTCGAGGAACACCATTTCCTCGAGGCATTTTTGTTCTTTC
FN GAAAAATAAATATTAAAAATAATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGAAAAG 40 ------------CTTACCCGAATTCTAT------------AATTCGGTTTTTTTATTTT
CA AATATAAAAAAATAAAGAATGATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATATA 19 ------------TATGCCCTGACGTTTTT---------CGTTGGGGCTTTTTTAATGCT
DF AAAATTAAAAAATCAAAGAAGATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGATATT 45 ----------ATAAAAACTCGAAGATAGGG----TCTTCGAGTTTTTTGTTTTTCCTAA
BS TAATTAAATTTCATATGATCAATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 103 --AAAGAACCTTTCCGTTTTCGAGTAAGATGTGATCGAAAAGGAGAGAATGAAGTGAAA
BQ GGGAAAATAGAATATCGGTCTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 54 -------ATTCTCCCTTTGTGTAAA------------ACACAAAGGGTTTTTTCGTTCTATG
BE ATAAAAATGTATAAGCGATTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGAA 114 --------GGCAGCCTTCTTCTTGTGAGGATGAATCACGAGAAGGGGAGGAGAACAAGCATG
PN GTTTTTTGTTATGATAAAAGAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACGAA 137 --AACTTCTTCTGATTTTATAG------------AAAATTGGAGGAACCTGTTATGACA
ST TAAATCTGCTATGCTAGAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGAGAGAAGACCGG 130 ---GGAACTTCTTTCAATTTGAAA-----------AAATTGGAGGAATTTTTTAATGTC
MN ATTTTTTGATATGCTATAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACTGA 138 ----GGCCTTCTTTCGATTTGTAA-----------AAATTGGAGGAATTTTTTTATGAA
SA AAATTTAATAATGTAAAATTCATCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGATGGA 17 --------TCCTCCTATTCTTACG--------AGATGAATGGAAGGAGAAAATTGAATATG
EF AAAAAATATAATACAAGGTTCGTCTTCAGGGGCAGGGT --- GTCTGGATAAAGAAGATAGG 33 ---CTACTCTATTTTTCCCTGCAGA------------AAAATAGGGTTTTTTTGTATGA
LLX TTTTTGTGCTATAATAAAAATATCTTCAGGGCACCGTG --- TCTGGATGGAAGAAGATGAA 66 --TCAACTTCCTCGAAATTTGAAGAAT-TATTTTCTCATATTTGGAGGTTTTTTTATGT
LO ATTGTAAGAAAATATTCGTTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGTTG 79 ---ATGCACAAACTCTCCCTCAACTTTTTTTA--------GTTGAGGTTTTTTATTTGC


Attenuation of translation

Marked regions: the RFN element, antisequestor, SD-sequestor.

EC AATCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 59 ----------CTGCCCTGATTCTGGTAACCATAATTTTAGTGAGGTTTTT-------TACCATGAATCAGACGCTA
TY AACCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGGGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATGTTAATGAGGTTTTTT------TACCATGAATCAGACGCTA
KP ATCTCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATTTTAATGAGGTTTTTT------TACCATGAATCAGACGCTC
HI TTAGCTCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 41 ----------CAGCCCTGATTCTGGTATTTAATTGAAATCTCAAAT-TAGGAAAT--TACTATGAATCAGTCAATT
VK TATTTGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAGC 76 ----------CAGCCCTGATTCTGGTATCTAAATATCTTTATATTTCAAGGAATT--TACTATGAATCAGTCTATT
AB TAGGCGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 54 ----------CCGCCCTGATTCTGGTATAAATTCATCTTATTAAA--AAGGCATT---TACTATGAATCAGTCATTA
YP ATGGGGCTTATTCTCAGGGCGGGGTG --- TCCGGATGGGAGAGAGTAACG 194 ----------CCGCCCTGATTCTGGTAATCCATAATTTTTTAATGAGGTTTCT---TTACCATGAATCAGACGCTT
VC CACAACAATATTCTCAGGGCGGGGCG --- TCCGGATGAGAGAGAATGACA 83 ----------AAGCCCTGATTCTGGTCATTTTTT--------------GGAGTATT--ACCATGAATCAGTCCTCA
Spu CTATCAACAATTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAATGTAA 145 ----------ACGCCCTGATTCTGGATATTCCCATGTCGTATTTTTGAAGGATATTAA-CCATGAATCAGTCTTTA
MLO GACGTTAAAGTTCTCAGGGCGGGGTG --- TCCGGATGAAAGAGGACGAAA 44 -------CGTGCGTCCTGATTCTGGTTCGAAACGGA--------------AGGATGGACCCATGAATCAGCATTCC
AC AAGCGACATCGCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGACGACG 51 ----------CAGTCCTGAAATGTTTAACCGTAATT-------------------TACGAGAGCATTTCATATGTC
BP AAGCAGTACGTCTTCAGGGCGGGGTG --- TCCGGATGAGAGAAGATGTGC 62 ----------TAGCCCTGAAACGTTTTTCGCCATTTCCTTTTTT------------GCGAGAGCGTTTCAATGTCC
BPS AGTCAGTGCGTCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGATGTGC 86 ----------GAGCCCTGAAACGTTTTTCGCCCATTCATGTTTC-----------GCGAGGAGCGTTTCACATCATG
BU AATCAGTGCGTCTTCAGGGCGGGGTG --- GCCGGATGGAAGAAGATGTGC 99 ----------ATGCCCTGAAACGTTTTTCGCCCAACTTTT--------------GCGATGAGCGTTTCAACTATGT
REU CATCGTTACGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGATGGGC 77 ----------ATCCCCTGAAACGCCCATCCATGGAAATCCACGCAC-------------GGAGCGTTTCAATGCTG
RSO GCTTGGTACGTCTTCAGGGCGGGGTG --- TCCGGATGGAAGAAGATGTGC 80 ---------CGTGCCCTGGAACGTCTTGTCGCCCATTTCA---------------GCGAGGAGCGTTTCCATGTTG
PP GGTCGGTCGGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGGCGTCA 50 ----------TCGCCCCGAGACGTTCATCGATCATTCA------------------CGAGGAGCGTTTCATGTTCA
PY GCCGGTAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAGCGGGA 91 ----------ATGCCCTGTTTTTTCATTAAATT---------------------AAACAGGAGTCAGAACACGTGC
PU CGGCGAAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAACGGGA 68 ----------ACGCCCTGTTTTTCACAC--------------------------AAACAGGAGTCAGAACATGCAA
PA GGCCGTAACGTTCTCAGGGCGGGGTG --- CCGGATAAAGAGAGAACGGG 53 ---------AAAGCCCTGTTTTTCAC---------------------------GAAACAGGAGTTCGTCATATG--
BME CGCGGGCTTGTTCTCGGGGCGGGGTG --- TCCGGATGGAAGAGAGCGAAT 54 ----------GCGCCCTGATTCTAGTTTCGTG--------------------------AGGAACCTATGAACCAAA
CAU AATCCGAAGACCTTCGGGGCAAGGTG --- TCCGGATGGGAGAAGGTCGGC 116 ------CGCGATGCCCCGAAGGTGTG-----------------------------TTCAGGGGTGTCGCGATGAAC
TFU GTACACACGCGTGCTCCGGGGTCGGT --- GGATGGGAGGTAGTACGTGGT 58 -------GCCTTACCCCGGAGCCTGACCT-------------------------GGCTAGGGGGAAGGCTTCTCGCATG
GLU TGAGTTTTGTTCTCAGGGCGGGGCG --- TCCGGATGCAAGAGAACCG 32 ---------AAGGCCCCGAGGATTACATGCTTTTAAATCCTTTGAAAAGGGGACAAGATCATGAATCCTATAACCG
DR GAACCGACCTCTTTCGGGGCGGGGCG --- TCCGGACGAAAGAAGGAGGAG 1 GACGCTCAGCTTGCCCCCCA------------------------------------GCAGGCGGCGTCCGCGTATG
SM GTCGCAAGCGTTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAGCAAGC 45 ATCATTGGAAAAATGCCAACCCTGAAA-------------------GGCTTGAGACCATGACCATACTT
TQ TTCGGCACCTCCTTCGGGGCGGGGTG --- TCCGGATGGGAGAAGGAGGGCCACTTGCGC
AMI CTTACTCACAGTTTCAGGGCGGGGTG --- TCCGGATGGAAGAAACGGAGCGCCTTATGG


RFN: the mechanism of regulation

• Transcription attenuation

• Translation attenuation
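Transcription attenuation relies on a terminator hairpin: two complementary arms followed by a run of T's (U's in the RNA). As a rough illustration, the sketch below checks that property for the terminator in the Bam row of the transcription-attenuation alignment, with alignment gaps removed. The function names, the minimum loop length of 3 and the minimum T-run of 4 are assumptions for the example, not values taken from the talk.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def looks_like_terminator(left_arm: str, loop: str, right_arm: str,
                          tail: str, min_loop: int = 3,
                          min_t_run: int = 4) -> bool:
    """Crude terminator test: perfectly paired arms, a loop, then a T-run.

    Assumption: only perfect Watson-Crick stems are accepted; real
    terminators tolerate G-U pairs, mismatches and bulges.
    """
    arms_pair = right_arm == reverse_complement(left_arm)
    t_run = len(tail) - len(tail.lstrip("T"))  # T's at the start of the tail
    return arms_pair and len(loop) >= min_loop and t_run >= min_t_run


# Bam row, downstream part: AGCCCCGAATGT GTAA ACATTCGGGGCT TTTTGACGCCAAAT
print(looks_like_terminator("AGCCCCGAATGT", "GTAA",
                            "ACATTCGGGGCT", "TTTTGACGCCAAAT"))
```

The same complementarity test applies to translation attenuation, where the hairpin of interest sequesters the Shine-Dalgarno sequence instead of being followed by a T-run.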


Distribution of RFN-elements

Genomes                     Analyzed genomes   Genomes with RFN   RFN elements
α-proteobacteria            8                  4                  4
β-proteobacteria            7                  4                  4
γ-proteobacteria            17                 15                 15
δ- and ε-proteobacteria     3                  0                  0
Bacillus/Clostridium        12                 12                 19
Actinomycetes               9                  4                  4
Cyanobacteria               5                  0                  0
Other eubacteria            7                  5                  6
Total                       68                 47                 52


Phylogenetic tree of RFN-elements


YpaA: riboflavin transporter in Gram-positive bacteria

• 5 predicted transmembrane segments => a transporter
• Upstream RFN element (likely co-regulation with riboflavin genes) => transport of riboflavin or a precursor
• S. pyogenes, E. faecalis, Listeria sp.: ypaA, no riboflavin pathway => transport of riboflavin

Prediction: YpaA is a riboflavin transporter (Gelfand et al., 1999)

Verification:
• YpaA transports flavins (riboflavin, FMN, FAD) (by genetic analysis, Kreneva et al., 2000)
• ypaA is regulated by riboflavin (by microarray expression study, Lee et al., 2001)
• … via attenuation of transcription (and to some extent inhibition of translation) (Winkler et al., 2003)
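The prediction rests on a phyletic-pattern argument: a candidate gene that occurs in genomes lacking the biosynthetic pathway is more likely to import the end product than a precursor. A toy version of that filter is sketched below. The three pathway-less genomes come from the slide; the B. subtilis entry, the dictionary layout and the function name are assumptions added for contrast.

```python
# Presence/absence calls per genome.
GENOMES = {
    "S. pyogenes":  {"ypaA": True, "riboflavin_pathway": False},  # from the slide
    "E. faecalis":  {"ypaA": True, "riboflavin_pathway": False},  # from the slide
    "Listeria sp.": {"ypaA": True, "riboflavin_pathway": False},  # from the slide
    "B. subtilis":  {"ypaA": True, "riboflavin_pathway": True},   # assumption
}


def gene_without_pathway(genomes: dict, gene: str, pathway: str) -> list:
    """List genomes that carry the gene but lack the pathway."""
    return sorted(name for name, calls in genomes.items()
                  if calls[gene] and not calls[pathway])


# Genomes in which the candidate transporter has to supply the end product.
print(gene_without_pathway(GENOMES, "ypaA", "riboflavin_pathway"))
```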


More predicted (riboflavin) transporters

impX from Fusobacterium and Desulfitobacterium

– no similarity with any known protein; no homologs in other complete genomes

– 9 predicted TMS

– single RFN-regulated gene

pnuX from Actinomycetes (Corynebacterium, Streptomyces, Thermomonospora)

– no orthologs in other genomes

– 6 predicted TMS

– either a single gene or a part of the riboflavin operon

– regulated by RFN

– similar to the nicotinamide mononucleotide transporter PnuC from E. coli


thi-box and regulation of thiamine metabolism genes by thiamin pyrophosphate (Miranda-Rios et al., 2001)

TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA THIC_EC
TTCGGGATCCGTTGAACCTGA-TCAGGTTAA-TACCTGCG-AAGGGAACAAGAGAAG THIC_VC
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC THIC_MLO
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-CACTGGCG-TAGGGACGGTGCAGAC THIC_SM
AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG THIC_NM
TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA thiC_BS
CCGTCGACCGTACGAACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG THIC_MT
GGATCGACCCTTTGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGAAATTATGTCG THIT2_TVO
TCCTCGACCCCAAGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGG thi1_TM

Notation (colors on the original slide): red, conserved nucleotides; green, positions where purine or pyrimidine is conserved; blue, non-conserved nucleotides
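The three color classes correspond to a simple per-column rule. The sketch below applies it to the six columns that follow the last gap in each of the nine thi-box rows. The function name and the output symbols (the base itself, R, Y, N) are assumptions for illustration; the talk does not say how the coloring was produced.

```python
# Six alignment columns after the last gap, one string per row of the
# thi-box alignment (THIC_EC ... thi1_TM), in DNA notation.
SEGMENTS = [
    "AAGGGA", "AAGGGA", "TAGGGA", "TAGGGA", "TGGGGA",
    "TGGGGA", "TAGGGA", "GAGGGA", "GAGGGA",
]


def column_class(column) -> str:
    """Classify one alignment column.

    Returns the base itself if it is invariant ("red"), R or Y if only
    purines or only pyrimidines occur ("green"), and N otherwise ("blue").
    """
    bases = set(column)
    if len(bases) == 1:
        return bases.pop()
    if bases <= {"A", "G"}:
        return "R"
    if bases <= {"C", "T"}:
        return "Y"
    return "N"


# zip(*SEGMENTS) walks the alignment column by column.
consensus = "".join(column_class(col) for col in zip(*SEGMENTS))
print(consensus)  # NRGGGA
```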


Alignment of THI-elements 1 2 3 3' FACULTATIVE STEM-LOOP 2' 4 5 5' 4' 1' ----====>===> -=====> <===== ========> <======= <=== ===> =====> <===== <=== <====---- BACILLUS/CLOSTRIDIUM GROUP BS_THIC TAGTTACTGGGGGTGCCCGCT----------------TTCcgGGCTGAGAGAGAAGGCA-------------AGCTTCTTAACCCTTT---GGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGA-AGTAGAGGA BS_TENA TAACCACTAGGGGTGTCCTTC----------------ATAAGGGCTGAGATAAAAGTGT-------------GACTTTTAGACCCTCA---TAACTTGA-ACAGGTTCA-GACCTGCG-TAGGGA-AGTGGAGCG BS_YLMB TTCATCCTAGGGGTGCTTTG-------------------CGAAGCTGAGAGAGACTT-----------------TGTCTCAACCCTTT---TGACCTGA-TCTGGATCA-TGCCAGCG-GAGGGA-AGCGGTGAA BS_YKOF AAAGCACTAGGGGTGCTGT--------------------TTTGGCTGAGATAAAGCGCGGAA-----GAAACGCGCTTTGATCCCTTA---TGACCCGA-TCTGGATAA-TACCAGCG-TGGGGA-AGTGCAGGT SA_TENA GAACTACTAGGGGAGCCTAAT----------------GATATGGCTGAGATGAATT-------------------GTTCAGACCCTTA---TGACCTGA-TTTGGTTAG-TACCAACG-TAGGAA-AGTAGTTAT SA_YKOE CACACACTAGGGGTGTTT----------------------TATACTGAGATGAGGCTT---------------GCCCTCAAACCCTTT---GAACCTGA-TCTAGCTTG-AACTAGCG-TAGGAA-AGTGTTACT LLX_YUAJ TTTGCACAATGGGTCTATTGACAAA---------ACTGTCAGTAGCGAGA----------------------------AATACCATC----TGACCTGA-TCTGGGTAA-TGCCAGCG-TAGGAA-TGTGTTAAG CA_THIS ATAGTTAACGGGGAGCCTGTA-----------------GACAGGCTGAGAGTGGAATG--------------TGATTCCAGACCCTCA---TAACCTGA-TTTGGATAA-TGCCAACG-TAGGGA-GTTAATGCA CA_YUAJ TATGTGCTAGGGGTGCCTT---------------------TAGGCTGAGAAACAGTTT--------------GTCACGTTAACCCTT-----AACCTGA-TCTGGATAA-TACCAGCG-TAGGGA-AGCAGTTTG ST_YUAJ TTTCACAAAGGAGTGCTT-----------------------TGGCTGAGATCGCAA------------------TTGCGAAATCCTGA---GGACCTGA-TCTTGTTAG-TACAAGCG-TAGGGA-TTGTGACCA DHA_THIC TAATCACTAGGGGGGCCGAATA---------------AGGTCGGCTGAGATAAAGGACCCA---------AGAATCCTTTGACCCTT-----AACCTGA-TCTGGGTAA-TGCCAGCG-TAGGGAAGGTGGATAA LMO_TENA GAAAAACTAGGGGGGCCGAT-------------------TCTGGCTGAGATAGGAAGGTAAT-----------GCTTTCTGACCCTTT---GAACCTGT-TT--GTTAG-TGCAAGCG-TAGGGA-AGTGAATGT LMO_YUAJ 
TTACCACAGGGGGGGCTTC---------------------TTAGCTGAGATTGAGTCCACGTGT-----TTTTGGATTCTGACCCTTT---GAACCTGT-TC--GTTAA-TACGAGCG-TAGGGA-TTGTGGCGA PROTEOBACTERIA EC_THIB GTTCTCAACGGGGTGCCACGCGT------------ACGCGTGCGCTGAGAAA---------------------------ATACCCGTCGA---ACCTGA-TCCGGATAA-CGCCGGCG-AAGGGATTTGAGGC EC_THIM AAACGACTCGGGGTGCCCTTCTGC-------------GTGAAGGCTGAGAAA----------------------------TACCCGTATC---ACCTGA-TCTGGATAA-TGCCAGCG-TAGGGA-AGTCACG EC_THIC TTTCTTGTCGGAGTGCCTTA-------------------ACTGGCTGAGACCGTTT------------------ATTCGGGATCCGCGGA---ACCTGA-TCAGGCTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THIC CCACTTGTCGGAGTGCCAT---------------------TGGGCTGAGACCGTTT------------------ATTCGGGATCCGTTGA---ACCTGA-TCAGGTTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THID CCTGTAGTCGGGGAGCCTGAGAG-- 66 5 71 -AATTAAAGGCTGAGATCGCGT-------------------AGCGAGACCCGTTGA---ACCTGA-TTCAGTTAG-GACTGACG-TAGGGA-ACTATCC VC_THIB CCCACTCACGGGGGGCCACCCATTCAT-------CCGAATGGCGCTGAGATCAAGCAC---------------TGCTTGGGACCCGCA 21 -ACCTGA-ACCAGATAA-TGCTGGCG-TAGGAATTGAGCTA XFA_THIC TTTGAAGCGGGGGTACCATAGCCA------------AGCTGCGGTTGAGAC----------------------------ACACCCTTCGA---ACCTGA-TCCGGTTTA-CACCGGCG-TAGGAAAGCTTCGT MLO_THIC CATTCACCAGGGGAGTCCCGG----------------CAAGGGGCTGAGATACTGCTGGCTTTC------GCGGCGCAGTGACCCGTTGA---ACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAA MLO_THIB CGCTCTAACGGGGTGCCGGA------ 5 3 5 -----GACCGGCTGAGAGGCAGT------------------CTCGCCAACCCGCTGA---ACCTGA-TCCGGTTTG-TACCGGCG-GAGGGA-TTAGACG MLO_YK GCCCATCCACAGGGGTGCTCCGTAC-------------GGTCGGGGCTGAGACGGGGGCGG-----------CAAGCCCACAGACCCTAGA----AGCTGA-TCTGGGTAA-TACCAGCG-GAGCGA-GGCGGGCG NX_CITX CTCCTTGTCGGAGTGCCGCCGC---------------CGGGCGGCTGAGATTGCGA------------------AAGCAGAATCCGTAGA---ACCTGT--CGGGGTAA-TGCCTGCG-TAGGAA-ACAAACC NX_THIC ATTGAAACAGGGGTGCTGCCTGAT----------GTTTAGGCGGCTGAGAA----------------------------ATACCCTTTAC---ACCCGA-TCGGGATAA-TACCTGCG-TGGGGA-GTTTTCA ACTINOBACTERIAE MT_THIO 
CTGTAGACACGGGAGTCCCGGG--------------AGCGGGGTCTGAGAGTGGGCGCGCCT-------------GCCCTTACCGTCAC----ACCTGA-TCCGGATCA-TGCCGGCG-AAGGGAGGTCAAGGATG MT_THIC GTACCCACGCGGGAGCGCACGC--------------CGAGTGCGCTGAGAGGACGGCTCGGG------------GCCGTCGACCGTACGA---ACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG CGL_THIC CAGTCCCCACGGGCGCCCGA-----------------GCACGGGCTGAGATCGCGCTGATT---------GCTGCGCGAGCACCGTTTGA---ACCTG--TCCGGTTAG-CACCGGCG-AAGGAAGAGAGGAATGGTGCAATG CGL_THID ACTAGGCACGGGGTGCCAACCGGATGG---AAAAATTCCGGAGGCTGAGAAA---------------------------ACACCCGTTGA---ACCTGC-TCTAGCTCG-TACTAGCG-AAGGGATGGCCTTAACGTG CGL_THIE CTTACCCCACGGGTGCCCAAT---------------GCATTGGGCTGAGATTGCGCGCTGT---------TGCTGCGCGGGACCGTTCGA---ACCTG--TCTGGTTAA-CACCAGCG-AAGGAAGCGAGGATTGATTGTCCCGTG CGL_YKOE TCATAGACACGGGTGCTCGGTGA------------AAATCCGGGCTGAGATCTGGCA----------------TAGCCACGACCGTCGA----ACCTG-ATCCGGATAA-TGCCGGCG-ATAGGGAGGAAAAATATG CGL_OARX TAGTGACACGGGGTGCAAAAGCACTTT----AAAAAAGCTTTCGCTGAGATT---------------------------ACACCCGTCGA---ACCTG-ATCCAGTTAG-TACTGGCG-AAGGGACTGTCGCAT CYANOBACTERIA NPU_THIC TCCATGCTAGGGGTGCCTACAT---------------AACCAGGCTGAGATC---------------------------ACACCCTTAAC---ACCTGAGTCTGGGTAA-TACCAGCG-GAGGGAAGCTGTTTATTG CY_THIC CCATAGCTAGGGGTGTCTAGAA---------------AGCTAGGCTGAGAA----------------------------AAACCCTTAGA---ACCTGAGACTGGGTAA-TACCAGCG-GAGGGAAGCTCACCATTC AN_THIC TCCATGCTAGGGGTGCTTGCAC---------------TAACAGGCTGAGATT---------------------------ACACCCTTAAC---ACCTGAGACTGGGTAA-TACCAGCG-AAGGGAAGCTGTTTATTG THERMUS/DEINOCOCCUS, THERMOTOGALES, Fusobacterium, CFB group DR_THIB CGCGTCACCGGGGGTGCCCTGCTT------------CGGCAGCGGCTGAGAAC---------------------------ACACCCCAGGA---ACCTGA-ACCGGGTCA-TTCCGGCG-GAGGGAGTGTGATGC DR_THIC ATCGTCAACAGGGGTGCCTCCGCATA--------TGGGCCGGAGGCTGAGAGGGCAACT---------------CGGGCCTAACCCTATGA---ACCTGA-ACTGGTTAG-CACCAGCG-GAGGGA-GTGTGACG TQ_THIBGGCCGTCACCGGGGGTGCCCCA------------------AAAGGGCTGAGAGC---------------------------ATACCCTTGGA---ACCTGA-TCCGGGTCA-TGCCGGCG-TAGGGAAGGTGACGGCC TM_THI1 
CCTTCCCCAGGGGGAGCTCCTAT---------------TCCGGGGCTGAGAGGAGGACGG-------------AAGTCCTCGACCCCAAGA---ACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGGA FN_THIC TATATGTACTGGGGAGCTT----------------------TGTGCTGAGATTAGAACCT------------TTTTTCTTAGACCCATAGT---ACCT-GA-TTTGGATAA-TGCCAACG-AAGGGA—GTACCA FN_THIX ACTAGTTACAAGGGAGTTAATA-----------------AATTGACTGAGAAAAGGATG--------------TGAGCCTTGACCTTTTG----ACCT-GA-TTTGGATAA-TGCCAACG-TAGGAA--GTAAA PG_THIS AGACCGCTACGGGGGTGCTTGCCG--- 4 3 4 -GATACGGCAGGCTGAGAT---------------------------AATACCCATAG---ACCT-GA-TCCGGATAA-TACCGGCG-GAGGGAT-GTAG PG_OMR ATTGGGAGAAGGGGTGCTTCCTGTA--- 3 7 3 --GTGGATGGCTGAGAAC---------------------------AAACCCTCATC---ACCT-GA-ACCGGATAA-TACCGGCG-TAGGAAA-CTCTC BX_THIS TAAAGACAAAGGGGTGCCACC------------------CGGTGGCTGAGATT---------------------------ATACCCTAAGA---ACCT-GA-TGCAGTTAG-TACTGCCG-AAGGGA—TTGTG ARCHAEA TAC_T1 GGTGTGGTGGGGGAGCTCCAT-----------------AAGGGGCTGAGAGGATCCGG---------------ATGGATCGATCCCTGGA---ACCTGA-TCCGGGTAA-TACCGGCG-GAGGGAAATTATG FAC_T1 AGTTATACCGGGGAGCTAA---------------------AATGCTGAGAGGATAA-------------------GGATCGACCCGTGCA---ACCTGA-TCCGGACAA-TACCGGCG-GAGGGAGATGGATA


Conserved secondary structure of the THI-element

[Figure: consensus secondary structure of the THI-element, showing helices 1 to 5, the thi-box and a facultative stem-loop.]

Capitals: strongly conserved positions. Dashes and points: obligatory and facultative base pairs

Degenerate positions: R = A or G; Y = C or U; K = G or U; M= A or C; N = any nucleotide


THI: the mechanism of regulation

• Translation attenuation: Thermus/Deinococcus group, CFB group, Proteobacteria, Actinobacteria, Cyanobacteria, Archaea

• Transcription attenuation: Bacillus/Clostridium group, Thermotoga, Fusobacterium, Chloroflexus


Distribution of THI-elements

Genomes                          Analyzed genomes   Genomes with THI   THI elements
α-proteobacteria                 7                  7                  15
β-proteobacteria                 6                  6                  12
γ-proteobacteria                 18                 17                 38
δ- and ε-proteobacteria          3                  1                  1
The Bacillus/Clostridium group   18                 18                 51
Actinomycetes                    9                  9                  25
Cyanobacteria                    5                  5                  5
Other eubacteria                 14                 11                 11
Archaea (Thermoplasma)           17                 3                  6
Total                            97                 77                 164
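As a quick consistency check, the rows of the table can be summed and compared with the Total line. The sketch below does that and derives two summary figures. The tuple layout and function name are assumptions, and the Greek class prefixes are spelled out on the assumption that the rows follow the same order as the RFN table.

```python
# (taxon, genomes analyzed, genomes with THI, THI elements), copied from the table.
# Assumption: the proteobacterial rows are alpha, beta, gamma, delta/epsilon.
ROWS = [
    ("alpha-proteobacteria", 7, 7, 15),
    ("beta-proteobacteria", 6, 6, 12),
    ("gamma-proteobacteria", 18, 17, 38),
    ("delta- and epsilon-proteobacteria", 3, 1, 1),
    ("Bacillus/Clostridium group", 18, 18, 51),
    ("Actinomycetes", 9, 9, 25),
    ("Cyanobacteria", 5, 5, 5),
    ("Other eubacteria", 14, 11, 11),
    ("Archaea (Thermoplasma)", 17, 3, 6),
]
TOTAL = (97, 77, 164)  # the Total line of the table


def column_sums(rows):
    """Sum the three numeric columns over all taxa."""
    return tuple(sum(row[i] for row in rows) for i in (1, 2, 3))


analyzed, with_thi, elements = column_sums(ROWS)
print((analyzed, with_thi, elements) == TOTAL)                     # True
print(f"{with_thi / analyzed:.0%} of genomes carry a THI-element")  # 79%
print(f"{elements / with_thi:.2f} elements per such genome")        # 2.13
```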

Mandal et al., 2003: THI in 3’UTR (plants). THI in untranslated intron (fungi)


Predicted THI-regulated genes: transporters

yuaJ: predicted thiamin transporter (possibly H+-dependent)

• Found only in the Bacillus/Clostridium group;
• Occurs in genomes without the thiamin pathway (Streptococci);
• Has 6 predicted transmembrane segments (TMS);
• Regulated by THI-elements in all cases with only one exception (E. faecalis);
• In B. cereus, the thiamin uptake is coupled to proton movement (Arch Microbiol, 1977).

thiX-thiY-thiZ and ykoF-ykoE-ykoD-ykoC: predicted ATP-dependent HMP transporters

• Found in some Proteobacteria and Firmicutes;
• Not found in genomes without the thiamin pathway;
• Always co-occur with thiD and thiE;
• In Pasteurellae, Brucella and some Gram-positive cocci, they are present without thiC;
• Regulated by THI-elements in all cases with only one exception (T. maritima);
• Putative substrate-binding protein ThiY is homologous to Thi12 from yeast, known to be involved in the biosynthesis of HMP


Predicted THI-regulated genes: more transporters

• thiU from P. multocida and H. influenzae belongs to the possible thiMDE-thiU operon, has 12 predicted TMS; similar to proline permease; no orthologs in other genomes

• thiV from Methylobacillus and H. volcanii is clustered with thiamin genes or has THI-elements; has 13 predicted TMS; similar to the pantothenate symporter PanF from E. coli; no orthologs in other genomes

• thiW from S. pneumoniae and E. faecalis forms an operon with thiamin genes, has 5 predicted TMS; no homologs in other complete genomes

• pnuT from the CFB group of bacteria forms an operon with thiamin-related genes; has 6 TMS; similar to the nicotinamide mononucleotide transporter PnuC from E. coli; no orthologs in other genomes

• cytX from Neisseria and Chloroflexus has 12 TMS; similar to the cytosine permease CodB from E. coli; forms an operon with thiamin genes in Neisseria and Pyrococcus; homologs in other genomes are not regulated by THI-elements.

• thiT1 and thiT2 from three different Thermoplasma (Archaea) are two paralogous genes; have 9 TMS; belong to the MFS family of transporters. This is the first example of THI-element-regulated genes in Archaea


The PnuC family of transporters

[Figure: tree of the PnuC family of transporters, with the genes regulated by RFN elements and by THI elements marked.]


Predicted THI-regulated genes: enzymes

• thiN: non-orthologous displacement of thiE
– Separate gene in archaea or with thiD (in M. thermoautotrophicum)
– Always present if ThiD is present and ThiE is absent

• tenA: gene of unknown function associated with thiD
– Found in most firmicutes, some proteobacteria and archaea; ThiD-TenA gene fusions in some eukaryotes
– Forms clusters with thiD and other THI-element-regulated genes in most bacteria
– A single tenA gene is also regulated by THI-elements in some bacteria
– Not found in genomes without the thiamin pathway
– Always co-occurs with the thiD and thiE genes

• tenI: gene of unknown function, thiE paralog
– Found in some unrelated bacteria
– Forms a separate branch in the phylogenetic tree for thiE
– In most bacteria, located in clusters of THI-element-regulated genes

• ylmB from Bacilli belongs to the ArgE/dapE/ACY1/CPG2/yscS family of metallopeptidases; regulated by the THI-elements in B. subtilis and B. halodurans, not regulated in B. cereus.

• thi-4 from Thermotoga maritima belongs to a family of putative thiamine biosynthetic enzymes from archaea and eukaryotes. Located in one operon with thiC and thiD.

• oarX from Methylobacillus and Staphylococcus is a single THI-elements-regulated gene; belongs to the short-chain dehydrogenase/reductase (SDR) superfamily

Page 21: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Metabolic reconstruction of the thiamin biosynthesis

= thiN (confirmed)

(Gram-positive bacteria)

(Gram-negative bacteria)

Transport of HMP

Transport of HET

Page 22: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

THI-elements in delta-proteobacteria: co-operative binding?

• Tandem arrangement of THI-elements upstream of the main thiamine operon thiSGHFE1 in Desulfovibrio spp.

• Tandem arrangement of glycine riboswitches in B. subtilis and V. cholerae (Mandal et al., 2004):
– co-operative binding of the cofactor (glycine)
– rapid activation/repression
– same arrangement in all glycine riboswitches
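One way to see why co-operative binding gives a sharper, more switch-like response is the standard Hill equation. The sketch below is a generic textbook illustration, not a model taken from the talk: the function names, the dissociation constant `kd` and the Hill coefficients 1 and 2 are assumptions chosen only for illustration.

```python
def fraction_bound(ligand: float, kd: float, n: float) -> float:
    """Hill equation: fraction of binding sites occupied at a ligand concentration."""
    if ligand < 0 or kd <= 0 or n <= 0:
        raise ValueError("ligand must be >= 0; kd and n must be > 0")
    return ligand**n / (kd**n + ligand**n)


def switching_ratio(kd: float, n: float, low: float = 0.1, high: float = 0.9) -> float:
    """Fold increase in ligand needed to go from `low` to `high` occupancy."""
    if not 0 < low < high < 1:
        raise ValueError("require 0 < low < high < 1")

    def concentration_at(theta: float) -> float:
        # Hill equation solved for the ligand concentration.
        return kd * (theta / (1 - theta)) ** (1 / n)

    return concentration_at(high) / concentration_at(low)


if __name__ == "__main__":
    # n = 1: hypothetical single, non-co-operative element.
    # n = 2: hypothetical tandem pair with fully co-operative binding.
    for hill_coefficient in (1, 2):
        ratio = switching_ratio(kd=1.0, n=hill_coefficient)
        print(f"n={hill_coefficient}: {ratio:.1f}-fold change in ligand")
```

Under these assumptions a non-co-operative site needs an 81-fold rise in ligand to move from 10% to 90% occupancy, whereas a Hill coefficient of 2 needs only a 9-fold rise, which is consistent with the "rapid activation/repression" attributed to the tandem arrangement.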

Page 23: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

B12-box and regulation of cobalamin metabolism genes by pyrophosphate (Nou & Kadner, 2000; Ravnum & Andersson, 2001; Nahvi et al., 2002)

• Long mRNA leader is essential for regulation of btuB by vitamin B12.

• Involvement of highly conserved B12-box rAGYCMGgAgaCCkGCcd in regulation of the cobalamin biosynthetic genes (E. coli, S. typhimurium)

• Post-transcriptional regulation: an RBS-sequestering hairpin is essential for regulation of the btuB and cbiA genes

• Ado-CBL is an effector molecule involved in the regulation of the cobalamin biosynthesis genes
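A consensus such as the B12-box can be turned into a search pattern by expanding each IUPAC nucleotide code into a character class. The sketch below is an illustration only, not the software used for the analyses in this talk: the function names are hypothetical, and treating lower-case letters as weakly conserved positions is an assumption about the notation on the slide.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (U is treated as T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

B12_BOX = "rAGYCMGgAgaCCkGCcd"  # consensus exactly as written on the slide


def consensus_to_regex(consensus: str, uppercase_only: bool = False) -> re.Pattern:
    """Expand an IUPAC consensus into a regex over A, C, G, T.

    If uppercase_only is True, lower-case positions accept any base
    (assumption: lower case marks weaker conservation).
    """
    parts = []
    for symbol in consensus:
        if uppercase_only and symbol.islower():
            parts.append("[ACGT]")
        else:
            parts.append("[" + IUPAC[symbol.upper()] + "]")
    return re.compile("".join(parts))


def find_boxes(sequence: str, consensus: str = B12_BOX, uppercase_only: bool = False):
    """Return (start, matched_text) for every occurrence, overlaps included."""
    seq = sequence.upper().replace("U", "T")
    pattern = consensus_to_regex(consensus, uppercase_only)
    # A lookahead wrapper lets finditer report overlapping matches.
    lookahead = re.compile("(?=(" + pattern.pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence built to satisfy the consensus.
    print(find_boxes("TTTAAGCCAGGAGACCGGCCATTT"))
```

A sequence-only scan like this would be expected to produce false positives in a real genome; the slides that follow stress the conserved secondary structure around the box, which a regular expression cannot capture.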

Page 24: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Conserved RNA secondary structure of the regulatory B12-element

[Figure: consensus secondary structure of the B12-element. Labels: 5' and 3' ends, base stem, helices P0 to P6, B12 box, VS, BI, BII, additional stem-loops Add-I and Add-II, facultative stem-loop. Legend: the Bacillus/Clostridium group; -proteobacteria; other taxonomic groups.]

Page 25: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

[Figure, panels A and B: the B12-element (helices P0 to P6) drawn in two states, labelled +Ado-CBL and Ado-CBL, with a pseudoknot. Panel A labels: terminator, antiterminator, strands 1, 2, 3. Panel B labels: RBS-sequestor hairpin, antisequestor, strands 1, 2.]

The predicted mechanism of the B12-mediated regulation of cobalamin genes

Page 26: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

B12-element regulates cobalamin biosynthetic genes and transporters, cobalt transporters and a number of other cobalamin-related genes.

Distribution of B12-elements in bacterial genomes

Page 27: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Metabolic reconstruction of cobalamin biosynthesis: new enzymes and transporters

Cobalt ion transport: cbiMNQO, hoxN, hupE, cbtAB, cbtC, cbtD, cbtE, cbtG, cnoABCD

Page 28: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

If a bacterial genome contains B12-dependent and B12-independent isoenzymes, the genes encoding the B12-independent isoenzymes are regulated by B12-elements

Ribonucleotide reductases

NrdJ (B12-dependent) | NrdAB/NrdDG (B12-independent)
+ | –
– | +
+ | +

Methionine synthase

MetH (B12-dependent) | MetE (B12-independent)
+ | –
– | +
+ | +

[Figure labels: B12 (three occurrences; positions not recoverable)]
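The rule stated on this slide can be written as a small predicate over the gene content of a genome. This is a hedged sketch: the function name and data layout are hypothetical, and treating operons such as nrdAB or nrdDG as single tokens is a simplification made here for illustration.

```python
from typing import Dict, Set, Tuple

# Isoenzyme pairs from the slide: (B12-dependent, B12-independent).
# Gene tokens are simplified; e.g. "nrdAB" stands for the whole operon.
ISOENZYME_PAIRS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "ribonucleotide reductase": ({"nrdJ"}, {"nrdAB", "nrdDG"}),
    "methionine synthase": ({"metH"}, {"metE"}),
}


def predicted_b12_regulated(genes: Set[str]) -> Set[str]:
    """Return the B12-independent genes expected to carry a B12-element.

    Applies the slide's rule: only when both the B12-dependent and the
    B12-independent isoenzyme are present is the latter predicted to be
    regulated by a B12-element.
    """
    regulated: Set[str] = set()
    for dependent, independent in ISOENZYME_PAIRS.values():
        if genes & dependent and genes & independent:
            regulated |= genes & independent
    return regulated


if __name__ == "__main__":
    # Hypothetical genome with both methionine synthases but only NrdJ.
    print(predicted_b12_regulated({"metH", "metE", "nrdJ"}))
```

The biological reading is that the B12-independent enzyme is switched off when coenzyme B12 is available and the B12-dependent enzyme can do the job.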

Page 29: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

LYS-element: lysine riboswitch

[Figure: consensus secondary structure of the LYS-element. Labels: 5' and 3' ends, base stem, helices P1 to P7.]

Page 30: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Reconstruction of the lysine metabolism

[Pathway figure. Metabolites: L-aspartate; β-aspartyl-phosphate; aspartate semialdehyde; homoserine; dihydrodipicolinate; tetrahydrodipicolinate; N-acetyl-2-amino-6-ketopimelate / N-succinyl-2-amino-6-ketopimelate; N-acetyl-L,L-diaminopimelate / N-succinyl-L,L-diaminopimelate; L,L-diaminopimelate; meso-diaminopimelate. Gene labels: lysC, dapG, yclM; lysC, thrA, metL; asd; hom; thrA, metL; dapA; dapB; dapD; ykuR; dapC (argD); ddh; patA; dapE; dapF, dal; lysA. Also shown: lysine transport.]

predicted genes are boxed (pathway of acetylated intermediates in B. subtilis)

Page 31: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Regulation of lysine catabolism: the first example of an activating riboswitch

• LYS-elements upstream of the pspFkamADEatoDA operon in Thermoanaerobacter tengcongensis and the kamADElysE operon in Fusobacterium nucleatum
– lysine catabolism pathway
– LYS element overlaps candidate terminator

=> acts as activator

• similar architecture of activating adenine riboswitch upstream of purine efflux pump ydhL (pbuE) in B. subtilis (Mandal and Breaker, 2004)
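The difference between a repressing and an activating architecture can be reduced to a toy Boolean model. This is a deliberately simplified sketch, not the author's analysis: the function name and the two architecture labels are hypothetical, and it assumes that ligand binding stabilizes the folded element and that the terminator is the only determinant of expression.

```python
def genes_expressed(ligand_bound: bool, architecture: str) -> bool:
    """Toy model of transcriptional control by a riboswitch.

    architecture:
      "repressing" - assumed classic layout: the folded, ligand-bound
                     element sequesters the antiterminator, so the
                     terminator forms.
      "activating" - layout described on the slide: the element overlaps
                     the terminator, so the folded element excludes it.
    """
    if architecture == "repressing":
        terminator_forms = ligand_bound
    elif architecture == "activating":
        terminator_forms = not ligand_bound
    else:
        raise ValueError(f"unknown architecture: {architecture!r}")
    # Transcription reads through only when no terminator forms.
    return not terminator_forms


if __name__ == "__main__":
    for architecture in ("repressing", "activating"):
        for ligand_bound in (False, True):
            state = "ON" if genes_expressed(ligand_bound, architecture) else "OFF"
            print(f"{architecture:10s} ligand_bound={ligand_bound!s:5s} -> {state}")
```

In this picture excess lysine switches the catabolic operon on, which fits a pathway whose job is to degrade the ligand.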

Page 32: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

S-box (SAM riboswitch)

[Figure: consensus secondary structure of the S-box. Labels: 5' and 3' ends, base stem, helices P1 to P5.]

Page 33: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Reconstruction of the methionine metabolism

[Pathway figure. Metabolites: threonine; aspartate semialdehyde; homoserine; O-acetylhomoserine; sulfide; cystathionine; homocysteine; methionine; S-adenosyl-methionine (SAM); S-adenosyl-homocysteine (SAH); S-ribosyl-homocysteine (SRH); MTA; methylthioribose (MTR); THF; methylene-THF; methyl-THF; CH3. Gene labels: hom; metX; metY; metI; metB; cysH-...metB; metC; yrhA; yrhB; metF; metH; metE; metK; mtn; mtnKSUVWXYZ; yxjH*.]

predicted genes are marked by * (transport, salvage cycle)

Page 34: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

A new family of amino acid transporters

Legend: S-box (rectangle frame); MetJ (circle frame); LYS-element (circles); Tyr-T-box (rectangles)

[Figure: phylogenetic tree of the transporter family. Branch labels: LysT, MetT, TyrT, MleN, malate/lactate. Taxon labels: Archaea, clostridia, Pasteurellaceae. Leaves are labelled with gene identifiers (for example BC1434, FN0978, OB2874, CPE2317, SA2117).]

Page 35: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Regulation of reverse pathway Met-Cys in Clostridium acetobutylicum

[Figure labels: genes ubiG, yrhA, yrhB; sense transcript; antisense transcript; Cys-T-box; S-box; cysteine; S-adenosylmethionine; AA.]

Page 36: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Three methionine regulatory systems in Gram-positive bacteria: loss of S-box regulons

• S-boxes (riboswitch)
– Bacillales
– Clostridiales
– the Zoo:
  • Petrotoga
  • actinobacteria (Streptomyces, Thermobifida)
  • Chlorobium, Chloroflexus, Cytophaga
  • Fusobacterium
  • Deinococcus
  • proteobacteria (Xanthomonas, Geobacter)

• Met-T-boxes (Met-tRNA-dependent attenuator)
– Lactobacillales

• MET-boxes (transcription factor MtaR)
– Streptococcales

[Tree figure labels: Lact., Strep., Bac., Clostr., ZOO]

MetJ, MetR in proteobacteria

Page 37: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Riboswitches in the Sargasso Sea metagenome

• 125 THI-elements

• 38 LYS-elements

• 25 B12-elements

• 9 RFN-elements

• 3 S-boxes
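The counts above are easier to compare as shares of the total. The snippet below is simple arithmetic on the numbers listed on this slide; the variable names are illustrative.

```python
# Riboswitch counts reported on the slide for the Sargasso Sea metagenome.
counts = {
    "THI-element": 125,
    "LYS-element": 38,
    "B12-element": 25,
    "RFN-element": 9,
    "S-box": 3,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

if __name__ == "__main__":
    print(f"total elements: {total}")
    for name, share in shares.items():
        print(f"{name:12s} {counts[name]:4d}  {share:6.1%}")
```

THI-elements make up well over half of the 200 elements found, while S-boxes account for under 2%.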

Page 38: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Conserved structures of known riboswitches

[Figure: consensus secondary structures of the RFN-element, THI-element, B12-element, S-box, G-box and LYS-element. Each is drawn with its 5' and 3' ends, base stem and numbered helices (P1, P2, ...); additional and variable regions are labelled Add, Add I, Add II, Add III and Var; the B12 box is marked in the B12-element.]

Page 39: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Characterized riboswitches (more are predicted)

Element | Regulated function | Effector | Distribution
RFN | Riboflavin biosynthesis and transport | FMN (flavin mononucleotide) | Bacillus/Clostridium group, proteobacteria, actinobacteria, other bacteria
THI | Biosynthesis and transport of thiamin and related compounds | TPP (thiamin pyrophosphate) | Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, other bacteria, archaea (thermoplasmas), plants, fungi
B12 | Biosynthesis of cobalamin, transport of cobalt, cobalamin-dependent enzymes | Coenzyme B12 (adenosyl-cobalamin) | Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, spirochaetes, other bacteria
S-box | Metabolism of methionine and cysteine | SAM (S-adenosyl-methionine) | Bacillus/Clostridium group and some other bacteria
LYS | Lysine metabolism | lysine | Bacillus/Clostridium group, enterobacteria, other bacteria
G-box | Metabolism of purines | purines | Bacillus/Clostridium group and some other bacteria
glmS | Synthesis of glucosamine-6-phosphate | glucosamine-6-phosphate | Bacillus/Clostridium group
gcvT | Catabolism of glycine | glycine | Bacillus/Clostridium group

Page 40: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Mechanisms

[Figure, panels A to C: an RNA-element in the 5' region upstream of GENES, drawn in the + Effector and - Effector states. Labels: regulatory hairpin (terminator of transcription and/or RBS-sequestor); antiterminator/antisequestor; numbered strands 1, 2, 3; UUUUUUUU run; "In the case of regulation of transcription"; "In the case of regulation of translation".]

glmS: ribozyme, cleaves its mRNA (the Breaker group)

Page 41: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

Properties of riboswitches

• Direct binding of ligands
• Same structure – different mechanisms
• Distribution in all taxonomic groups
– diverse bacteria
– archaea – thermoplasmas
– eukaryotes – plants and fungi
• Lineage-specific features…
• … horizontal transfer, duplications, lineage-specific loss
• Correlation of the mechanism and taxonomy:
– attenuation of transcription (anti-anti-terminator) – Bacillus/Clostridium group
– attenuation of translation (anti-anti-sequestor of translation initiation) – proteobacteria
– attenuation of translation (direct sequestor of translation initiation) – actinobacteria

Page 42: Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004

• Andrei Mironov
– software genome analysis, conserved RNA patterns

• Alexei Vitreschak
– analysis of RNA structures

• Dmitry Rodionov
– metabolic reconstruction

• Support:
– Howard Hughes Medical Institute
– INTAS
– Russian Fund of Basic Research
– Russian Academy of Sciences