Figure Legends for Supplementary Materials

Figure S1 Complete amino acid alignment of 88 Arabidopsis UGTs. Nine conserved motifs are numbered at the bottom of the alignment and highlighted in shaded areas.
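The alignment in this figure is reproduced as running text, so each 100-column block lists a sequence label followed by its space-separated residue chunks. A minimal Python sketch for regrouping one such block by sequence is given here. It assumes that every label follows the `ugt73c3`-style pattern seen in the figure and that aligned rows contain only capital letters and `.` gap characters. The function name `parse_block` is hypothetical and not part of the original materials.

```python
import re

# Assumption: labels look like "ugt73c3" or "ugt76e12" (lowercase, as in the figure).
NAME = re.compile(r"(ugt\d+[a-z]\d+)")
# Assumption: aligned chunks use only capital residue letters and '.' for gaps.
RESIDUES = re.compile(r"^[A-Z.]+$")


def parse_block(text):
    """Split one flattened alignment block into {label: aligned_row}."""
    # With a capturing group, re.split keeps the labels:
    # [preamble, label1, body1, label2, body2, ...]
    parts = NAME.split(text)
    rows = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        # Drop anything that is not a residue chunk (column numbers, "Motif 1", ...).
        chunks = [tok for tok in body.split() if RESIDUES.match(tok)]
        rows[name] = "".join(chunks)
    return rows


sample = (
    "1 100 ugt73c3 MATEKTHQ.. .FHP.SLHFV "
    "ugt73c4 MASEKSHK.. .VHP.PLHFI Motif 1"
)
rows = parse_block(sample)
for name, row in rows.items():
    print(name, row)
```

Splitting on the label pattern rather than on whitespace also separates labels that were fused onto the preceding chunk during text extraction. Text that precedes the first label in a block (for example, a row continued from the previous page) is ignored by this sketch and would need to be rejoined by hand.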

1 100 ugt73c3 MATEKTHQ.. .FHP.SLHFV LFPFMAQG.H MIPMIDIARL LAQR...G.. VTITIVTTPH NAARF.KNVL NRA.....IE SGLA..INIL HVKFPYQEFG ugt73c4 MASEKSHK.. .VHP.PLHFI LFPFMAQG.H MIPMIDIARL LAQR...G.. ATVTIVTTRY NAGRF.ENVL SRA.....ME SGLP..INIV HVNFPYQEFG ugt73c2 MAFEKTRQ.. .FLP.PLHFV LFPFMAQG.H MIPMVDIARI LAQR...G.. VTITIVTTPH NAARF.KDVL NRA.....IQ SGLH..IRVE HVKFPFQEAG ugt73c5 MVSETTK... ..SS.PLHFV LFPFMAQG.H MIPMVDIARL LAQR...G.. VIITIVTTPH NAARF.KNVL NRA.....IE SGLP..INLV QVKFPYLEAG ugt73c6 MAFEKNN... ..EPFPLHFV LFPFMAQG.H MIPMVDIARL LAQR...G.. VLITIVTTPH NAARF.KNVL NRA.....IE SGLP..INLV QVKFPYQEAG ugt73c1 MASE...... .FRP.PLHFV LFPFMAQG.H MIPMVDIARL LAQR...G.. VTITIVTTPQ NAGRF.KNVL SRA.....IQ SGLP..INLV QVKFPSQESG ugt73c7 MCSH.D.... .....PLHFV VIPFMAQG.H MIPLVDISRL LSQRQ..G.. VTVCIITTTQ NVAKI.KTSL SFS.....SL FAT...INIV EVKFLSQQTG ugt73d1 MESKIVS... ..KAKRLHFV LIPLMAQG.H LIPMVDISKI LARQ...G.. NIVTIVTTPQ NASRF.AKTV DRA..R..LE SGLE..INVV KFPIPYKEFG ugt73b4 MNRE...... .....QIHIL FFPFMAHG.H MIPLLDMAKL FARR...G.. AKSTLLTTPI NAKIL.EKPI EAF..KV.QN PDLE..IGIK ILNFPCVELG ugt73b5 MNREV..... ..SE.RIHIL FFPFMAQG.H MIPILDMAKL FSRR...G.. AKSTLLTTPI NAKIF.EKPI EAF..KN.QN PDLE..IGIK IFNFPCVELG ugt73b2 MGSDHH.... ..HR.KLHVM FFPFMAYG.H MIPTLDMAKL FSSR...G.. AKSTILTTSL NSKIL.QKPI DTF..KN.LN PGLE..IDIQ IFNFPCVELG ugt73b3 MSSDP..... ..HR.KLHVV FFPFMAYG.H MIPTLDMAKL FSSR...G.. AKSTILTTPL NSKIF.QKPI ERF..KN.LN PSFE..IDIQ IFDFPCVDLG ugt73b1 MGTPVE.... ..VS.KLHFL LFPFMAHG.H MIPTLDMAKL FATK...G.. AKSTILTTPL NAKLFFEKPI KSF..NQ.DN PGLED.ITIQ ILNFPCTELG ugt90a2 MEL....... ....EKVHVV LFPYLSKG.H MIPMLQLARL LLSHSFAG.D ISVTVFTTPL NRPFI.VDSL SGT....... ....K.ATIV DVPFPDNVPE ugt90a1 MSVS...... ...THHHHVV LFPFMSKG.H IIPLLQFGRL LLRHHRKEPT ITVTVFTTPK NQPFI.SDFL SDT..P.... ....E.IKVI SLPFPENITG ugt85a2 MGS....... .HVAQKQHVV CVPYPAQG.H INPMMKVAKL LYAK...G.. FHITFVNTVY NHNR..LLR. SRG..PN.AV DG......LP SFRFESIP.. ugt85a1 MGSQII.... .HNSQKPHVV CVPYPAQG.H INPMMRVAKL LHAR...G.. FYVTFVNTVY NHNR..FLR. 
SRG..SN.AL DG......LP SFRFESIA.. ugt85a4 MEQHGG.... .SSSQKPHAM CIPYPAQG.H INPMLKLAKL LHAR...G.. FHVTFVNTDY NHRR..ILQ. SRG..PH.AL NG......LP SFRFETIP.. ugt76e1 MEELGV.... .....KRRIV LVPVPAQG.H VTPIMQLGKA LYSK...G.. FSITVVLTQY NR....VSS. S....KD.FS .......... DFHFLTIP.. ugt76e2 MEEKQV.... ....KETRIV LVPVPAQG.H VTPMMQLGKA LHSK...G.. FSITVVLTQS NR....VSS. S....KD.FS .......... DFHFLTIP.. ugt76e4 MEKRVE.... .....KRRIV LVPVAAQG.H VTPMMQLGKA LQSK...G.. FLITVAQRQF NQ....IGS. SL...QH.FP .......... GFDFVTIP.. ugt76e6 MEKMEE.... .....KKRIV LVPVPAQR.H VTPMMQLGTA LNMK...G.. FSITVVEGQF NK....VSS. S....QN.FP .......... GFQFVTIPD. ugt76e5 .EKNAE.... .....KKRIV LVPFPLQG.H ITPMMQLGQA LNLK...G.. FSITVALGDS NR....VSS. T....QH.FP .......... GFQFVTIP.. ugt76e3 MEKRVE.... .....KRRIV LVPLPLLG.H FTPMMQLGQA LILK...G.. FSIIVPQGEF NR....VNS. S....QK.FP .......... GFQFITIP.. ugt76e9 MEEKQE.... ....RRRRIV LIPAPAQG.H ISPMMQLARA LHLK...G.. FSITVAQTKF NY....LKP. S....KD.LA .......... DFQFITIP.. ugt76e7 MEEKLS.... ....RRRRVV LVPVPAQG.H ITPMIQLAKA LHSK...G.. FSITVVQTKF NY....LNP. S....ND.LS .......... DFQFVTIP.. ugt76e12 MEEKPA.... .....RRSVV LVPFPAQG.H ISPMMQLAKT LHLK...G.. FSITVVQTKF NY....FSP. S....DD.FT H......... DFQFVTIP.. ugt76e11 MEEKPA.... .....GRRVV LVAVPAQG.H ISPIMQLAKT LHLK...G.. FSITIAQTKF NY....FSP. S....DD.FT .......... DFQFVTIP.. ugt76d1 MAEIRQ.... ......RRVL MVPAPFQG.H LPSMMNLASY LSSQ...G.. FSITIVRNEF NF....KDI. S....HN.FP .......... GIKFFTIK.. ugt76c4 MEKSNG.... ......LRVI LFPLPLQG.C INPMIQLAKI LHSR...G.. FSITVIHTCF NA....PKA. S....SH.PL .......... .FTFIQIQ.. ugt76c3 MDKSNG.... ......LRVI LFPLPLQG.C INPMIQLAKI LHSR...G.. FSITVIHTRF NA....PKA. S....NH.PL .......... .FTFLQIP.. ugt76c5 MEKSNG.... ......LRVI LFPLPLQG.C INPMIQLAKI LHSR...G.. FSITVIHTCF NA....PKA. S....SH.PL .......... .FTFLEIP.. ugt76c1 MEKRNE.... ......RQVI LFPLPLQG.C INPMLQLAKI LYSR...G.. FSITIIHTRF NA....PKS. S....DH.PL .......... .FTFLQIR.. ugt86a1 MERAKS.... ....RKPHIM MIPYPLQG.H VIPFVHLAIK LASH...G.. 
FTITFVNTDS IHHH..IST. AHQDDAG.DI FSAARSSGQH DIRYTTVS.. ugt86a2 MADVRNPTKN HHGHHHLHAL LIPYPFQG.H VNPFVHLAIK LASQ...G.. ITVTFVNTHY IHHQ..ITN. GSD..G..DI FAGVRSESGL DIRYATVS.. ugt87a1 MNPIKP.... .QPLGVRHVV AMPWPGRG.H INPMLNLCKS LVRRD.PN.. LTVTFVVTEE WLGFI.GSD. P....K.... ........PN RIHFATLP.. ugt87a2 MDPNES.... .PPNQFRHVV AMPYPGRG.H INPMMNLCKR LVRRY.PN.. LHVTFVVTEE WLGFI.GPD. P....K.... ........PD RIHFSTLP.. ugt75b2 MAQ....... ......PHFL LVTFPAQG.H VNPSLRFARR LIKT..TG.. ARVTFATCLS VIHRS.MIP. N....HN.N. ........VE NLSFLTFS.. ugt75b1 MAP....... ......PHFL LVTFPAQG.H VNPSLRFARR LIKR..TG.. ARVTFVTCVS VFHNS.MIA. N....HN.K. ........VE NLSFLTFS.. ugt75d1 MANNNS.... .NSPTGPHFL FVTFPAQG.H INPSLELAKR LAGTI.SG.. ARVTFAASIS AYNRR.MFS. T....EN.V. ........PE TLIFATYS.. ugt75c1 MATSVN.... .GSHRRPHYL LVTFPAQG.H INPALQLANR LIHH...G.. ATVTYSTAVS AHRR...MG. E....PP.S. ........TK GLSFAWFT.. ugt84a3 MD........ .PS.RHTHVM LVSFPGQG.H VNPLLRLGKL IASK...G.. LLVTFVTTEK PWGKK.MRQ. A....NK.I. ........QD GVLKPVGLGF ugt84a4 MEM....... .ES.SLPHVM LVSFPGQG.H ISPLLRLGKI IASK...G.. LIVTFVTTEE PLGKK.MRQ. A....NN.I. ........QD GVLKPVGLGF ugt84a2 MELESS.... .PP.LPPHVM LVSFPGQG.H VNPLLRLGKL LASK...G.. LLITFVTTES .WGKK.MRI. S....NK.I. ........QD RVLKPVGKGY ugt84a1 MVFETC.... .PSPNPIHVM LVSFQGQG.H VNPLLRLGKL IASK...G.. LLVTFVTTEL .WGKK.MRQ. A....NK.I. ........VD GELKPVGSGS ugt84b1 MGSSEG.... ...QE.THVL MVTLPFQG.H INPMLKLAKH LSLS...S.. KNLHINLATI ESARD.LLS. .....T.... ........VE KPRYPVD... ugt84b2 MGSNEG.... ...QE.THVL MVALAFQG.H LNPMLKFAKH LART...... .NLHFTLATT EQARD.LLS. S....T.... ........AD EPHRPVD... ugt74e1 MRE....... ....G.SHVI VLPFPAQG.H ITPMSQFCKR LASK...S.. LKITLVLVSD KPSP..PYK. T....EH.D. .......... ..TITVVP.. ugt74e2 MRE....... ....G.SHLI VLPFPGQG.H ITPMSQFCKR LASK...G.. LKLTLVLVSD KPSP..PYK. T....EH.D. .......... ..SITVFP.. ugt74d1 MGE....... ...KAKANVL VFSFPIQG.H INPLLQFSKR LLSK...N.. VNVTFLTTSS THNS..ILR. R....AI.TG G.......AT ALPLSFVP.. ugt74c1 MSEA...... 
....KKGHVL FFPYPLQG.H INPMIQLAKR LSKK...G.. ITSTLIIASK DHRE..PYT. S....DD.Y. .......... ..SITVHT.. ugt74f1 MEK....... ....MRGHVL AVPFPSQG.H ITPIRQFCKR LHSK...G.. FKTTHTLTTF IFNT..IHL. D....PS.S. .......... ..PISIAT.. ugt74f2 MEH....... ....KRGHVL AVPYPTQG.H ITPFRQFCKR LHFK...G.. LKTTLALTTF VFNS..INP. D....LS.G. .......... ..PISIAT.. ugt74b1 MAETTP.... ...KVKGHVV ILPYPVQG.H LNPMVQFAKR LVSK...N.. VKVTIATTTY TASS..ITT. .....PS... .......... ...LSVEP.. ugt72e2 MHIT...... .....KPHAA MFSSPGMG.H VIPVIELGKR LSANN.GF.. H.VTVFVLE. ....T.DAA. S.A..QSKFL NSTG...... .VDIVKLP.. ugt72e3 MHIT...... .....KPHAA MFSSPGMG.H VLPVIELAKR LSANH.GF.. H.VTVFVLE. ....T.DAA. S.V..QSKLL NSTG...... .VDIVNLP.. ugt72e1 MKIT...... .....KPHVA MFASPGMG.H IIPVIELGKR LAGSH.GF.. D.VTIFVLE. ....T.DAA. S.A..QSQFL NSPGCDA..A LVDIVGLP.. ugt72d1 MDQ....... ......PHAL LVASPGLG.H LIPILELGNR LSSVL.NI.. H.VTILAVT. ....S.GSS. SPT..ETEAI HAAAAR...T ICQITEIP.. ugt72c1 ME........ ......LHGA LVASPGMG.H AVPILELGKH LLNHH.GF.. DRVTVF.LV. ....T.DDV. S.R..SKSLI GKTLMEE.DP KFVIRFIP.. ugt72b1 MEESK..... .....TPHVA IIPSPGMG.H LIPLVEFAKR LVHLH.GL.. T.VTFVIAG. ....E.GPP. SKA..QRTVL DSLP.....S SISSVFLP.. ugt71b7 M......... .....KFELV FIPYPGIG.H LRSTVEMAKL LVDRE.TR.. LSISVIILP. ...FI.SEG. EVG..ASDYI AALSASS.NN RLRYEVIS.. ugt71b8 MN........ .....KFALV FVPFPILG.H LKSTAEMAKL LVEQE.TR.. LSISIIILP. ...LL.SGD. DVS..ASAYI SALSAAS.ND RLHYEVIS.. ugt71b6 M......... .....KIELV FIPSPAIS.H LMATVEMAEQ LVDKN.DN.. LSITVIIIS. ....F.SSK. N....TS.MI TSLTS...NN RLRYEIIS.. ugt71b5 M......... .....KIELV FIPLPGIG.H LRPTVKLAKQ LIGSE.NR.. LSITIIIIP. ..SRF.DAG. DA...SA.CI ASLTTLSQDD RLHYESIS.. ugt71b2 M......... .....KLELV FIPSPGDG.H LRPLVEVAKL HVDRD.DH.. LSITIIIIPQ .MHGF.SSS. NS...SS.YI ASLSSDS.EE RLSYNVLS.. ugt71b1 M......... .....KVELV FIPSPGVG.H IRATTALAKL LVASD.NR.. LSVTLIVIP. ..SRV.SD.. DA...SS... .SVYTNS.ED RLRYILLP.. ugt71c1 MGKQE..... .....DAELV IIPFPFSG.H ILATIELAKR LISQD.NP.. RIHTITILYW GLPFI.PQA. DT...IA.FL RSLVKN..EP RIRLVTLP.. 
ugt71c2 MAKQQ..... .....EAELI FIPFPIPG.H ILATIELAKR LISHQ.PS.. RIHTITILHW SLPFL.PQS. DT...IA.FL KSLIET..ES RIRLITLP.. ugt71d1 MR........ .....NVELI FIPTPTVG.H LVPFLEFARR LIEQD.DR.. IRITILLMK. ......LQG. QSH..LDTYV KSIASS..QP FVRFIDVP.. ugt71d2 MR........ .....NAELI FIPTPTVG.H LVPFLEFARR LIEQD.DR.. IRITFLLMK. ......QQG. QSH..LDSYV KTISSS..LP FVRFIDVP.. ugt88a1 MG........ .....EEAIV LYPAPPIG.H LVSMVELGKT ILSK...NPS LSIHIILVPP PYQ...PES. T....ATYIS SVSSS...FP SITFHHLP.. ugt79b5 MG........ ....SKFHAF MYPWFGFG.H MIPYLHLANK LAEK...G.. HRVTFF.LPK KA....HKQ. .....LQ.PL NLFPDSIVFE PLTLPPVDG. ugt79b4 MG........ ....SKFHAF LYPWFGFG.H MIPYLHLANK LAEK...G.. HRVTFL.APK KA....QKQ. .....LE.PL NLFPNSIHFE NVTLPHVDG. ugt79b7 ME........ ....PKFHAF MFPWFAFG.H MIPFLHLANK LAEK...G.. HRVTFL.LPK KA....QKQ. .....LE.HH NLFPDSIVFH PLTVPPVNG. ugt79b8 ME........ ....PTFHAF MFPWFAFG.H MIPFLHLANK LAEK...G.. HQITFL.LPK KA....QKQ. .....LE.HH NLFPDSIVFH PLTIPHVNG. ugt79b10 MG........ ....QTFHAF MFPWFAFG.H MTPYLHLANK LAER...G.. HRITFL.IPK KA....QKQ. .....LE.HL NLFPDSIVFH SLTIPHVDG. ugt79b11 MG........ ....QKIHAF MFPWFAFG.H MTPYLHLGNK LAEK...G.. HRVTFL.LPK KA....QKQ. .....LE.HQ NLFPHGIVFH PLVIPHVDG. ugt79b9 MG........ ....QNFHAF MFPWFAFG.H MTPYLHLANK LAAK...G.. HRVTFL.LPK KA....QKQ. .....LE.HH NLFPDRIIFH SLTIPHVDG. ugt79b6 MG........ ....SKFHAF MFPWFGFG.H MTAFLHLANK LAEK...D.. HKITFL.LPK KA....RKQ. .....LE.SL NLFPDCIVFQ TLTIPSVDG. ugt79b2 MGG....... ....LKFHVL MYPWFATG.H MTPFLFLANK LAEK...G.. HTVTFL.IPK KA....LKQ. .....LE.NL NLFPHNIVFR SVTVPHVDG. ugt79b3 MGG....... ....LKFHVL MYPWFATG.H MTPFLFLANK LAEK...G.. HTVTFL.LPK KS....LKQ. .....LE.HF NLFPHNIVFR SVTVPHVDG. ugt79b1 MGVFGSNE.. .S..SSMSIV MYPWLAFG.H MTPFLHLSNK LAEK...G.. HKIVFL.LPK KA....LNQ. .....LE.PL NLYPNLITFH TISIPQVKG. ugt91b1 MAEPK..... ....PKLHVA VFPWLALG.H MIPYLQLSKL IARK...G.. HTVSFISTAR NI....SRL. .....PN.IS SDLS..VNFV SLPLSQTVDH ugt91a1 MTNFKDND.. .GDGTKLHVV MFPWLAFG.H MVPYLELSKL IAQK...G.. 
HKVSFISTPR NI....DRLL .....PW.LP ENLSSVINFV KLSLPVGDNK ugt91c1 MVDKRE.... ....EVMHVA MFPWLAMG.H LLPFLRLSKL LAQK...G.. HKISFISTPR NI....ERL. .....PK.LQ SNLASSITFV SFPLPPISG. ugt89a2 MKVNEEN... .NKPTKTHVL IFPFPAQG.H MIPLLDFTHR LALRG..GAA LKITVLVTPK NLPFL.S... .....PLLSA VVN...IEPL ILPFPSHPS. ugt83a1 MDNNSN.... .KRMGRPHVV VIPYPAQG.H VLPLISFSRY LAKQ...G.. IQITFINTEF NHNR..IIS. S....LPNSP HEDYVG...D QINLVSIP.. ugt78d1 MTKFSEP... ...IRDSHVA VLAFFPVGAH AGPLLAVTRR LAAAS.PS.. TIFSFFNTAR SN....ASLF S....SDHP. .........E NIKVHDVS.. Motif 1

Figure S1 (continued)

101 200 ugt73c3 .....LPEGK EN..IDSLDS T......ELM VPFFKAVNLL EDPVMKLMEE MK.PR..... .....PSCLI SDWCLPYTSI IAKNFNIPK. IVFHGMGCFN ugt73c4 .....LPEGK EN..IDSYDS M......ELM VPFFQAVNML EDPVMKLMEE MK.PR..... .....PSCII SDLLLPYTSK IARKFSIPK. IVFHGTGCFN ugt73c2 .....LQEGQ EN..VDFLDS M......ELM VHFFKAVNML ENPVMKLMEE MK.PK..... .....PSCLI SDFCLPYTSK IAKRFNIPK. IVFHGVSCFC ugt73c5 .....LQEGQ EN..IDSLDT M......ERM IPFFKAVNFL EEPVQKLIEE MN.PR..... .....PSCLI SDFCLPYTSK IAKKFNIPK. ILFHGMGCFC ugt73c6 .....LQEGQ EN..MDLLTT M......EQI TSFFKAVNLL KEPVQNLIEE MS.PR..... .....PSCLI SDMCLSYTSE IAKKFKIPK. ILFHGMGCFC ugt73c1 .....SPEGQ EN..LDLLDS L......GAS LTFFKAFSLL EEPVEKLLKE IQ.PR..... .....PNCII ADMCLPYTNR IAKNLGIPK. IIFHGMCCFN ugt73c7 .....LPEGC ES..LDMLAS M......GDM VKFFDAANSL EEQVEKAMEE MVQPR..... .....PSCII GDMSLPFTSR LAKKFKIPK. LIFHGFSCFS ugt73d1 .....LPKDC ET..LDTLPS K......DLL RRFYDAVDKL QEPMERFLEQ QDIP...... .....PSCII SDKCLFWTSR TAKRFKIPR. IVFHGMCCFS ugt73b4 .....LPEGC EN..RDFINS YQKSDSFDLF LKFLFSTKYM KQQLESFIET TK........ .....PSALV ADMFFPWATE SAEKIGVPR. LVFHGTSSFA ugt73b5 .....LPEGC EN..ADFINS YQKSDSGDLF LKFLFSTKYM KQQLESFIET TK........ .....PSALV ADMFFPWATE SAEKLGVPR. LVFHGTSFFS ugt73b2 .....LPEGC EN..VDFFTS NNNDDKNEMI VKFFFSTRFF KDQLEKLLGT TR........ .....PDCLI ADMFFPWATE AAGKFNVPR. LVFHGTGYFS ugt73b3 .....LPEGC EN..VDFFTS NNNDDRQYLT LKFFKSTRFF KDQLEKLLET TR........ .....PDCLI ADMFFPWATE AAEKFNVPR. LVFHGTGYFS ugt73b1 .....LPDGC EN..TDFIFS TPDLNVGDLS QKFLLAMKYF EEPLEELLVT MR........ .....PDCLV GNMFFPWSTK VAEKFGVPR. LVFHGTGYFS ugt90a2 .....IPPGV EC..TDKLPA LSS....SLF VPFTRATKSM QADFERELMS LP..R..... .....VSFMV SDGFLWWTQE SARKLGFPR. LVFFGMNCAS ugt90a1 .....IPPGV EN..TEKLPS MS......LF VPFTRATKLL QPFFEETLKT LP..K..... .....VSFMV SDGFLWWTSE SAAKFNIPR. FVSYGMNSYS ugt85a2 .......DGL PE...TDVDV T.....QDIP TLCESTMKHC LAPFKELLRQ INARDD.... ..VPPVSCIV SDGCMSFTLD AAEELGVPE. VLFWTTSACG ugt85a1 .......DGL PE...TDMDA T.....QDIT ALCESTMKNC LAPFRELLQR INAGDN.... 
..VPPVSCIV SDGCMSFTLD VAEELGVPE. VLFWTTSGCA ugt85a4 .......DGL PW...TDVDA K.....QDML KLIDSTINNC LAPFKDLILR LNSGSD.... ..IPPVSCII SDASMSFTID AAEELKIPV. VLLWTNSATA ugt76e1 .......GSL TE...SDLKN ......LGPF KFLFKLNQIC EASFKQCIGQ LLQEQG.... ..N.DIACVV YDEYMYFSQA AVKEFQLPS. VLFSTTSATA ugt76e2 .......GSL TE...SDLQN ......LGPQ KFVLKLNQIC EASFKQCIGQ LLHEQC.... ..NNDIACVV YDEYMYFSHA AVKEFQLPS. VVFSTTSATA ugt76e4 .......ESL PQ...SESKK ......LGPA EYLMNLNKTS EASFKECISQ LSMQQ..... ..GNDIACII YDKLMYFCEA AAKEFKIPS. VIFSTSSATI ugt76e6 ......TESL PE...SVLER ......LGPV EFLFEINKTS EASFKDCIRQ SLLQQ..... ..GNDIACII YDEYMYFCGA AAKEFNLPS. VIFSTQSATN ugt76e5 .......ETI PL...SQHEA ......LGVV EFVVTLNKTS ETSFKDCIAH LLLQH..... ..GNDIACII YDELMYFSEA TAKDLRIPS. VIFTTGSATN ugt76e3 .......... .D...SELEA ......NGPV GSLTQLNKIM EASFKDCIRQ LLKQQ..... ..GNDIACII YDEFMYFCGA VAEELKLPN. FIFSTQTATH ugt76e9 .......ESL PA...SDLKN ......LGPV WFLLKLNKEC EFSFKECLGQ LLLQKQ...L IPEEEIACVI YDEFMYFAEA AAKEFNLPK. VIFSTENATA ugt76e7 .......ENL PV...SDLKN ......LGPG RFLIKLANEC YVSFKDLLGQ LLVNE..... ..EEEIACVI YDEFMYFVEV AVKEFKLRN. VILSTTSATA ugt76e12 .......ESL PE...SDFKN ......LGPI QFLFKLNKEC KVSFKDCLGQ LVLQQ..... ..SNEISCVI YDEFMYFAEA AAKECKLPN. IIFSTTSATA ugt76e11 .......ESL PE...SDFED ......LGPI EFLHKLNKEC QVSFKDCLGQ LLLQQ..... ..GNEIACVV YDEFMYFAEA AAKEFKLPN. VIFSTTSATA

ugt76d1 .......DGL SE...SDVKS ......LGLL EFVLELNSVC EPLLKEFLTN H......... ..DDVVDFII YDEFVYFPRR VAEDMNLPK. MVFSPSSAAT ugt76c4 .......DGL SE...TETRT ......RDVK LLITLLNQNC ESPVRECLRK LLQSAK.... EEKQRISCLI NDSGWIFTQH LAKSLNLMR. LAFNTYKISF ugt76c3 .......DGL SE...TETRT ......HDIT LLLTLLNRSC ESPFRECLTK LLQSADSETG EEKQRISCLI DDSGWIFTQP VAQSFNLPR. LVLNTYKVSF ugt76c5 .......DGL SE...TEKRT ......NNTK LLLTLLNRNC ESPFRECLSK LLQSADSETG EEKQRISCLI ADSGWMFTQP IAQSLKLPI. LVLSVFTVSF ugt76c1 .......DGL SE...SQTQS ......RDLL LQLTLLNNNC QIPFRECLAK LIKPSS.DSG TEDRKISCVI DDSGWVFTQS VAESFNLPR. FVLCAYKFSF ugt86a1 .......DGF PL...DFDRS ......LNHD QFFEGILHVF SAHVDDLIAK LSRRDD.... ..P.PVTCLI ADTFYVWSSM ICDKHNLVN. VSFWTEPALV ugt86a2 .......DGL PV...GFDRS ......LNHD TYQSSLLHVF YAHVEELVAS LVGGDG.... ....GVNVMI ADTFFVWPSV VARKFGLVC. VSFWTEAALV ugt87a1 .......NII PS...ELVRA ......NDFI AFIDAVLTRL EEPFEQLLDR LNS....... ....PPTAII ADTYIIWAVR VGTKRNIPV. ASFWTTSATI ugt87a2 .......NLI PS...ELVRA ......KDFI GFIDAVYTRL EEPFEKLLDS LNSP...... ....PPSVIF ADTYVIWAVR VGRKRNIPV. VSLWTMSATI ugt75b2 DG...FDDGV IS.NTDD... ........VQ NRLVHFERNG DKALSDFIEA NQNGD..... ...SPVSCLI YTILPNWVPK VARRFHLPS. VHLWIQPAFA ugt75b1 DG...FDDGG IS.TYED... ........RQ KRSVNLKVNG DKALSDFIEA TKNGD..... ...SPVTCLI YTILLNWAPK VARRFQLPS. ALLWIQPALV ugt75d1 DG...HDDGF KSSAYSDKSR Q.....DATG NFMSEMRRRG KETLTELIED NRKQN..... ...RPFTCVV YTILLTWVAE LAREFHLPS. ALLWVQPVTV ugt75c1 DG...FDDGL KS..FED... ........QK IYMSELKRCG SNALRDIIKA NLDATT.... .ETEPITGVI YSVLVPWVST VAREFHLPT. TLLWIEPATV ugt84a3 IRFEFFSDGF AD...DDEK. R.....FDFD AFRPHLEAVG KQEIKNLVKR YN........ ..KEPVTCLI NNAFVPWVCD VAEELHIPS. AVLWVQSCAC ugt84a4 LRFEFFEDGF VY...KE... .......DFD LLQKSLEVSG KREIKNLVKK YE........ ..KQPVRCLI NNAFVPWVCD IAEELQIPS. AVLWVQSCAC ugt84a2 LRYDFFDDGL PE...DDEAS R.....TNLT ILRPHLELVG KREIKNLVKR YKEVT..... ..KQPVTCLI NNPFVSWVCD VAEDLQIPC. AVLWVQSCAC ugt84a1 IRFEFFDEEW AE...DDDR. R.....ADFS LYIAHLESVG IREVSKLVRR YEEA...... ..NEPVSCLI NNPFIPWVCH VAEEFNIPC. 
AVLWVQSCAC ugt84b1 L..VFFSDGL PK...EDPK. .......APE TLLKSLNKVG AMNLSKIIEE KR........ .....YSCII SSPFTPWVPA VAASHNISC. AILWIQACGA ugt84b2 L..AFFSDGL PK...DDPR. .......DPD TLAKSLKKDG AKNLSKIIEE KR........ .....FDCII SVPFTPWVPA VAAAHNIPC. AILWIQACGA ugt74e1 .....ISNGF QEGQERSED. ........LD EYMERVESSI KNRLPKLIED MKLSG..... ...NPPRALV YDSTMPWLLD VAHSYGLSG. AVFFTQPWLV ugt74e2 .....ISNGF QEGEEPLQD. ........LD DYMERVETSI KNTLPKLVED MKLSG..... ...NPPRAIV YDSTMPWLLD VAHSYGLSG. AVFFTQPWLV ugt74d1 .....IDDGF EEDHPSTDT. ........SP DYFAKFQENV SRSLSELISS MDPK...... .....PNAVV YDSCLPYVLD VCRKHPGVAA ASFFTQSSTV ugt74c1 .....IHDGF FPHEHPHAK. ........FV D.LDRFHNST SRSLTDFISS AKLSD..... ...NPPKALI YDPFMPFALD IAKDLDLYV. VAYFTQPWLA ugt74f1 .....ISDGY DQGGFSSAG. .......SVP EYLQNFKTFG SKTVADIIRK HQSTD..... ...NPITCIV YDSFMPWALD LAMDFGLAA. APFFTQSCAV ugt74f2 .....ISDGY DHGGFETAD. .......SID DYLKDFKTSG SKTIADIIQK HQTSD..... ...NPITCIV YDAFLPWALD VAREFGLVA. TPFFTQPCAV ugt74b1 .....ISDGF DFIPIGIPGF .......SVD TYSESFKLNG SETLTLLIEK FKSTD..... ...SPIDCLI YDSFLPWGLE VARSMELSA. ASFFTNNLTV ugt72e2 .....SPDIY GL..VDP.DD H...VVTKIG VIMRAAVPAL RSKIAAMHQ. .......... ....KPTALI VDLFGTDALC LAKEFNMLS. YVFIPTNARF ugt72e3 .....SPDIS GL..VDP.NA H...VVTKIG VIMREAVPTL RSKIVAMHQ. .......... ....NPTALI IDLFGTDALC LAAELNMLT. YVFIASNARY ugt72e1 .....TPDIS GL..VDP.SA F...FGIKLL VMMRETIPTI RSKIEEMQH. .......... ....KPTALI VDLFGLDAIP LGGEFNMLT. YIFIASNARF

ugt72d1 .....SVDVD NL..VEP.DA T...IFTKMV VKMRAMKPAV RDAVKLMKR. .......... ....KPTVMI VDFLGTELMS VADDVGMTAK YVYVPTHAWF ugt72c1 ......LDVS GQDLSGSLLT .......KLA EMMRKALPEI KSSVMELEPR .......... .....PRVFV VDLLGTEALE VAKELGIMRK HVLVTTSAWF ugt72b1 .....PVDLT DLSSSTRIES .......RIS LTVTRSNPEL RKVFDSFVEG GR........ ....LPTALV VDLFGTDAFD VAVEFHVPP. YIFYPTTANV ugt71b7 .....AVDQP TI........ ....EMTTIE IHMKNQEPKV RSTVAKLLED YSSKPD.... ..SPKIAGFV LDMFCTSMVD VANEFGFPS. YMFYTSSAGI ugt71b8 .....DGDQP .......... .......TVG LHVDNHIPMV KRTVAKLVDD YSRRPD.... ..SPRLAGLV VDMFCISVID VANEVSVPC. YLFYTSNVGI ugt71b6 .....GGDQQ PT........ ....ELKATD SHIQSLKPLV RDAVAKLVD. .STLPD.... ..APRLAGFV VDMYCTSMID VANEFGVPS. YLFYTSNAGF ugt71b5 .....VAKQP PT..SDP... ....DPVPAQ VYIEKQKTKV RDAVAARIVD ....P..... ..TRKLAGFV VDMFCSSMID VANEFGVPC. YMVYTSNATF ugt71b2 .....VPDKP DS..DD.... ....TKPHFF DYIDNFKPQV KATVEKLTDP GPPDS..... ..PSRLAGFV VDMFCMMMID VANEFGVPS. YMFYTSNATF ugt71b1 .....ARDQT .......... ......TDLV SYIDSQKPQV RAVVSKVAGD VSTRS..... ..DSRLAGIV VDMFCTSMID IADEFNLSA. YIFYTSNASY ugt71c1 .....EVQDP PP..MELFVE F...AESYIL EYVKKMVPII REALSTLLSS .RDESG.... ..SVRVAGLV LDFFCVPMID VGNEFNLPS. YIFLTCSAGF ugt71c2 .....DVQNP PP..MELFVK A...SESYIL EYVKKMVPLV RNALSTLLSS .RDESD.... ..SVHVAGLV LDFFCVPLID VGNEFNLPS. YIFLTCSASF ugt71d1 .....ELEEK PT..LGS.TQ S...VEAYVY DVIERNIPLV RNIVMDILTS .LALDG.... ...VKVKGLV VDFFCLPMID VAKDISLPF. YVFLTTNSGF ugt71d2 .....ELEEK PT..LG..TQ S...VEAYVY DFIETNVPLV QNIIMGILSS .PAFDG.... ...VTVKGFV ADFFCLPMID VAKDASLPF. YVFLTSNSGF ugt88a1 .....AVTPY SSSSTSRHHH ES.....LLL EILCFSNPSV HRTLFSLSRN FN........ .....VRAMI IDFFCTAVLD ITADFTFPV. YFFYTSGAAC ugt79b5 .....LPFGA ET..ASDLPN S.......TK KPIFVAMDLL RDQIEAKVRA LK........ .....PDLIF FDFV.HWVPE MAEEFGIKS. VNYQIISAAC ugt79b4 .....LPVGA ET..TADLPN S.......SK RVLADAMDLL REQIEVKIRS LK........ .....PDLIF FDFV.DWIPQ MAKELGIKS. VSYQIISAAF ugt79b7 .....LPAGA ET..TSDIPI S.......LD NLLSKALDLT RDQVEAAVRA LR........ .....PDLIF FDFA.QWIPD MAKEHMIKS. 
VSYIIVSATT ugt79b8 .....LPAGA ET..TSDISI S.......MD NLLSEALDLT RDQVEAAVRA LR........ .....PDLIF FDFA.HWIPE IAKEHMIKS. VSYMIVSATT ugt79b10 .....LPAGA ET..FSDIPM P.......LW KFLPPAIDLT RDQVEAAVSA LS........ .....PDLIL FDIA.SWVPE VAKEYRVKS. MLYNIISATS ugt79b11 .....LPAGA ET..ASDIPI S.......LV KFLSIAMDLT RDQIEAAIGA LR........ .....PDLIL FDLA.HWVPE MAKALKVKS. MLYNVMSATS ugt79b9 .....LPAGA ET..ASDIPI S.......LG KFLTAAMDLT RDQVEAAVRA LR........ .....PDLIF FDTA.YWVPE MAKEHRVKS. VIYFVISANS ugt79b6 .....LPDGA ET..TSDIPI S.......LG SFLASAMDRT RIQVKEAVSV GK........ .....PDLIF FDFA.HWIPE IAREYGVKS. VNFITISAAC ugt79b2 .....LPVGT ET..VSEIPV T.......SA DLLMSAMDLT RDQVEGVVRA VE........ .....PDLIF FDFA.HWIPE VARDFGLKT. VKYVVVSAST ugt79b3 .....LPVGT ET..ASEIPV T.......ST DLLMSAMDLT RDQVEAVVRA VE........ .....PDLIF FDFA.HWIPE VARDFGLKT. VKYVVVSAST ugt79b1 .....LPPGA ET..NSDVPF F.......LT HLLAVAMDQT RPEVETIFRT IK........ .....PDLVF YDSA.HWIPE IAKPIGAKT. VCFNIVSAAS ugt91b1 .....LPENA EA..TTDVPE T.......HI AYLKKAFDGL SEAFTEFLEA SK........ .....PNWIV YDILHHWVPP IAEKLGVRR. AIFCTFNAAS ugt91a1 .....LPEDG EA..TTDVPF E.......LI PYLKIAYDGL KVPVTEFLES SK........ .....PDWVL QDFAGFWLPP ISRRLGIKT. GFFSAFNGAT ugt91c1 .....LPPSS ES..SMDVPY N.......KQ QSLKAAFDLL QPPLKEFLRR SS........ .....PDWII YDYASHWLPS IAAELGISK. AFFSLFNAAT ugt89a2 .....IPSGV EN..VQDLPP .......SGF PLMIHALGNL HAPLISWITS HPSP...... .....PVAIV SDFFLGWTKN LG....IPR. FDFSPSAAIT

ugt83a1 .......DGL ED..SPEERN .......IPG KLSESVLRFM PKKVEELIER MMAETSG... ..GTIISCVV ADQSLGWAIE VAAKFGIRR. TAFCPAAAAS ugt78d1 .......DGV PEG..TMLGN ........PL EMVELFLEAA PRIFRSEIAA AEIEVG.... ...KKVTCML TDAFFWFAAD IAAELNATW. VAFWAGGANS Motif 2 Motif 3

Figure S1 (continued)

201 300 ugt73c3 LLCMHVLRRN LEILEN..VK ....SD...E E.....YF.L VPSFP....D RVEFTKLQ.. .LPVKA.... NA...SGDWK EIMDEMVKA. EYTSYGVIVN ugt73c4 LLCMHVLRRN LEILKN..LK ....SD...K D.....YF.L VPSFP....D RVEFTKPQ.. .VPVET.... TA...SGDWK AFLDEMVEA. EYTSYGVIVN ugt73c2 LLSMHILHRN HNILHA..LK ....SD...K E.....YF.L VPSFP....D RVEFTKLQ.. .VTVKT.... NF...SGDWK EIMDEQVDA. DDTSYGVIVN ugt73c5 LLCMHVLRKN REILDN..LK ....SD...K E.....LF.T VPDFP....D RVEFTRTQ.. .VPVET.... YVP..AGDWK DIFDGMVEA. NETSYGVIVN ugt73c6 LLCVNVLRKN REILDN..LK ....SD...K E.....YF.I VPYFP....D RVEFTRPQ.. .VPVET.... YVP..AG.WK EILEDMVEA. DKTSYGVIVN ugt73c1 LLCTHIMHQN HEFLET..IE ....SD...K E.....YF.P IPNFP....D RVEFTKSQ.. .LPMVL.... VA....GDWK DFLDGMTEG. DNTSYGVIVN ugt73c7 LMSIQVVRES .GILKM..IE ....SN...D E.....YF.D LPGLP....D KVEFTKPQ.. .VSVLQ.... PV...EGNMK ESTAKIIEA. DNDSYGVIVN ugt73d1 LLSSHNIHLH SPHLS...VS ....SA...V E.....PF.P IPGMP....H RIEIARAQ.. .LPGAF.... EK...LANMD DVREKMRES. ESEAFGVIVN ugt73b4 LCCSYNMRIH KPHKK...VA ....SS...S T.....PF.V IPGLP....G DIVITEDQ.. .ANVTN.... EE...TP.FG KFWKEVRES. ETSSFGVLVN ugt73b5 LCCSYNMRIH KPHKK...VA ....TS...S T.....PF.V IPGLP....G DIVITEDQ.. .ANVAK.... EE...TP.MG KFMKEVRES. ETNSFGVLVN ugt73b2 LCAGYCIGVH KPQKR...VA ....SS...S E.....PF.V IPELP....G NIVITEEQ.. .IIDGD.... GE...SD.MG KFMTEVRES. EVKSSGVVLN ugt73b3 LCSEYCIRVH NPQNI...VA ....SR...Y E.....PF.V IPDLP....G NIVITQEQ.. .IADRD.... EE...SE.MG KFMIEVKES. DVKSSGVIVN ugt73b1 LCASHCIRLP ..KN....VA ....TS...S E.....PF.V IPDLP....G DILITEEQ.. .VMETE.... EE...SV.MG RFMKAIRDS. ERDSFGVLVN ugt90a2 TVICDSVFQN QLLSN...VK ....SE...T E.....PV.S VPEFPWIKVR KCDFVKDM.. .FDPKT.... TT...DPGFK LILDQVTSM. .NQSQGIIFN ugt90a1 AAVSISVFKH ELFTEPE.SK ....SD...T E.....PV.T VPDFPWIKVK KCDFDHGT.. .TEPEE.... S....GAALE LSMDQIKST. .TTSHGFLVN ugt85a2 FLAYLYYYRF IEKGLSPIKD ....ESYLTK EHLD.TKIDW IPSMK....N .LRLKDIPS. FIRTTN.... PD...DIMLN FIIREADRA. .KRASAIILN ugt85a1 FLAYLHFYLF IEKGLCPLKD ....ESYLTK EYLEDTVIDF IPTMK....N .VKLKDIPS. FIRTTN.... 
PD...DVMIS FALRETERA. .KRASAIILN ugt85a4 LILYLHYQKL IEKEIIPLKD ....SSDL.K KHLE.TEIDW IPSMK....K .IKLKDFPD. FVTTTN.... PQ...DPMIS FILHVTGRI. .KRASAIFIN ugt76e1 FVCRSVLSRV NAESFLLDMK ....DPK..V S.....DK.E FPGLH....P .LRYKDLP.. .TSAFG.... PL...ESILK VYSETVN... IRTASAVIIN ugt76e2 FVCRSVLSRV NAESFLIDMK ....DPE..T Q.....DK.V FPGLH....P .LRYKDLP.. .TSVFG.... PI...ESTLK VYSETVN... TRTASAVIIN ugt76e4 QVCYCVLSEL SAEKFLIDMK ....DPE..K Q.....DK.V LEGLH....P .LRYKDLP.. .TSGFG.... PL...EPLLE MCREVVN... KRTASAVIIN ugt76e6 QVSRCVLRKL SAEKFLVDME ....DPE..V Q.....ET.L VENLH....P .LRYKDLP.. .TSGVG.... PL...DRLFE LCREIVN... KRTASAVIIN ugt76e5 HVCSCILSKL NAEKFLIDMK ....DPE..V Q.....NM.V VENLH....P

.LKYKDLP.. .TSGMG.... PL...ERFLE ICAEVVN... KRTASAVIIN ugt76e3 KVCCNVLSKL NAKKYLIDME ....EHD..V Q.....NK.V VENMH....P .LRYKDLP.. .TATFG.... EL...EPFLE LCRDVVN... KRTASAVIIN ugt76e9 FACRSAMCKL YAKDGLAPLK ....EGCG.R E.....EE.L VPKLH....P .LRYKDLP.. .TSAFA.... PV...EASVE VFKSSCD... KGTASAMIIN ugt76e7 FVCRFVMCEL YAKDGLAQLK ....EGGE.R E.....VE.L VPELY....P .IRYKDLP.. .SSVFA.... SV...ESSVE LFKNTCY... KGTASSVIIN ugt76e12 FACRSVFDKL YANNVQAPLK ....ETKG.Q Q.....EE.L VPEFY....P .LRYKDFP.. .VSRFA.... SL...ESIME VYRNTVD... KRTASSVIIN ugt76e11 FVCRSAFDKL YANSILTPLK ....EPKG.Q Q.....NE.L VPEFH....P .LRCKDFP.. .VSHWA.... SL...ESMME LYRNTVD... KRTASSVIIN ugt76d1 SISRCVLMEN QSNGLLPPQD ....ARS..Q L.....EE.T VPEFH....P .FRFKDLP.. .FTAYG.... SM...ERLMI LYENVSN... RASSSGIIHN ugt76c4 FRSHFVLPQL RREMFLP.LQ ....DSE..Q ......DD.P VEKFP....P .LRKKDL... .LRILE.... AD...SVQGD SYSDMILEK. TKASSGLIFM ugt76c3 FRDHFVLPQL RREMYLP.LQ ....DSE..Q G.....DD.P VEEFP....P .LRKKDL... .LQILD.... QE...SEQLD SYSNMILET. TKASSGLIFV ugt76c5 FRCQFVLPKL RREVYLP.LQ ....DSE..Q ......ED.L VQEFP....P .LRKKDI... .VRILD.... VE...TDILD PFLDKVLQM. TKASSGLIFM ugt76c1 FLGHFLVPQI RREGFLP.VP ....DSE..A ......DD.L VPEFP....P .LRKKDL... .SRIMGTS.. AQ...SKPLD AYLLKILDA. TKPASGIIVM ugt86a1 LNLYYHMDLL ISNGHFKSLD ....NR.... K....DVIDY VPGVK....A .IEPKDLMS. YLQVSDKDVD TN...TVVYR ILFKAFKDV. .KRADFVVCN ugt86a2 FSLYYHMDLL RIHGHFGAQE ....TR.... S....DLIDY IPGVA....A .INPKDTAS. YLQET....D TS...SVVHQ IIFKAFEDV. .KKVDFVLCN ugt87a1 LSLFINSDLL ASHGHFPIEP S...ESK..L D....EIVDY IPGLS....P .TRLSDLQ.. ILHGY..... .S...HQVFN IFKKSFGEL. .YKAKYLLFP ugt87a2 LSFFLHSDLL ISHGHALFEP S...E..... E....EVVDY VPGLS....P .TKLRDLPP. IFDGY..... .S...DRVFK TAKLCFDEL. .PGARSLLFT ugt75b2 FDIYYNYSTG .......... ....NN.... .....SVF.E FPNLP....S .LEIRDLPS. FLSPSN.... TN...KAAQA VYQELMDFLK EESNPKILVN ugt75b1 FNIYYTHFMG .......... ....NK.... .....SVF.E LPNLS....S .LEIRDLPS. FLTPSN.... 
TN...KGAYD AFQEMMEFLI KETKPKILIN ugt75d1 FSIFYHYFNG YEDAISE.MA ....NT...P S....SSI.K LPSLP....L .LTVRDIPS. FIVSSN.... VY...AFLLP AFREQIDSLK EEINPKILIN ugt75c1 LDIYYYYFNT S....YK.HL ....FD...V E....P.I.K LPKLP....L .ITTGDLPS. FLQPSK.... AL...PSALV TLREHIEALE TESNPKILVN ugt84a3 LTAYYYYHHR L..VKFP.TK ....TE...P D....ISV.E IPCLP....L .LKHDEIPS. FLHPSS.... PY...TAFGD IILDQLKRFE NHKSFYLFID ugt84a4 LAAYYYYHHQ L..VKFP.TE ....TE...P E....ITV.D VPFKP....L TLKHDEIPS. FLHPSS.... PL...SSIGG TILEQIKRL. .HKPFSVLIE ugt84a2 LAAYYYYHHN L..VDFP.TK ....TE...P E....IDV.Q ISGMP....L .LKHDEIPS. FIHPSS.... PH...SALRE VIIDQIKRL. .HKTFSIFID ugt84a1 FSAYYHYQDG S..VSFP.TE ....TE...P E....LDV.K LPCVP....V .LKNDEIPS. FLHPSS.... RF...TGFRQ AILGQFKNL. .SKSFCVLID ugt84b1 YSVYYRYYMK T..NSFP.DL ....ED...L N....QTV.E LPALP....L .LEVRDLPS. FMLPSG.... G....AHFYN LMAEFADCL. .RYVKWVLVN ugt84b2 FSVYYRYYMK T..NPFP.DL ....ED...L N....QTV.E LPALP....L .LEVRDLPS. LMLPSQ.... G....ANVNT LMAEFADCL. .KDVKWVLVN ugt74e1 SAIYYHV..F KGSFSVPSTK ....YG...H S....TLA.S FPSLP....I .LNANDLPS. FLCESS.... SY...PYILR TVIDQLSNI. .DRVDIVLCN ugt74e2 TAIYYHV..F KGSFSVPSTK ....YG...H S....TLA.S FPSFP....M .LTANDLPS. FLCESS.... SY...PNILR IVVDQLSNI. .DRVDIVLCN ugt74d1 NATYIHF..L RGEFKEFQND .......... .......V.V LPAMP....P .LKGNDLPV. FLYDNN.... LC...RPLFE LISSQFVNV. .DDIDFFLVN ugt74c1 SLVYYHI..N EGTYDVPVDR ....HE...N P....TLA.S FPGFP....L .LSQDDLPS. FACEKG.... SY...PLLHE FVVRQFSNL. .LQADCILCN ugt74f1 NYINYLSYIN NGSLTLP... .......... .......... IKDLP....L

.LELQDLPT. FVTPTG.... SH...LAYFE MVLQQFTNF. .DKADFVLVN ugt74f2 NYVYYLSYIN NGSLQLP... .......... .......... IEELP....F .LELQDLPS. FFSVSG.... SY...PAYFE MVLQQFINF. .EKADFVLVN ugt74b1 CSVLRK..FS NGDFPLPADP .........N S....APF.R IRGLP....S .LSYDELPS. FVGRHWL... TH...PEHGR VLLNQFPNH. .ENADWLFVN ugt72e2 LGVSIYYPNL DKDIKEEHTV .........Q R....NPL.A IPGCE....P .VRFEDTLDA YLVPD..... .....EPVYR DFVRHGLAY. .PKADGILVN ugt72e3 LGVSIYYPTL DEVIKEEHTV .........Q R....KPL.T IPGCE....P .VRFEDIMDA YLVPD..... .....EPVYH DLVRHCLAY. .PKADGILVN ugt72e1 LAVALFFPTL DKDMEEEHII .........K K....QPM.V MPGCE....P .VRFEDTLET FLDPN..... .....SQLYR EFVPFGSVF. .PTCDGIIVN ugt72d1 LAVMVYLPVL DTVVEGEYVD .........I K....EPL.K IPGCK....P .VGPKELMET MLDRS..... .....GQQYK ECVRAGLEV. .PMSDGVLVN ugt72c1 LAFTVYMASL DKQELYKQLS .........S I....GAL.L IPGCS....P .VKFERAQD. .PRKY..... .....IRELA ESQRIGDEV. .ITADGVFVN ugt72b1 LSFFLHLPKL DETVSCEFRE L...T..... .....EPL.M LPGCV....P .VAGKDFLDP AQDRK..... .....DDAYK WLLHNTKRY. .KEAEGILVN ugt71b7 LSVTYHVQML CD.ENKYDVS ENDYADS..E A.....VL.N FPSLSR...P .YPVKCLP.. .HALAA.... .....NMWLP VFVNQARKF. .REMKGILVN ugt71b8 LALGLHIQML FD.KKEYSVS ETDFEDS..E V.....VL.D VPSLTC...P .YPVKCLP.. .YGLAT.... .....KEWLP MYLNQGRRF. .REMKGILVN ugt71b6 LGLLLHIQFM YDAEDIYDMS ..ELEDS..D V.....EL.V VPSLTS...P .YPLKCLP.. .YIFKS.... .....KEWLT FFVTQARRF. .RETKGILVN ugt71b5 LGTMLHVQQM YD.QKKYDVS ..ELENS..V T.....EL.E FPSLTR...P .YPVKCLP.. .HILTS.... .....KEWLP LSLAQARCF. .RKMKGILVN ugt71b2 LGLQVHVEYL YD.VKNYDVS ..DLKDS..D TT....EL.E VPCLTR...P .LPVKCFP.. .SVLLT.... .....KEWLP VMFRQTRRF. .RETKGILVN ugt71b1 LGLQFHVQSL YD.EKELDVS ..EFKDT..E M.....KF.D VPTLTQ...P .FPAKCLP.. .SVMLN.... .....KKWFP YVLGRARSF. .RATKGILVN ugt71c1 LGMMKYLPER HR.EIKSEFN ....RSF..N EE....LN.L IPGYVN...S .VPTKVLP.. .SGLFM.... .....KETYE PWVELAERF. .PEAKGILVN ugt71c2 LGMMKYLLER NR.ETKPELN ....RSS..D EE....TI.S VPGFVN...S .VPVKVLP.. .PGLFT.... .....TESYE AWVEMAERF. 
.PEAKGILVN ugt71d1 LAMMQYLADR HS.RDTSVFV ....RN...S EE....ML.S IPGFVN...P .VPANVLP.. .SALFV.... .....EDGYD AYVKLAILF. .TKANGILVN ugt71d2 LAMMQYLAYG HK.KDTSVFA ....RN...S EE....ML.S IPGFVN...P .VPAKVLP.. .SALFI.... .....EDGYD ADVKLAILF. .TKANGILVN ugt88a1 LAFSFYLPTI DETTPGKNLK .........D I.....PTVH IPGVP....P .MKGSDMPKA VLERD..... .....DEVYD VFIMFGKQL. .SKSSGIIIN ugt79b5 VAMVLAPR.. .......... .........A E.....LGFP PPDYP...LS KVALRGHEAN VCSLFANS.. .....HELFG LITKG..... LKNCDVVSIR ugt79b4 IAMFFAPR.. .......... .........A E.....LGSP PPGFP...SS KVALRGHDAN IYSLFANTR. .....KFLFD RVTTG..... LKNCDVIAIR ugt79b7 IAHTHVPG.. .......... .........G K.....LGVR PPGYP...SS KVMFRENDVH ALATLSIFY. .....KRLYH QITTG..... LKSCDVIALR ugt79b8 IAYTFAPG.. .......... .........G V.....LGVP PPGYP...SS KVLYRENDAH ALATLSIFY. .....KRLYH QITTG..... FKSCDIIALR ugt79b10 IAHDFVPG.. .......... .........G E.....LGVP PPGYP...SS KLLYRKHDAH ALLSFSVYY. .....KRFSH RLITG..... LMNCDFISIR ugt79b11 IAHDLVPG.. .......... .........G E.....LGVA PPGYP...SS KALYREHDAH ALLTFSGFY. .....KRFYH RFTTG..... LMNCDFISIR ugt79b9 IAHELVPG.. .......... .........G E.....LGVP PPGYP...SS KVLYRGHDAH ALLTFSIFY. .....ERLHY RITTG..... LKNCDVISIR ugt79b6 VAISFVPGRS Q......... .........D D.....LGST PPGYP...SS KVLLRGHETN SLSFLSYPF. GD..GTSFYE RIMIG..... LKNCDVISIR ugt79b2 IASMLVPG.. .......... .........G E.....LGVP PPGYP...SS KVLLRKQDAY TMKNLESTN. TINVGPNLLE RVTTS..... LMNSDVIAIR ugt79b3 IASMLVPG.. .......... .........G E.....LGVP PPGYP...SS

KVLLRKQDAY TMKKLEPTN. TIDVGPNLLE RVTTS..... LMNSDVIAIR ugt79b1 IALSLVPSAE REVIDGK.EM ....SG...E E.....LAKT PLGYP...SS KVVLRPHEAK SLSFVWRKH. ..EAIGSFFD GKVTA..... MRNCDAIAIR ugt91b1 IIIIGGPASV MIQGHDP.RK ....TA...E D.....LIVP PPWVPF..ET NIVYRLFEAK RIMEYPTAG. VT..GVELND NCRLGLA... YVGSEVIVIR ugt91a1 LGILKPPG.. FE...EY.RT ....SP...A D.....FMKP PKWVPF..ET SVAFKLFECR FIFKGFMAE. TT..EGNVPD IHRVGGV... IDGCDVIFVR ugt91c1 LCFMGPSSSL IE...EI.RS ....TP...E D.....FTVV PPWVPF..KS NIVFRYHEVT RYVEKTEED. VT..GVS..D SVRFGYS... IDESDAVFVR ugt89a2 CCILNTLWIE MPTKINEDDD .........N ......EILH FPKIPN..CP KYRFDQISSL YRSYVHG... .....DPAWE FIRDSFRDN. .VASWGLVVN ugt83a1 MVLGFSIQKL IDDGLIDSD. ....GTVR.V N....KTIQL SPGMP....K .METDKFVWV CLKNKE.... SQ...KNIFQ LMLQNNNSI. .ESTDWLLCN ugt78d1 LCAHLYTDLI RETIGLK... ....DVS..M E....ETLGF IPGME....N .YRVKDIPEE VVFEDL.... .....DSVFP KALYQMS.LA LPRASAVFIS Motif 4

Figure S1 (continued)

          301                                                                                                       400
ugt73c3   T.FQELEPPY V..KDYKEAM ......DG.. .KVWSIGPVS LCNKAGADK. ......AERG SKA..AIDQD ECLQWLDSKE EGS..VLYVC LGSIC.NLPL
ugt73c4   T.FQELEPAY V..KDYTKAR ......AG.. .KVWSIGPVS LCNKAGADK. ......AERG NQA..AIDQD ECLQWLDSKE DGS..VLYVC LGSIC.NLPL
ugt73c2   T.FQDLESAY V..KNYTEAR ......AG.. .KVWSIGPVS LCNKVGEDK. ......AERG NKA..AIDQD ECIKWLDSKD VES..VLYVC LGSIC.NLPL
ugt73c5   S.FQELEPAY A..KDYKEVR ......SG.. .KAWTIGPVS LCNKVGADK. ......AERG NKS..DIDQD ECLKWLDSKK HGS..VLYVC LGSIC.NLPL
ugt73c6   S.FQELEPAY A..KDFKEAR ......SG.. .KAWTIGPVS LCNKVGVDK. ......AERG NKS..DIDQD ECLEWLDSKE PGS..VLYVC LGSIC.NLPL
ugt73c1   T.FEELEPAY V..RDYKKVK ......AG.. .KIWSIGPVS LCNKLGEDQ. ......AERG NKA..DIDQD ECIKWLDSKE EGS..VLYVC LGSIC.NLPL
ugt73c7   T.FEELEVDY A..REYRKAR ......AG.. .KVWCVGPVS LCNRLGLDK. ......AKRG DKA..SIGQD QCLQWLDSQE TGS..VLYVC LGSLC.NLPL
ugt73d1   S.FQELEPGY A..EAYAEAI ......NK.. .KVWFVGPVS LCNDRMADL. ......FDRG SNGNIAISET ECLQFLDSMR PRS..VLYVS LGSLC.RLIP
ugt73b4   S.FYELESSY A..DFYRSFV ......AK.. .KAWHIGPLS LSNRGIAEK. ......AGRG KKA..NIDEQ ECLKWLDSKT PGS..VVYLS FGSGT.GLPN
ugt73b5   S.FYELESAY A..DFYRSFV ......AK.. .RAWHIGPLS LSNRELGEK. ......ARRG KKA..NIDEQ ECLKWLDSKT PGS..VVYLS FGSGT.NFTN
ugt73b2   S.FYELEHDY A..DFYKSCV ......QK.. .RAWHIGPLS VYNRGFEEK. ......AERG KKA..NIDEA ECLKWLDSKK PNS..VIYVS FGSVA.FFKN
ugt73b3   S.FYELEPDY A..DFYKSVV ......LK.. .RAWHIGPLS VYNRGFEEK. ......AERG KKA..SINEV ECLKWLDSKK PDS..VIYIS FGSVA.CFKN
ugt73b1   S.FYELEQAY S..DYFKSFV ......AK.. .RAWHIGPLS LGNRKFEEK. ......AERG KKA..SIDEH ECLKWLDSKK CDS..VIYMA FGTMS.SFKN
ugt90a2   T.FDDLEPVF I..DFYKRKR ......KL.. .KLWAVGPLC YVNNFLDDEV EE........ ......KVKP SWMKWLDEKR DKGCNVLYVA FGSQA.EISR
ugt90a1   S.FYELESAF V..DYNNNSG D.....KP.. .KSWCVGPLC LTDPPKQ... ........G. ......SAKP AWIHWLDQKR EEGRPVLYVA FGTQA.EISN
ugt85a2   T.FDDLEHDV I..QSMK.S. .....IVP.. .PVYSIGPLH LLEKQESGEY SEIGRTGSNL W.....REET ECLDWLNTKA RNS..VVYVN FGSIT.VLSA
ugt85a1   T.FDDLEHDV V..HAMQ.S. .....ILP.. .PVYSVGPLH LLANREIEEG SEIGMMSSNL W.....KEEM ECLDWLDTKT QNS..VIYIN FGSIT.VLSV
ugt85a4   T.FEKLEHNV L..LSLR.S. .....LLP.. .QIYSVGPFQ ILENREIDKN SEIRKLGLNL W.....EEET ESLDWLDTKA EKA..VIYVN FGSLT.VLTS
ugt76e1   S.TSCLESSS L..AWLQKQ. .....LQV.. .PVYPIGPLH IAASAP.... .......SSL L.....EEDR SCLEWLNKQK IGS..VIYIS LGSLA.LMET
ugt76e2   S.ASCLESSS L..ARLQQQ. .....LQV.. .PVYPIGPLH ITASAP.... .......SSL L.....EEDR SCVEWLNKQK SNS..VIYIS LGSLA.LMDT
ugt76e4   T.ASCLESLS L..SWLQQE. .....LGI.. .PVYPLGPLH ITASSPG... .......PSL L.....QEDM SCIEWLNKQK PRS..VIYIS LGTKA.HMET
ugt76e6   T.VRCLESSS L..KRLQHE. .....LGI.. .PVYALGPLH ITVSAA.... .......SSL L.....EEDR SCVEWLNKQK PRS..VVYIS LGSVV.QMET
ugt76e5   T.SSCLESSS L..SWLKQE. .....LSI.. .PVYPLGPLH ITTSAN.... .......FSL L.....EEDR SCIEWLNKQK LRS..VIYIS VGSIA.HMET
ugt76e3   T.VTCLESSS L..TRLQQE. .....LQI.. .PVYPLGPLH ITDSSTG... .......FTV L.....QEDR SCVEWLNKQK PRS..VIYIS LGSMV.LMET
ugt76e9   T.VRCLEISS L..EWLQQE. .....LKI.. .PIYPIGPLH MVSSAPP... .......TSL L.....DENE SCIDWLNKQK PSS..VIYIS LGSFT.LLET
ugt76e7   T.VRCLEMSS L..EWLQQE. .....LEI.. .PVYSIGPLH MVVSAPP... .......TSL L.....EENE SCIEWLNKQK PSS..VIYIS LGSFT.LMET
ugt76e12  T.ASCLESSS L..SFLQQQQ .....LQI.. .PVYPIGPLH MVASAP.... .......TSL L.....EENK SCIEWLNKQK VNS..VIYIS MGSIA.LMEI
ugt76e11  T.ASCLESSS L..SRLQQQ. .....LQI.. .PVYPIGPLH LVASAS.... .......TSL L.....EENK SCIEWLNKQK KNS..VIFVS LGSLA.LMEI

ugt76d1   S.SDCLENSF I..TTAQEK. .....WGV.. .PVYPVGPLH MTNSAMSC.. .......PSL F.....EEER NCLEWLEKQE TSS..VIYIS MGSLA.MTQD
ugt76c4   S.CEELDQDS L..SQSRED. .....FKV.. .PIFAIGPSH SH.FPASS.. .......SSL F.....TPDE TCIPWLDRQE DKS..VIYVS IGSLV.TINE
ugt76c3   STCEELDQDS L..SQARED. .....YQV.. .PIFTIGPSH SY.FPGSS.. .......SSL F.....TVDE TCIPWLDKQE DKS..VIYVS FGSIS.TIGE
ugt76c5   S.CEELDHDS V..SQARED. .....FKI.. .PIFGIGPSH SH.FPATS.. .......SSL S.....TPDE TCIPWLDKQE DKS..VIYVS YGSIV.TISE
ugt76c1   S.CKELDHDS L..AESNKV. .....FSI.. .PIFPIGPFH IHDVPASS.. .......SSL L.....EPDQ SCIPWLDMRE TRS..VVYVS LGSIA.SLNE
ugt86a1   T.VQELEPDS L..SALQ.A. ...K...Q.. .PVYAIGPVF STD.SV.... .....VPTSL W.....AESD .CTEWLKGRP TGS..VLYVS FGSYA.HVGK
ugt86a2   T.IQQFEDKT I..KALN.T. ...K...I.. .PFYAIGPII PFNNQTGS.. .....VTTSL W.....SESD .CTQWLNTKP KSS..VLYIS FGSYA.HVTK
ugt87a1   S.AYELEPKA I..DFFT.S. ...KF.DF.. .PVYSTGPLI PLE.ELS... .....VGNEN R.....ELD. .YFKWLDEQP ESS..VLYIS QGSFL.SVSE
ugt87a2   T.AYELEHKA I..DAFT.S. ...KL.DI.. .PVYAIGPLI PFE.ELS... .....VQNDN K.....EPN. .YIQWLEEQP EGS..VLYIS QGSFL.SVSE
ugt75b2   T.FDSLEPEF L..TAIP... ......NI.. .EMVAVGPLL PAEIFTGSES ......GKDL SRD...HQSS SYTLWLDSKT ESS..VIYVS FGTMV.ELSK
ugt75b1   T.FDSLEPEA L..TAFP... ......NI.. .DMVAVGPLL PTEIFSGS.. .......TNK SVK...DQSS SYTLWLDSKT ESS..VIYVS FGTMV.ELSK
ugt75d1   T.FQELEPEA M..SSVPD.. ......NF.. .KIVPVGPLL TLRTDF.... .......... ......SSRG EYIEWLDTKA DSS..VLYVS FGTLA.VLSK
ugt75c1   T.FSALEHDA L..TSVE... ......KL.. .KMIPIGPLV SSSEGKTD.. .........L FK....SSDE DYTKWLDSKL ERS..VIYIS LGTHADDLPE
ugt84a3   T.FRELEKDI M..DHMSQ.. ......LCPQ AIISPVGPLF KMAQTLSSD. .....VKGDI S.....EPAS DCMEWLDSRE PSS..VVYIS FGTIA.NLKQ
ugt84a4   T.FQELEKDT I..DHMSQ.. ......LCPQ VNFNPIGPLF TMAKTIRSD. .....IKGDI S.....KPDS DCIEWLDSRE PSS..VVYIS FGTLA.FLKQ
ugt84a2   T.FNSLEKDI I..DHMST.. ......LSLP GVIRPLGPLY KMAKTVAYDV .....VKVNI S.....EPTD PCMEWLDSQP VSS..VVYIS FGTVA.YLKQ
ugt84a1   S.FDSLEQEV I..DYMSS.. ......LC.. .PVKTVGPLF KVARTVTSD. .....VSGDI C.....KSTD KCLEWLDSRP KSS..VVYIS FGTVA.YLKQ
ugt84b1   S.FYELESEI I..ESMAD.. ......LK.. .PVIPIGPLV SPFLLGDGEE ETLDGKNLDF C.....KSDD CCMEWLDKQA RSS..VVYIS FGSML.ETLE
ugt84b2   S.FYELESEI I..ESMSD.. ......LK.. .PIIPIGPLV SPFLLGNDEE .....KTLDM W.....KVDD YCMEWLDKQA RSS..VVYIS FGSIL.KSLE
ugt74e1   T.FDKLEEKL L..KWIKS.. ......VW.. .PVLNIGPTV PSMYLDKRLA ED.KNYGFSL FGA...KIAE .CMEWLNSKQ PSS..VVYVS FGSLV.VLKK
ugt74e2   T.FDKLEEKL L..KWVQS.. ......LW.. .PVLNIGPTV PSMYLDKRLS ED.KNYGFSL FNA...KVAE .CMEWLNSKE PNS..VVYLS FGSLV.ILKE
ugt74d1   S.FDELEVEV L..QWMKN.. ......QW.. .PVKNIGPMI PSMYLDKRLA GD.KDYGINL FNA...QVNE .CLDWLDSKP PGS..VIYVS FGSLA.VLKD
ugt74c1   T.FDQLEPKV V..KWMND.. ......QW.. .PVKNIGPVV PSKFLDNRLP ED.KDYELEN SKT...EPDE SVLKWLGNRP AKS..VVYVA FGTLV.ALSE
ugt74f1   S.FHDLDLHE E..ELLSK.. ......VC.. .PVLTIGPTV PSMYLDQQIK SD.NDYDLNL FDL...KEAA LCTDWLDKRP EGS..VVYIA FGSMA.KLSS
ugt74f2   S.FQELELHE N..ELWSK.. ......AC.. .PVLTIGPTI PSIYLDQRIK SD.TGYDLNL FES...KDDS FCINWLDTRP QGS..VVYVA FGSMA.QLTN
ugt74b1   G.FEGLEETQ DCENGESD.. ......AM.. .KATLIGPMI PSAYLDDRME DD.KDYGASL LKP...ISKE .CMEWLETKQ AQS..VAFVS FGSFG.ILFE
ugt72e2   T.WEEMEPKS L..KSLLNPK LLGRVAR.V. .PVYPIGPLC ...RPIQ... ........SS ......ETDH PVLDWLNEQP NES..VLYIS FGSGG.CLSA
ugt72e3   T.WEEMEPKS L..KSLQDPK LLGRVAR.V. .PVYPVGPLC ...RPIQ... ........SS ......TTDH PVFDWLNKQP NES..VLYIS FGSGG.SLTA
ugt72e1   T.WDDMEPKT L..KSLQDPK LLGRIAG.V. .PVYPIGPLS ...RPVD... ........PS ......KTNH PVLDWLNKQP DES..VLYIS FGSGG.SLSA

ugt72d1   T.WEELQGNT L..AALREDE ELSRVMK.V. .PVYPIGPIV ...RTNQ... ........HV ......DKPN SIFEWLDEQR ERS..VVFVC LGSGG.TLTF
ugt72c1   T.WHSLEQVT I..GSFLDPE NLGRVMRGV. .PVYPVGPLV ...RPAE... ........PG ......LK.H GVLDWLDLQP KES..VVYVL LGVVG.ALTF
ugt72b1   T.FFELEPNA I..KALQEP. ...GLDK.P. .PVYPVGPLV NIGKQEA... ........KQ ......TEES ECLKWLDNQP LGS..VLYVS FGSGG.TLTC
ugt71b7   T.VAELEPYV L..KFLSSS. ......DTP. .PVYPVGPLL HLENQRDDS. .......KD. ......EKRL EIIRWLDQQP PSS..VVFLC FGSMG.GFGE
ugt71b8   T.FAELEPYA L..ESLHSSG ......DTP. .RAYPVGPLL HLENHVDGS. .......KD. ......EKGS DILRWLDEQP PKS..VVFLC FGSIG.GFNE
ugt71b6   T.VPDLEPQA L..TFLSNG. ......NIP. .RAYPVGPLL HLKNVNCDY. .......VD. ......KKQS EILRWLDEQP PRS..VVFLC FGSMG.GFSE
ugt71b5   T.VAELEPHA L..KMFNING D.....DLP. .QVYPVGPVL HLENGNDD.. ........D. ......EKQS EILRWLDEQP SKS..VVFLC FGSLG.GFTE
ugt71b2   T.FAELEPQA M..KFFSGVD S.....PLP. .TVYTVGPVM NLKINGPNS. .......SD. ......DKQS EILRWLDEQP RKS..VVFLC FGSMG.GFRE
ugt71b1   S.VADMEPQA L..SFFSGGN GN...TNIP. .PVYAVGPIM DLESSGD... ........E. ......EKRK EILHWLKEQP TKS..VVFLC FGSMG.GFSE
ugt71c1   S.YTALEPNG F..KYFDRCP D.....NYP. .TIYPIGPIL CSNDRPN... .......LDS ......SERD RIITWLDDQP ESS..VVFLC FGSLK.NLSA
ugt71c2   S.FESLERNA F..DYFDRRP D.....NYP. .PVYPIGPIL CSNDRPN... .......LDL ......SERD RILKWLDDQP ESS..VVFLC FGSLK.SLAA
ugt71d1   S.SFDIEPYS V..NHFLQ.E Q.....NYP. .SVYAVGPIF DLKAQPHPE. .......QDL ......TRRD ELMKWLDDQP EAS..VVFLC FGSMA.RLRG
ugt71d2   T.SFDIEPTS L..NHFLG.E E.....NYP. .SVYAVGPIF NPKAHPHPD. .......QDL ......ACCD ESMKWLDAQP EAS..VVFLC FGSMG.SLRG
ugt88a1   T.FDALENRA I..KAITEEL C......FR. .NIYPIGPLI VNGRIEDRN. ........D. ......NKAV SCLNWLDSQP EKS..VVFLC FGSLG.LFSK
ugt79b5   T.CVELEGKL C..GFIEKEC ......QK.. .KLLLTGPML PEPQNKS... .......GKF .......LED RWNHWLNGFE PGS..VVFCA FGTQF.FFEK
ugt79b4   T.CAEIEGNL C..DFIERQC ......QR.. .KVLLTGPMF LDPQGKS... .......GKP .......LED RWNNWLNGFE PSS..VVYCA FGTHF.FFEI
ugt79b7   T.CKEVEGMF C..DFISRQY ......HK.. .KVLLTGPMF PEPDT..... .......SKP .......LEE RWNHFLSGFA PKS..VVFCS PGSQV.ILEK
ugt79b8   T.CNEIEGKF C..DYISSQY ......HK.. .KVLLTGPML PEQDT..... .......SKP .......LEE QLSHFLSRFP PRS..VVFCA LGSQI.VLEK
ugt79b10  T.CKEIEGKF C..EYLERQY ......HK.. .KVFLTGPML PEPNK..... .......GKP .......LED RWSHWLNGFE QGS..VVFCA LGSQV.TLEK
ugt79b11  T.CEEIEGKF C..DYIESQY ......KK.. .KVLLTGPML PEPDK..... .......SKP .......LED QWSHWLSGFG QGS..VVFCA LGSQT.ILEK
ugt79b9   TCKEMIEGKF C..DYIERQF ......QR.. .KVLLTGPML PEPDN..... .......SRP .......LED RWNHWLNQFK PGS..VIYCA LGSQI.TLEK
ugt79b6   T.CQEMEGKF C..DFIENQF ......QR.. .KVLLTGPML PEPDN..... .......SKP .......LED QWRQWLSKFD PGS..VIYCA LGSQI.ILEK
ugt79b2   T.AREIEGNF C..DYIEKHC ......RK.. .KVLLTGPVF PEPDK..... .......TRE .......LEE RWVKWLSGYE PDS..VVFCA LGSQV.ILEK
ugt79b3   T.AREIEGNF C..DYIEKHC ......RK.. .KVLLTGPVF PEPDK..... .......TRE .......LEE RWVKWLSGYE PDS..VVFCA LGSQV.ILEK
ugt79b1   T.CRETEGKF C..DYISRQY ......SK.. .PVYLTGPVL PGSQP..... .......NQP S......LDP QWAEWLAKFN HGS..VVFCA FGSQPVVNKI
ugt91b1   S.CMELEPEW I..QLLSKLQ ......GK.. .PVIPIGLLP ATPMDDAD.. ......DEG. .......TWL DIREWLDRHQ AKS..VVYVA LGTEV.TISN
ugt91a1   S.CYEYEAEW L..GLTQELH ......RK.. .PVIPVGVLP PKPDEKFE.. ......DTD. .......TWL SVKKWLDSRK SKS..IVYVA FGSEA.KPSQ
ugt91c1   S.CPEFEPEW F..GLLKDLY ......RK.. .PVFPIGFLP PVIEDDDA.. ......VDT. .......TWV RIKKWLDKQR LNS..VVYVS LGTEA.SLRH
ugt89a2   S.FTAMEGVY L..EHLKREM G......HD. .RVWAVGPII PLSGDNR... .......GGP TS....VSVD HVMSWLDARE DNH..VVYVC FGSQV.VLTK

ugt83a1   S.VHELETAA F..GLGP... .......... .NIVPIGPIG WAHSLEEGST SLG....... ...SFLPHDR DCLDWLDRQI PGS..VIYVA FGSFGVMGNP
ugt78d1   S.FEELEPTL N..YNLRS.. ......KLK. .RFLNIAPLT LLSSTSEK.. ......EMR. .......DPH GCFAWMGKRS AAS..VAYIS FGTVM.EPPP
Motif 5 Motif 6

Figure S1 (continued)

          401                                                                                                       500
ugt73c3   SQLKELGLGL EESRRSFIWV IRGSEK.YK. .......... ...ELFEWML ESGFEERIK. ..ER..GLLI K.GWAPQV.. ..LILSHPSV GGFLTHCGWN
ugt73c4   SQLKELGLGL EKSQRSFIWV IRGWEK.YN. .......... ...ELYEWMM ESGFEERIK. ..ER..GLLI K.GWSPQV.. ..LILSHPSV GGFLTHCGWN
ugt73c2   AQLRELGLGL EATKRPFIWV IRGGGK.YH. .......... ...ELAEWIL ESGFEERTK. ..ER..SLLI K.GWSPQM.. ..LILSHPAV GGFLTHCGWN
ugt73c5   SQLKELGLGL EESQRPFIWV IRGWEK.YK. .......... ...ELVEWFS ESGFEDRIQ. ..DR..GLLI K.GWSPQM.. ..LILSHPSV GGFLTHCGWN
ugt73c6   SQLLELGLGL EESQRPFIWV IRGWEK.YK. .......... ...ELVEWFS ESGFEDRIQ. ..DR..GLLI K.GWSPQM.. ..LILSHPSV GGFLTHCGWN
ugt73c1   SQLKELGLGL EESQRPFIWV IRGWEK.YN. .......... ...ELLEWIS ESGYKERIK. ..ER..GLLI T.GWSPQM.. ..LILTHPAV GGFLTHCGWN
ugt73c7   AQLKELGLGL EASNKPFIWV IREWGK.YG. .......... ...DLANWMQ QSGFEERIK. ..DR..GLVI K.GWAPQV.. ..FILSHASI GGFLTHCGWN
ugt73d1   NQLIELGLGL EESGKPFIWV IKTEEK.HMI .......... ...ELDEWLK RENFEERVR. ..GR..GIVI K.GWSPQA.. ..MILSHGST GGFLTHCGWN
ugt73b4   EQLLEIAFGL EGSGQNFIWV VSKNEN.QG. .......... ...ENEDWLP .KGFEERNK. ..GK..GLII R.GWAPQV.. ..LILDHKAI GGFVTHCGWN
ugt73b5   DQLLEIAFGL EGSGQSFIWV VRKNEN.QG. .......... ...DNEEWLP .EGFKERTT. ..GK..GLII P.GWAPQV.. ..LILDHKAI GGFVTHCGWN
ugt73b2   EQLFEIAAGL EASGTSFIWV VRKTK..DD. .......... ....REEWLP .EGFEERVK. ..GK..GMII R.GWAPQV.. ..LILDHQAT GGFVTHCGWN
ugt73b3   EQLFEIAAGL ETSGANFIWV VRKNIG.IE. .......... ....KEEWLP .EGFEERVK. ..GK..GMII R.GWAPQV.. ..LILDHQAT CGFVTHCGWN
ugt73b1   EQLIEIAAGL DMSGHDFVWV VNRKGS.QE. .......... ...EKEDWLP .EGFEEKTK. ..GK..GLII R.GWAPQV.. ..LILEHKAI GGFLTHCGWN
ugt90a2   EQLEEIALGL EESKVNFLWV VK.....G.. .......... ......NEIG .KGFEERVG. ..ER..GMMV RDEWVDQR.. ..KILEHESV RGFLSHCGWN
ugt90a1   KQLMELAFGL EDSKVNFLWV TRKDV..E.. .......... ......EIIG .EGFNDRIR. ..ES..GMIV R.DWVDQW.. ..EILSHESV KGFLSHCGWN
ugt85a2   KQLVEFAWGL AATGKEFLWV IRPDL..VA. GD........ ...E..AMVP .PEFLTATA. ..DR..RMLA S..WCPQE.. ..KVLSHPAI GGFLTHCGWN
ugt85a1   KQLVEFAWGL AGSGKEFLWV IRPDL..VA. GE........ ...E..AMVP .PDFLMETK. ..DR..SMLA S..WCPQE.. ..KVLSHPAI GGFLTHCGWN
ugt85a4   EQILEFAWGL ARSGKEFLWV VRSGM..VD. GD........ ...D..SILP .AEFLSETK. ..NR..GMLI K.GWCSQE.. ..KVLSHPAI GGFLTHCGWN
ugt76e1   KDMLEMAWGL RNSNQPFLWV IRPGS..IP. GS........ ...EWTESLP .EEFSRLVS. ..ER..GYIV K..WAPQI.. ..EVLRHPAV GGFWSHCGWN
ugt76e2   KDMLEMAWGL SNSNQPFLWV VRPGS..IP. GS........ ...EWTESLP .EEFNRLVS. ..ER..GYIV K..WAPQM.. ..EVLRHPAV GGFWSHCGWN
ugt76e4   KEMLEMAWGL LNSNQPFLWV IRPGS..VA. GF........ ...EWIELLP .EEVIKMVT. ..ER..GYIA K..WAPQI.. ..EVLGHPAV GGFWSHCGWN
ugt76e6   KEVLEMARGL FNSNQPFLWV IRPGS..IA. GS........ ...EWIESLP .EEVIKMVS. ..ER..GYIV K..WAPQI.. ..EVLGHPAV GGFWSHCGWN
ugt76e5   KEVLEMAWGL YNSNQPFLWV IRPG...T.. .......... ......ESMP .VEVSKIVS. ..ER..GCIV K..WAPQN.. ..EVLVHPAV GGFWSHCGWN
ugt76e3   KEMLEMAWGM LNSNQPFLWV IRPGS..VS. GS........ ...EGIESLP .EEVSKMVL. ..EK..GYIV K..WAPQI.. ..EVLGHPSV GGFWSHCGWN
ugt76e9   KEVLEMASGL VSSNQHFLWV IRPGS..IL. GS........ ...ELTN..E .ELLSMMEIP ..DR..GYIV K..WAPQK.. ..QVLAHSAV GAFWSHCGWN
ugt76e7   KEMLEMAYGF VSSNQHFLWV IRPGS..IC. GS........ ...EISE..E .ELLKKMVIT ..DR..GYIV K..WAPQK.. ..QVLAHSAV GAFWSHCGWN
ugt76e12  NEIMEVASGL AASNQHFLWV IRPGS..IP. GS........ ...EWIESMP .EEFSKMVL. ..DR..GYIV K..WAPQK.. ..EVLSHPAV GGFWSHCGWN
ugt76e11  NEVIETALGL DSSKQQFLWV IRPGS..VR. GS........ ...EWIENLP .KEFSKIIS. ..GR..GYIV K..WAPQK.. ..EVLSHPAV GGFWSHCGWN

ugt76d1   IEAVEMAMGF VQSNQPFLWV IRPGS..IN. GQ........ ...ESLDFLP .EQFNQTVTD ..GR..GFVV K..WAPQK.. ..EVLRHRAV GGFWNHGGWN
ugt76c4   TELMEIAWGL SNSDQPFLWV VRVGS..VN. GT........ ...EWIEAIP .EYFIKRLN. ..EK..GKIV K..WAPQQ.. ..EVLKHRAI GGFLTHNGWN
ugt76c3   AEFMEIAWAL RNSDQPFLWV VRGGSV.VH. GA........ ...EWIE... ......QLH. ..EK..GKIV N..WAPQQ.. ..EVLKHQAI GGFLTHNGWN
ugt76c5   SDLIEIAWGL RNSDQPFLLV VRVGS..VR. GR........ ...EWIETIP .EEIMEKLN. ..EK..GKIV K..WAPQQ.. ..DVLKHRAI GGFLTHNGWS
ugt76c1   SDFLEIACGL RNTNQSFLWV VRPGS..VH. GR........ ...DWIESLP .SGFMESLD. ..GK..GKIV R..WAPQL.. ..DVLAHRAT GGFLTHNGWN
ugt86a1   KEIVEIAHGL LLSGISFIWV LRP..D.IV. GS........ ...NVPDFLP .AGFVDQAQ. ..DR..GLVV Q..WCCQM.. ..EVISNPAV GGFFTHCGWN
ugt86a2   KDLVEIAHGI LLSKVNFVWV VRP..D.IV. SS........ ...DETNPLP .EGFETEAG. ..DR..GIVI P..WCCQM.. ..TVLSHESV GGFLTHCGWN
ugt87a1   AQMEEIVVGV REAGVKFFWV AR.....GG. EL........ ...KLKEALE .GSL...... ......GVVV S..WCDQL.. ..RVLCHAAI GGFWTHCGYN
ugt87a2   AQMEEIVKGL RESGVRFLWV AR.....GG. EL........ ...KLKEALE .GSL...... ......GVVV S..WCDQL.. ..RVLCHKAV GGFWTHCGFN
ugt75b2   KQIEELARAL IEGGRPFLWV ITDKLN.RE. AKI.....E. ..GE.EETEI .EKIAGFRHE LEEV..GMIV S..WCSQI.. ..EVLRHRAI GCFLTHCGWS
ugt75b1   KQIEELARAL IEGKRPFLWV ITDKSN.RE. TKT.....E. ..GE.EETEI .EKIAGFRHE LEEV..GMIV S..WCSQI.. ..EVLSHRAV GCFVTHCGWS
ugt75d1   KQLVELCKAL IQSRRPFLWV ITDKSY.RN. K.......E. ..DE.QEKEE .DCISSFREE LDEI..GMVV S..WCDQF.. ..RVLNHRSI GCFVTHCGWN
ugt75c1   KHMEALTHGV LATNRPFLWI VREKNP.EE. K......... ...K.KNRFL .ELIRG.... .SDR..GLVV G..WCSQT.. ..AVLAHCAV GCFVTHCGWN
ugt84a3   EQMEEIAHGV LSSGLSVLWV VRPPME.GT. FV........ ...E.PHVLP .RELEE.... ...K..GKIV E..WCPQE.. ..RVLAHPAI ACFLSHCGWN
ugt84a4   NQIDEIAHGI LNSGLSCLWV LRPPLE.GL. AI........ ...E.PHVLP .LELEE.... ...K..GKIV E..WCQQE.. ..KVLAHPAV ACFLSHCGWN
ugt84a2   EQIDEIAYGV LNADVTFLWV IRQQEL.GF. NK........ ...E.KHVLP .EEVK..... ..GK..GKIV E..WCSQE.. ..KVLSHPSV ACFVTHCGWN
ugt84a1   EQIEEIAHGV LKSGLSFLWV IRPPPH.DL. KV........ ...E.THVLP .QELKES..S AKGK..GMIV D..WCPQE.. ..QVLSHPSV ACFVTHCGWN
ugt84b1   NQVETIAKAL KNRGLPFLWV IRPKEK.AQ. NV........ ...A...VLQ .EMVKE.... ..GQ..GVVL E..WSPQE.. ..KILSHEAI SCFVTHCGWN
ugt84b2   NQVETIATAL KNRGVPFLWV IRPKEK.GE. NV........ ...Q...VLQ .EMVKE.... ..GK..GVVT E..WGQQE.. ..KILSHMAI SCFITHCGWN
ugt74e1   DQLIELAAGL KQSGHFFLWV VRETER.RK. LP........ ......ENYI .EEIG..... ..EK..GLTV S..WSPQL.. ..EVLTHKSI GCFVTHCGWN
ugt74e2   DQMLELAAGL KQSGRFFLWV VRETET.HK. LP........ ......RNYV .EEIG..... ..EK..GLIV S..WSPQL.. ..DVLAHKSI GCFLTHCGWN
ugt74d1   DQMIEVAAGL KQTGHNFLWV VRETET.KK. LP........ ......SNYI .EDIC..... ..DK..GLIV N..WSPQL.. ..QVLAHKSI GCFMTHCGWN
ugt74c1   KQMKEIAMAI SQTGYHFLWS VRESER.SK. LP........ ......SGFI .EEAE..... ..EKDSGLVA K..WVPQL.. ..EVLAHESI GCFVSHCGWN
ugt74f1   EQMEEIASAI SN..FSYLWV VRASEE.SK. LP........ ......PGFL .ETVDK.... ..DK..SLVL K..WSPQL.. ..QVLSNKAI GCFMTHCGWN
ugt74f2   VQMEELASAV SN..FSFLWV VRSSEE.EK. LP........ ......SGFL .ETVNK.... ..EK..SLVL K..WSPQL.. ..QVLSNKAI GCFLTHCGWN
ugt74b1   KQLAEVAIAL QESDLNFLWV IKEAHI.AK. LP........ ......EGFV .ESTK..... ..DR..ALLV S..WCNQL.. ..EVLAHESI GCFLTHCGWN
ugt72e2   KQLTELAWGL EQSQQRFVWV VRPPVDGSCC SEYVSANGGG TEDNTPEYLP .EGFVSRTS. ..DR..GFVV P.SWAPQA.. ..EILSHRAV GGFLTHCGWS
ugt72e3   QQLTELAWGL EESQQRFIWV VRPPVDGSSC SDYFSAKGGV TKDNTPEYLP .EGFVTRTC. ..DR..GFMI P.SWAPQA.. ..EILAHQAV GGFLTHCGWS
ugt72e1   KQLTELAWGL EMSQQRFVWV VRPPVDGSAC SAYLSANSGK IRDGTPDYLP .EGFVSRTH. ..ER..GFMV S.SWAPQA.. ..EILAHQAV GGFLTHCGWN

ugt72d1   EQTVELALGL ELSGQRFVWV LRRPA..S.. ..YLGAISS. DDEQVSASLP .EGFLDRTR. ..GV..GIVV T.QWAPQV.. ..EILSHRSI GGFLSHCGWS
ugt72c1   EQTNELAYGL ELTGHRFVWV VRPPAEDDP. SASMFDKTK. NETEPLDFLP .NGFLDRTK. ..DI..GLVV R.TWAPQE.. ..EILAHKST GGFVTHCGWN
ugt72b1   EQLNELALGL ADSEQRFLWV IRSPS.GIA. NSSYFD.SH. SQTDPLTFLP .PGFLERTK. ..KR..GFVI P.FWAPQA.. ..QVLAHPST GGFLTHCGWN
ugt71b7   EQVREIAIAL ERSGHRFLWS LRRASPNIF. KE....LPG. EFTNLEEVLP .EGFFDRTK. ..DI..GKVI G..WAPQV.. ..AVLANPAI GGFVTHCGWN
ugt71b8   EQAREMAIAL ERSGHRFLWS LRRASRDID. KE....LPG. EFKNLEEILP .EGFFDRTK. ..DK..GKVI G..WAPQV.. ..AVLAKPAI GGFVTHCGWN
ugt71b6   EQVRETALAL DRSGHRFLWS LRRASPNIL. RE....PPG. EFTNLEEILP .EGFFDRTA. ..NR..GKVI G..WAEQV.. ..AILAKPAI GGFVSHGGWN
ugt71b5   EQTRETAVAL DRSGQRFLWC LRHASPNIK. TD....RPR. DYTNLEEVLP .EGFLERTL. ..DR..GKVI G..WAPQV.. ..AVLEKPAI GGFVTHCGWN
ugt71b2   GQAKEIAIAL ERSGHRFVWS LRRAQPKG.. .SI..GPPE. EFTNLEEILP .EGFLERTA. ..EI..GKIV G..WAPQS.. ..AILANPAI GGFVSHCGWN
ugt71b1   EQAREIAVAL ERSGHRFLWS LRRASPVGN. KSN..PPPG. EFTNLEEILP .KGFLDRTV. ..EI..GKII S..WAPQV.. ..DVLNSPAI GAFVTHCGWN
ugt71c1   TQINEIAQAL EIVDCKFIWS FRTN...P.. ........K. EYASPYEALP .HGFMDRVM. ..DQ..GIVC G..WAPQV.. ..EILAHKAV GGFVSHCGWN
ugt71c2   SQIKEIAQAL ELVGIRFLWS IRTD...P.. ........K. EYASPNEILP .DGFMNRVM. ..GL..GLVC G..WAPQV.. ..EILAHKAI GGFVSHCGWN
ugt71d1   SLVKEIAHGL ELCQYRFLWS LRK....... .......... EEVTKDD.LP .EGFLDRVD. ..GR..GMIC G..WSPQV.. ..EILAHKAV GGFVSHCGWN
ugt71d2   PLVKEIAHGL ELCQYRFLWS LRT....... .......... EEVTNDDLLP .EGFMDRVS. ..GR..GMIC G..WSPQV.. ..EILAHKAV GGFVSHCGWN
ugt88a1   EQVIEIAVGL EKSGQRFLWV VRNP.PELE. ........K. TELDLKSLLP .EGFLSRTE. ..DK..GMVV K.SWAPQV.. ..PVLNHKAV GGFVTHCGWN
ugt79b5   DQFQEFCLGM ELMGLPFLIS VMPPKG.SP. .......... ...TVQEALP .KGFEERVK. ..KH..GIVW E.GWLEQP.. ..LILSHPSV GCFVNHCGFG
ugt79b4   DQFQELCLGM ELTGLPFLVA VMPPRG.SS. .......... ...TIQEALP .EGFEERIK. ..GR..GIVW G.GWVEQP.. ..LILSHPSI GCFVNHCGFG
ugt79b7   DQFQELCLGM ELTGLPFLLA VKPPRG.SS. .......... ...TVQEGLP .EGFEERVK. ..DR..GVVW G.GWVQQP.. ..LILAHPSI GCFVNHCGPG
ugt79b8   DQFQELCLGM ELTGLPFLIA VKPPRG.SS. .......... ...TVEEGLP .EGFQERVK. ..GR..GVVW G.GWVQQP.. ..LILDHPSI GCFVNHCGPG
ugt79b10  DQFQELCLGI ELTGLPFFVA VTPPKG.AK. .......... ...TIQDALP .EGFEERVK. ..DR..GVVL G.EWVQQP.. ..LLLAHPSV GCFLSHCGFG
ugt79b11  NQFQELCLGI ELTGLPFLVA VKPPKG.AN. .......... ...TIHEALP .EGFEERVK. ..GR..GIVW G.EWVQQPSW QPLILAHPSV GCFVSHCGFG
ugt79b9   DQFQELCLGM ELTGLPFLVA VKPPKG.AK. .......... ...TIQEALP .EGFEERVK. ..NH..GVVW G.EWVQQP.. ..LILAHPSV GCFVTHCGFG
ugt79b6   DQFQELCLGM ELTGLPFLVA VKPPKG.SS. .......... ...TIQEALP .KGFEERVK. ..AR..GVVW G.GWVQQP.. ..LILAHPSI GCFVSHCGFG
ugt79b2   DQFQELCLGM ELTGSPFLVA VKPPRG.SS. .......... ...TIQEALP .EGFEERVK. ..GR..GVVW G.EWVQQP.. ..LLLSHPSV GCFVSHCGFG
ugt79b3   DQFQELCLGM ELTGSPFLVA VKPPRG.SS. .......... ...TIQEALP .EGFEERVK. ..GR..GLVW G.GWVQQP.. ..LILSHPSV GCFVSHCGFG
ugt79b1   DQFQELCLGL ESTGFPFLVA IKPPSG.VS. .......... ...TVEEALP .EGFKERVQ. ..GR..GVVF G.GWIQQP.. ..LVLNHPSV GCFVSHCGFG
ugt91b1   EEIQGLAHGL ELCRLPFFWT LRKR...TRA .......... ...SM..LLP .DGFKERVK. ..ER..GVIW T.EWVPQT.. ..KILSHGSV GGFVTHCGWG
ugt91a1   TELNEIALGL ELSGLPFFWV LKTRRG.PWD .......... ...TEPVELP .EGFEERTA. ..DR..GMVW R.GWVEQL.. ..RTLSHDSI GLVLTHPGWG
ugt91c1   EEVTELALGL EKSETPFFWV LRNE...PK. .......... ........IP .DGFKTRVK. ..GR..GMVH V.GWVPQV.. ..KILSHESV GGFLTHCGWN
ugt89a2   EQTLALASGL EKSGVHFIWA VKEPVE.KD. STR.....G. ...NILDGFD .DRVA..... ..GR..GLVI R.GWAPQV.. ..AVLRHRAV GAFLTHCGWN

ugt83a1   QLEELAIGLE LTKRPVLWVT GDQQ...... PI........ ...KLGSDR. .......... ......VKVV R..WAPQR.. ..EVLSSGAI GCFVSHCGWN
ugt78d1   EELVAIAQGL ESSKVPFVWS LKE....KN. MVH....... ....LPKGFL .DRTR..... ..EQ..GIVV P..WAPQV.. ..ELLKHEAM GVNVTHCGWN
Motif 7

Figure S1 (continued)

          501                                                                                                       600
ugt73c3   STLEGITSGI PLITWPLFGD QFCNQKLVVQ VLKAGVSAGV EEVMKWGEED KIGVLVDKEG VKKAVEELMG D.......SD DAKE...RRR RVK.ELGELA
ugt73c4   STLEGITSGI PLITWPLFGD QFCNQKLVVQ VLKAGVSAGV EEVMKWGEEE KIGVLVDKEG VKKAVEELMG A.......SD DAKE...RRR RVK.ELGESA
ugt73c2   STLEGITSGV PLITWPLFGD QFCNQKLIVQ VLKAGVSVGV EEVMKWGEEE SIGVLVDKEG VKKAVDEIMG E.......SD EAKE...RRK RVR.ELGELA
ugt73c5   STLEGITAGL PLLTWPLFAD QFCNEKLVVE VLKAGVRSGV EQPMKWGEEE KIGVLVDKEG VKKAVEELMG E.......SD DAKE...RRR RAK.ELGDSA
ugt73c6   STLEGITAGL PMLTWPLFAD QFCNEKLVVQ ILKVGVSAEV KEVMKWGEEE KIGVLVDKEG VKKAVEELMG E.......SD DAKE...RRR RAK.ELGESA
ugt73c1   STLEGITSGV PLLTWPLFGD QFCNEKLAVQ ILKAGVRAGV EESMRWGEEE KIGVLVDKEG VKKAVEELMG D.......SN DAKE...RRK RVK.ELGELA
ugt73c7   STLEGITAGV PLLTWPLFAE QFLNEKLVVQ ILKAGLKIGV EKLMKYGKEE EIGAMVSREC VRKAVDELMG D.......SE EAEE...RRR KVT.ELSDLA
ugt73d1   STIEAICFGV PMITWPLFAE QFLNEKLIVE VLNIGVRVGV EIPVRWGDEE RLGVLVKKPS VVKAIKLLMD QDCQRVDEND DDNEFVRRRR RIQ.ELAVMA
ugt73b4   STLEGIAAGL PMVTWPMGAE QFYNEKLLTK VLRIGVNVGA TELVK..... .KGKLISRAQ VEKAVREVIG G........E KAEE...RRL RAK.ELGEMA
ugt73b5   SAIEGIAAGL PMVTWPMGAE QFYNEKLLTK VLRIGVNVGA TELVK..... .KGKLISRAQ VEKAVREVIG G........E KAEE...RRL WAK.KLGEMA
ugt73b2   SLLEGVAAGL PMVTWPVGAE QFYNEKLVTQ VLRTGVSVGA SKHMKV.... MMGDFISREK VDKAVREVLA G........E AAEE...RRR RAK.KLAAMA
ugt73b3   SLLEGVAAGL PMVTWPVAAE QFYNEKLVTQ VLRTGVSVGA KKNVR..... TTGDFISREK VVKAVREVLV G........E EADE...RRE RAK.KLAEMA
ugt73b1   SLLEGVAAGL PMVTWPVGAE QFYNEKLVTQ VLKTGVSVGV KKMMQ..... VVGDFISREK VEGAVREVMV G........E ...E...RRK RAK.ELAEMA
ugt90a2   SLTESICSEV PILAFPLAAE QPLNAILVVE ELRVAERVVA AS........ ..EGVVRREE IAEKVKELME G........E KGKE...LRR NVE.AYGKMA
ugt90a1   SAQESICVGV PLLAWPMMAE QPLNAKMVVE EIKVGVRVET EDGSV..... ..KGFVTREE LSGKIKELME G........E TGKT...ARK NVK.EYSKMA
ugt85a2   STLESLCGGV PMVCWPFFAE QQTNCKFSRD EWEVGIEIGG D......... .....VKREE VEAVVRELMD E........E KGKN...MRE KAE.EWRRLA
ugt85a1   SILESLSCGV PMVCWPFFAD QQMNCKFCCD EWDVGIEIGG D......... .....VKREE VEAVVRELMD G........E KGKK...MRE KAV.EWQRLA
ugt85a4   STLESLYAGV PMICWPFFAD QLTNRKFCCE DWGIGMEIGE E......... .....VKRER VETVVKELMD G........E KGKR...LRE KVV.EWRRLA
ugt76e1   STLESIGEGV PMICRPFTGD QKVNARYLER VWRIGVQLEG E......... .....LDKGT VERAVERLIM D.......EE .GAE...MRK RVI.NLKEKL
ugt76e2   STVESIGEGV PMICRPFTGD QKVNARYLER VWRIGVQLEG D......... .....LDKET VERAVEWLLV D.......EE .GAE...MRK RAI.DLKEKI
ugt76e4   STLESIVEGV PMICRPLQGE QKLNAMYIES VWKIGIQLEG E......... .....VEREG VERAVKRLII D.......EE .GAA...MRE RAL.DLKEKL
ugt76e6   STLESIVEGV PMICRPFHGE QKLNALCLES IWRIGFQVQG K......... .....VERGG VERAVKRLIV D.......EE .GAD...MRE RAL.VLKENL
ugt76e5   STLESIVEGV PMICRPFNGE QKLNAMYIES VWRVGVLLQG E......... .....VERGC VERAVKRLIV D.......DE .GVG...MRE RAL.VLKEKL
ugt76e3   STLESIVEGV PMICRPYQGE QMLNAIYLES VWRIGIQVGG E......... .....LERGA VERAVKRLIV D.......KE .GAS...MRE RTL.VLKEKL
ugt76e9   STLESMGEGV PMICRPFTTD QKVNARYVEC VWRVGVQVEG E......... .....LKRGV VERAVKRLLV D.......EE .GEE...MKL RAL.SLKEKL
ugt76e7   STLESLGEGV PLICRPFTTD QKGNARYLEC VWKVGIQVEG E......... .....LERGA IERAVKRLMV D.......EE .GEE...MKR RAL.SLKEKL
ugt76e12  STLESIGQGV PMICRPFSGD QKVNARYLEC VWKIGIQVEG E......... .....LDRGV VERAVKRLMV D.......EE .GEE...MRK RAF.SLKEQL
ugt76e11  STLESIGEGV PMICKPFSSD QMVNARYLEC VWKIGIQVEG D......... .....LDRGA VERAVRRLMV E.......EE .GEG...MRK RAI.SLKEQL

ugt76d1   SCLESISSGV PMICRPYSGD QRVNTRLMSH VWQTAYEIEG E......... .....LERGA VEMAVRRLIV D.......QE .GQE...MRM RAT.ILKEEV
ugt76c4   STVESVCEGV PMICLPFRWD QLLNARFVSD VWMVGIHLEG R......... .....IERDE IERAIRRLLL E.......TE .GEA...IRE RIQ.LLKEKV
ugt76c3   STVESVFEGV PMICMPFVWD QLLNARFVSD VWMVGLHLEG R......... .....IERNV IEGMIRRLFS E.......TE .GKA...IRE RME.ILKENV
ugt76c5   STVESVCEAV PMICLPFRWD QMLNARFVSD VWMVGINLED R......... .....VERNE IEGAIRRLLV E.......PE .GEA...IRE RIE.HLKEKV
ugt76c1   STLESICEGV PMICLPCKWD QFVNARFISE VWRVGIHLEG R......... .....IERRE IERAVIRLMV E.......SK .GEE...IRG RIK.VLRDEV
ugt86a1   SILESVWCGL PLLCYPLLTD QFTNRKLVVD DWCIGINLCE KKT....... .....ITRDQ VSANVKRLMN G........E TSSE...LRN NVE.KVKRHL
ugt86a2   SILETIWCEV PVLCFPLLTD QVTNRKLVVD DWEIGINLCE DKSD...... .....FGRDE VGRNINRLMC G......... ..V....SKE KIG.RVKMSL
ugt87a1   STLEGICSGV PLLTFPVFWD QFLNAKMIVE EWRVGMGIER KKQME..... ...LLIVSDE IKELVKRFMD GE......SE EGKE...MRR RTC.DLSEIC
ugt87a2   STLEGIYSGV PMLAFPLFWD QILNAKMIVE DWRVGMRIER TKKNE..... ...LLIGREE IKEVVKRFMD RE......SE EGKE...MRR RAC.DLSEIS
ugt75b2   SSLESLVLGV PVVAFPMWSD QPANAKLLEE IWKTGVRVRE NSEG...... ....LVERGE IMRCLEAVME .........A KSVE...LRE NAE.KWKRLA
ugt75b1   STLESLVLGV PVVAFPMWSD QPTNAKLLEE SWKTGVRVRE NKDG...... ....LVERGE IRRCLEAVME .........E KSVE...LRE NAK.KWKRLA
ugt75d1   STLESLVSGV PVVAFPQWND QMMNAKLLED CWKTGVRVME KKEEEG.... ..VVVVDSEE IRRCIEEVME .........D KAEE...FRG NAT.RWKDLA
ugt75c1   STLESLESGV PVVAFPQFAD QCTTAKLVED TWRIGVKVKV GEEGD..... .....VDGEE IRRCLEKVMS G.......GE EAEE...MRE NAE.KWKAMA
ugt84a3   STMEALTAGV PVVCFPQWGD QVTDAVYLAD VFKTGVRLGR GAAEE..... ...MIVSREV VAEKLLEATV G........E KAVE...LRE NAR.RWKAEA
ugt84a4   STMEALTSGV PVICFPQWGD QVTNAVYMID VFKTGLRLSR GASDE..... ...RIVPREE VAERLLEATV G........E KAVE...LRE NAR.RWKEEA
ugt84a2   STMEAVSSGV PTVCFPQWGD QVTDAVYMID VWKTGVRLSR GEAEE..... ...RLVPREE VAERLREVTK G........E KAIE...LKK NAL.KWKEEA
ugt84a1   STMESLSSGV PVVCCPQWGD QVTDAVYLID VFKTGVRLGR GATEE..... ...RVVPREE VAEKLLEATV G........E KAEE...LRK NAL.KWKAEA
ugt84b1   STMETVVAGV PVVAYPSWTD QPIDARLLVD VFGIGVRMRN DSVDG..... ....ELKVEE VERCIEAVTE G........P AAVD...IRR RAA.ELKRVA
ugt84b2   STIETVVTGV PVVAYPTWID QPLDARLLVD VFGIGVRMKN DAIDG..... ....ELKVAE VERCIEAVTE G........P AAAD...MRR RAT.ELKHAA
ugt74e1   STLEGLSLGV PMIGMPHWAD QPTNAKFMED VWKVGVRVKA DSDG...... ....FVRREE FVRRVEEVME ........AE QGKE...IRK NAE.KWKVLA
ugt74e2   STLEGLSLGV PMIGMPHWTD QPTNAKFMQD VWKVGVRVKA EGDG...... ....FVRREE IMRSVEEVME ........GE KGKE...IRK NAE.KWKVLA
ugt74d1   STLEALSLGV ALIGMPAYSD QPTNAKFIED VWKVGVRVKA DQNG...... ....FVPKEE IVRCVGEVME DM......SE KGKE...IRK NAR.RLMEFA
ugt74c1   STLEALCLGV PMVGVPQWTD QPTNAKFIED VWKIGVRVRT DGEG...... ....LSSKEE IARCIVEVME ........GE RGKE...IRK NVE.KLKVLA
ugt74f1   STMEGLSLGV PMVAMPQWTD QPMNAKYIQD VWKVGVRVKA EKESG..... ....ICKREE IEFSIKEVME ........GE KSKE...MKE NAG.KWRDLA
ugt74f2   STMEALTFGV PMVAMPQWTD QPMNAKYIQD VWKAGVRVKT EKESG..... ....IAKREE IEFSIKEVME ........GE RSKE...MKK NVK.KWRDLA
ugt74b1   STLEGLSLGV PMVGVPQWSD QMNDAKFVEE VWKVGYRAKE EAGEV..... ....IVKSEE LVRCLKGVME ........GE SSVK...IRE SSK.KWKDLA
ugt72e2   STLESVVGGV PMIAWPLFAE QNMNAALLSD ELGIAVRLDD .PKE...... ....DISRWK IEALVRKVMT E........K EGEA...MRR KVK.KLRDSA
ugt72e3   STLESVLCGV PMIAWPLFAE QNMNAALLSD ELGISVRVDD .PKE...... ....AISRSK IEAMVRKVMA E........D EGEE...MRR KVK.KLRDTA
ugt72e1   SILESVVGGV PMIAWPLFAE QMMNATLLNE ELGVAVRSKK LPSE...... ...GVITRAE IEALVRKIMV E........E EGAE...MRK KIK.KLKETA

ugt72d1   SALESLTKGV PIIAWPLYAE QWMNATLLTE EIGVAVRTSE LPSER..... ....VIGREE VASLVRKIMA EE......DE EGQK...IRA KAE.EVRVSS
ugt72c1   SVLESIVNGV PMVAWPLYSE QKMNARMVSG ELKIALQINV ADG....... ....IVKKEV IAEMVKRVMD E........E EGKE...MRK NVK.ELKKTA
ugt72b1   STLESVVSGI PLIAWPLYAE QKMNAVLLSE DIRAALRPRA GDDG...... ....LVRREE VARVVKGLME G........E EGKG...VRN KMK.ELKEAA
ugt71b7   STLESLWFGV PTAAWPLYAE QKFNAFLMVE ELGLAVEIRK YWRGEH.LAG LPTATVTAEE IEKAIMCLME Q........D ..SD...VRK RVK.DMSEKC
ugt71b8   SILESLWFGV PIAPWPLYAE QKFNAFVMVE ELGLAVKIRK YWRGDQ.LVG TATVIVTAEE IERGIRCLME Q........D ..SD...VRN RVK.EMSKKC
ugt71b6   STLESLWFGV PMAIWPLYAE QKFNAFEMVE ELGLAVEIKK HWRGDL.LLG R.SEIVTAEE IEKGIICLME Q........D ..SD...VRK RVN.EISEKC
ugt71b5   SILESLWFGV PMVTWPLYAE QKVNAFEMVE ELGLAVEIRK YLKGDL.FAG E.METVTAED IERAIRRVME Q........D ..SD...VRN NVK.EMAEKC
ugt71b2   STLESLWFGV PMATWPLYAE QQVNAFEMVE ELGLAVEVRN SFRGDF.MAA D.DELMTAEE IERGIRCLME Q........D ..SD...VRS RVK.EMSEKS
ugt71b1   SILESLWFGV PMAAWPIYAE QQFNAFHMVD ELGLAAEVKK EYRRDF.LVE E.PEIVTADE IERGIKCAME Q........D ..SK...MRK RVM.EMKDKL
ugt71c1   SILESLGFGV PIATWPMYAE QQLNAFTMVK ELGLALEMRL DY......VS EDGDIVKADE IAGTVRSLMD G......... ..VDV..PKS KVK.EIAEAG
ugt71c2   SILESLRFGV PIATWPMYAE QQLNAFTIVK ELGLALEMRL DY......VS EYGEIVKADE IAGAVRSLMD G......... ..EDV..PRR KLK.EIAEAG
ugt71d1   SIVESLWFGV PIVTWPMYAE QQLNAFLMVK ELKLAVELKL DYR.....VH S.DEIVNANE IETAIRYVMD T........D ..NNV..VRK RVM.DISQMI
ugt71d2   SIVESLWFGV PIVTWPMYAE QQLNAFLMVK ELKLAVELKL DYS.....VH S.GEIVSANE IETAISCVMN K........D ..NNV..VRK RVM.DISQMI
ugt88a1   SILEAVCAGV PMVAWPLYAE QRFNRVMIVD EIKIAISMNE SETG...... ....FVSSTE VEKRVQEIIG E......... .CP....VRE RTM.AMKNAA
ugt79b5   SMWESLVSDC QIVFIPQLAD QVLITRLLTE ELEVSVKVQR EDS....... ...GWFSKED LRDTVKSVMD ID......SE IGNL...VKR NHK.KLKETL
ugt79b4   SMWESLVSDC QIVFIPQLVD QVLTTRLLTE ELEVSVKVKR DEIT...... ...GWFSKES LRDTVKSVMD KN......SE IGNL...VRR NHK.KLKETL
ugt79b7   TIWESLVSDC QMVLIPFLSD QVLFTRLMTE EFEVSVEVPR EKT....... ...GWFSKES LSNAIKSVMD KD......SD IGKL...VRS NHT.KLKEIL
ugt79b8   TIWECLMTDC QMVLLPFLGD QVLFTRLMTE EFKVSVEVSR EKT....... ...GWFSKES LSDAIKSVMD KD......SD LGKL...VRS NHA.KLKETL
ugt79b10  SMWESIMSDC QIVLLPFLAD QVLNTRLMTE ELKVSVEVQR EET....... ...GWFSKES LSVAITSVMD QA......SE IGNL...VRR NHS.KLKEVL
ugt79b11  SMWESLMSDC QIVFIPVLND QVLTTRVMTE ELEVSVEVQR EET....... ...GWFSKEN LSGAIMSLMD QD......SE IGNQ...VRR NHS.KLKETL
ugt79b9   SMWESLVSDC QIVLLPYLCD QILNTRLMSE ELEVSVEVKR EET....... ...GWFSKES LSVAITSVMD KD......SE LGNL...VRR NHA.KLKEVL
ugt79b6   SMWEALVNDC QIVFIPHLGE QILNTRLMSE ELKVSVEVKR EET....... ...GWFSKES LSGAVRSVMD RD......SE LGNW...ARR NHV.KWKESL
ugt79b2   SMWESLLSDC QIVLVPQLGD QVLNTRLLSD ELKVSVEVAR EET....... ...GWFSKES LFDAINSVMK RD......SE IGNL...VKK NHT.KWRETL
ugt79b3   SMWESLLSDC QIVLVPQLGD QVLNTRLLSD ELKVSVEVAR EET....... ...GWFSKES LCDAVNSVMK RD......SE LGNL...VRK NHT.KWRETV
ugt79b1   SMWESLMSDC QIVLVPQHGE QILNARLMTE EMEVAVEVER EKK....... ...GWFSRQS LENAVKSVME EG......SE IGEK...VRK NHD.KWRCVL
ugt91b1   SAVEGLSFGV PLIMFPCNLD QPLVARLLSG M.NIGLEIPR NERD...... ...GLFTSAS VAETIRHVVV EE......E. .GKI...YRN NAASQQKKIF
ugt91a1   TIIEAIRFAK PMAMLVFVYD QGLNARVIEE K.KIGYMIPR DETE...... ...GFFTKES VANSLRLVMV EE......E. .GKV...YRE NVK.EMKGVF
ugt91c1   SVVEGLGFGK VPIFFPVLNE QGLNTRLLHG K.GLGVEVSR DERD...... ...GSFDSDS VADSIRLVMI D........D AGEE...IRA KAK.VMKDLF
ugt89a2   SVVEAVVAGV LMLTWPMRAD QYTDASLVVD ELKVGVRACE GPDT...... ....VPDPDE LARVFADSVT G......... .NQT...ERI KAV.ELRKAA

ugt83a1   STLEGAQNGI PFLCIPYFAD QFINKAYICD VWKIGLGLER DARG...... ....VVPRLE VKKKIDEIMR .......... DGGE...YEE RAM.KVKEIV
ugt78d1   SVLESVSAGV PMIGRPILAD NRLNGRAVEV VWKVGVMMDN G......... ....VFTKEG FEKCLNDVFV H........D DGKT...MKA NAK.KLKEKL
Motif 8 Motif 9

Figure S1 (continued)

          601
ugt73c3   HKAVEKG.G. SSHSNITLLL Q...DIMQLA QF........ .......... .KN..
ugt73c4   HKAVEEG.G. SSHSNITYLL Q...DIMQQV KS........ .......... .KN..
ugt73c2   HKAVEEG.G. SSHSNIIFLL Q...DIMQQV ES........ .......... .KS..
ugt73c5   HKAVEEG.G. SSHSNISFLL Q...DIMELA EP........ .......... .NN..
ugt73c6   HKAVEEG.G. SSHSNITFLL Q...DIMQLA QS........ .......... .NN..
ugt73c1   HKAVEEG.G. SSHSNITFLL Q...DIMQLE QP........ .......... .KK..
ugt73c7   NKALEKG.G. SSDSNITLLI Q...DIMEQS QN........ .......... .QF..
ugt73d1   KKAVEEK.G. SSSINVSILI Q...DVLEQL S......... .......... .LV..
ugt73b4   KAAVEEG.G. SSYNDVNKFM E...ELNGRK .......... .......... .....
ugt73b5   KAAVEEG.G. SSYNDVNKFM E...ELNGRK .......... .......... .....
ugt73b2   KAAVEEG.G. SSFNDLNSFM E...EFSS.. .......... .......... .....
ugt73b3   KAAVE.G.G. SSFNDLNSFI E...EFTS.. .......... .......... .....
ugt73b1   KNAVKEG.G. SSDLEVDRLM E...ELTLVK LQKE...... .......... .KV..
ugt90a2   KKALEEGIG. SSRKNLDNLI N...EFCN.. .......... .......... NGT..
ugt90a1   KAALVEGTG. SSWKNLDMIL K...ELCKS. .RD....S.. .......... NGASE
ugt85a2   NEATEHKHG. SSKLNFEMLV N...KVLLG. .E........ .......... .....
ugt85a1   EKATEHKLG. SSVMNFETVV S...KFLLG. .QK....S.. .......... QD...
ugt85a4   EEASAPPLG. SSYVNFETVV N...KVLTC. .HT....I.. .......... RST..
ugt76e1   QASVKSR.G. SSFSSLDNFV N...SLKMMN .......... .......... .FM..
ugt76e2   ETSVRSG.G. SSCSSLDDFV N...S..... .......... .......... ..M..
ugt76e4   NASVRSG.G. SSYNALDELV K...FLNTE. .......... .......... .....
ugt76e6   KASVRNG.G. SSYNALEEIV N...LM.... .......... .......... .....
ugt76e5   NASVRSG.G. SSYNALDELV H...YLEAEY .......... .......... .RN..
ugt76e3   KASIRGG.G. SSCNALDELV K...HLKTE. .......... .......... .....
ugt76e9   KVSVLPG.G. SSHSSLDDLI K...T..... .......... .......... ..L..
ugt76e7   KASVLAQ.G. SSHKSLDDFI K...T..... .......... .......... ..L..
ugt76e12  RASVKSG.G. SSHNSLEEFV H...FIRT.. .......... .......... ..L..
ugt76e11  RASVISG.G. SSHNSLEEFV H...YMRT.. .......... .......... ..L..
ugt76d1   EASVTTE.G. SSHNSLNNLV H...AIMMQI D......... .......... .EQ..
ugt76c4   GRSVKQN.G. SAYQSLQNLI N...YIS... .......... .......... .SF..
ugt76c3   GRSVKPK.G. SAYRSLQHLI D...YIT... .......... .......... .YF..
ugt76c5   GRSFQQN.G. SAYQSLQNLI D...YIS... .......... .......... .SF..
ugt76c1   RRSVKQG.G. SSYRSLDELV D...RISIII EPLV...... .......... .PT..
ugt86a1   KDAVTTV.G. SSETNFNLFV S...EVRNRI ETKLCNVNG. .....LEI.S PSN..
ugt86a2   EGAVRNS.GS SSEMNLGLFI D...GLLSKV GLS....... .......... .NGKA
ugt87a1   RGAVAKG.G. SSDANIDAFI K...DITK.. .......... .......... .IV..
ugt87a2   RGAVAKS.G. SSNVNIDEFV R...HITN.. .......... .......... .TN..
ugt75b2   TEAGREG.G. SSDKNVEAFV K...SLF... .......... .......... .....
ugt75b1   MEAGREG.G. SSDKNMEAFV E...DICGES LIQNLCEAE. ......E..V KVK..
ugt75d1   AEAVREG.G. SSFNHLKAFV D...EHM... .......... .......... .....
ugt75c1   VDAAAEG.G. PSDLNLKGFV D......... ......EDE. .......... .....
ugt84a3   EAAVADG.G. SSDMNFKEFV D...KLVTKH VTR...EDN. .......... GEH..
ugt84a4   ESAVAYG.G. TSERNFQEFV D...KLVDVK TMT...NIN. .......... NVV..
ugt84a2   EAAVARG.G. SSDRNLEKFV E...KLGAKP VGK...VQNG SHNHVLAGSI KSF..
ugt84a1   EAAVAPG.G. SSDKNFREFV E...KLGAG. VTK...TKD. .......... NGY..
ugt84b1   RLALAPG.G. SSTRNLDLFI S...DIT... .......... .......... .IA..
ugt84b2   RSAMSPG.G. SSAQNLDSFI S...DIP... .......... .......... .IT..
ugt74e1   QEAVSEG.G. SSDKNINEFV S....MFC.. .......... .......... .....
ugt74e2   QEAVSEG.G. SSDKSINEFV S....MFC.. .......... .......... .....
ugt74d1   REALSDG.G. NSDKNIDEFV A...KIVR.. .......... .......... .....
ugt74c1   REAISEG.G. SSDKKIDEFV A....LLT.. .......... .......... .....
ugt74f1   VKSLSEG.G. STDININEFV S...KIQIK. .......... .......... .....
ugt74f2   VKSLNEG.G. STDTNIDTFV S...RVQSK. .......... .......... .....
ugt74b1   VKAMSEG.G. SSDRSINEFI E...SLG.K. .......... .......... .....
ugt72e2   EMSLSIDGGG LAHESLCRVT KECQRFLERV VDL....S.. .......... RGA..
ugt72e3   EMSLSIHGGG SAHESLCRVT KECQRFLECV GDL....G.. .......... RGA..
ugt72e1   AESLSCDGG. VAHESLSRIA DESEHLLERV RCM....A.. .......... RGA..

22

ugt72d1 ERAWSKD.G. SSYNSLFEWA KR.CYLVP.. .......... .......... ..... ugt72c1 EEALNMT... ..HIPSAYFT .......... .......... .......... ..... ugt72b1 CRVLKDD.G. TSTKALSLVA LKWKAHKKEL EQ........ .......... .NGNH ugt71b7 HVALMDG.G. SSRTALQKFI E...EVAKNI VSLDKEFEHV .......... ALK.. ugt71b8 HMALKDG.G. SSQSALKLFI Q...DVTKYI .......... .......... ..A.. ugt71b6 HVALMDG.G. SSETALKRFI Q...DVTENI A......WSE .......... TES.. ugt71b5 HFALMDG.G. SSKAALEKFI Q...DVIENM .......... .......... ..D.. ugt71b2 HVALMDG.G. SSHVALLKFI Q...DVTKNI .......... .......... ..S.. ugt71b1 HVALVDG.G. SSNCALKKFV Q...DVVDNV .......... .......... ..P.. ugt71c1 KEAV.DG.G. SSFLAVKRFI G...DLIDGV S......... .......... ISK.. ugt71c2 KEAVMDG.G. SSFVAVKRFI .......DGL .......... .......... ..... ugt71d1 QRATKNG.G. SSFAAIEKFI Y...DVIGIK .......... .......... ..P.. ugt71d2 QRATKNG.G. SSFAAIEKFI H...DVIGTR .......... .......... ..T.. ugt88a1 ELALTET.G. SSHTALTTLL Q...SWSPK. .......... .......... ..... ugt79b5 VSPGLLS.G. YADKFVEALE I...EVNNT. KF........ .......... ..S.. ugt79b4 VSPGLLS.S. YADKFVDELE N...HIHSK. N......... .......... ..... ugt79b7 VSPGLLT.G. YVDHFVEGLQ E...NLI... .......... .......... ..... ugt79b8 GSHGLLT.G. YVDKFVEELQ E...YLI... .......... .......... ..... ugt79b10 VSDGLLT.G. YTDKFVDTLE N...LVSET. KR........ .......... ..E.. ugt79b11 ASPGLLT.G. YTDKFVDTLE N...LVNEQG YI........ .......... ..S.. ugt79b9 VSPGLLT.G. YTDEFVETLQ N...IVNDT. NL........ .......... ..E.. ugt79b6 LRHGLMS.G. YLNKFVEALE K...LVQNIN .......... .......... ..LE. ugt79b2 TSPGLVT.G. YVDNFIESLQ D...LVSGTN HV........ .......... ..SK. ugt79b3 ASPGLMT.G. YVDAFVESLQ D...LVSGTT H......... .......... ...D. ugt79b1 TDSGFSD.G. YIDKFEQNLI E...LVKS.. .......... .......... ..... ugt91b1 GNKRLQD.Q. YADGFIEFLE N...PIAGV. .......... .......... ..... ugt91a1 GDMDRQD.R. YVDSFLEYLV TNR....... .......... .......... ..... ugt91c1 GNMDENI.R. YVDELVRFMR SKGSSSSS.. .......... .......... ..... ugt89a2 LDAIQER.G. SSVNDLDGFI Q...HVVSLG .......... .......... 
..LNK ugt83a1 MKSVAKD.G. ISCENLNKFV N...WIKSQV N......... .......... ..... ugt78d1 QEDFSMK.G. SSLENFKILL D...EIVKV. .......... .......... .....

Figure S1

TABLE S1

Putative UGT genes of Arabidopsis (data as of 1/3/2000)

In this nomenclature system, a gene name includes: (a) UGT, defining a putative UDP-dependent glycosyltransferase; (b) an Arabic number from 71 to 100, which designates a UGT family of plant origin, the same number being used for sequences with >45% sequence identity; (c) a letter, representing a subfamily whose members share >60% identity; and (d) a number, corresponding to the individual gene. The letter ‘P’ after the gene number denotes a pseudogene. An asterisk denotes a partial sequence, which results from the UGT sequence lying at the terminus of a BAC clone. ORF positions in BAC clone sequences are subject to change as sequences are updated in the databases.
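The naming rules in this legend can be illustrated with a minimal Python sketch. The names `UGTName`, `parse_ugt_name` and `relationship` are hypothetical helpers introduced here for illustration only; they are not part of the original analysis. Treating the 45% and 60% identity thresholds as strict cutoffs is a simplifying assumption.

```python
import re
from dataclasses import dataclass

# Pattern for names as written in Table S1, e.g. "UGT76E10P" or "UGT85A3*".
# Groups: family number, subfamily letter, gene number, optional "P", optional "*".
_UGT_NAME = re.compile(r"^UGT(\d{2,3})([A-Z])(\d+)(P?)(\*?)$")


@dataclass(frozen=True)
class UGTName:
    family: int        # (b) Arabic number, 71 to 100 for plant families
    subfamily: str     # (c) subfamily letter
    gene: int          # (d) individual gene number
    pseudogene: bool   # trailing "P"
    partial: bool      # trailing asterisk (partial sequence)


def parse_ugt_name(name: str) -> UGTName:
    """Split a UGT gene name into the components described in the legend."""
    match = _UGT_NAME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable UGT name: {name!r}")
    family = int(match.group(1))
    if not 71 <= family <= 100:
        raise ValueError(f"family {family} is outside the plant range 71-100")
    return UGTName(
        family=family,
        subfamily=match.group(2),
        gene=int(match.group(3)),
        pseudogene=match.group(4) == "P",
        partial=match.group(5) == "*",
    )


def relationship(identity_percent: float) -> str:
    """Naming level implied by a pairwise identity (assumed strict cutoffs)."""
    if identity_percent > 60:
        return "same subfamily"
    if identity_percent > 45:
        return "same family"
    return "different families"
```

For example, `parse_ugt_name("UGT76E10P")` yields family 76, subfamily E, gene 10, flagged as a pseudogene. The lowercase labels used in Figure S1 (e.g. `ugt73c3`) are accepted because the name is upper-cased before matching.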

UGT name | database accession number | ORF positions in BAC clone sequences | chromosome | nearest RI markers (cM) | UGT name | database accession number | ORF positions in BAC clone sequences | chromosome | nearest RI markers (cM)

UGT71B1 ab025634 21447-22868 III mi142 (29) UGT76D1 ac002505 49628-51237 II B68 (50)

UGT71B2 ab025634 23981-25438 III mi142 (29) UGT76E1 ab025604 71158-69708 V g3791 (114)

UGT71B3P z97338 82500-81440 IV mi198 (52) UGT76E10P al133314 86914-85463 III ASN1 (61)

UGT71B4P z97338 78350-79800 IV mi198 (52) UGT76E11 al133314 93329-91897 III ASN1 (61)

UGT71B5 z97338 83825-85261 IV mi198 (52) UGT76E12 al133314 89949-88508 III ASN1 (61)

UGT71B6 ab025634 33372-31933 III mi142 (29) UGT76E2 ab025604 74054-72621 V g3791 (114)

UGT71B7 ab025634 35296-33809 III mi142 (29) UGT76E3 al096859 92466-93884 III ASN1 (61)

UGT71B8 ab025634 38567-37125 III mi142 (29) UGT76E4 al096859 95117-96554 III ASN1 (61)

UGT71C1 ac005496 38698-37253 II mi54 (56) UGT76E5 al096859 81966-83381 III ASN1 (61)

UGT71C2 ac005496 41853-40429 II mi54 (56) UGT76E6 al133314 95845-94420 III ASN1 (61)

UGT71D1 ac005496 44948-43545 II mi54 (56) UGT76E7 ab028606 34184-35615 V RBCSB (80)

UGT71D2 ac005496 49883-48480 II mi54 (56) UGT76E8P ab012241 52250-54750 V RBCSB (80)

UGT72B1 af007269 110681-112123 IV 141G (6) UGT76E9 ab028606 7449-9225 V RBCSB (80)

UGT72C1 z99708 37826-39199 IV FAH1 (86) UGT78D1 ac009917 65767-67224 I ve009 (49)

UGT72D1 ac006135 23014-21602 II mi139 (35) UGT79B1 ab018115 4206-2800 V m435 (108)

UGT72D2P ac006135 27050-25600 II mi139 (35) UGT79B10 ac006193 80088-78745 I mi424 (90)

UGT72E1 al049862 5448-3985 III AtEm1 (72) UGT79B11 ac006193 82219-80861 I mi424 (90)

UGT72E2 ab018119 30560-32005 V mi335 (131) UGT79B2 al035602 11254-9887 IV RLK5 (76)

UGT72E3 af077407 79017-80462 V Tn139 (59) UGT79B3 al035602 14791-13430 IV RLK5 (76)

UGT73B1 al021961 40726-42317 IV pCITd104 (83) UGT79B4 ap000606 64813-63467 III mi413 (50)

UGT73B2 al021961 43568-45108 IV pCITd104 (83) UGT79B5 ac012561 46196-45850 I mi106 (72)

UGT73B3 al021961 45871-47316 IV pCITd104 (83) UGT79B6 ab007644 60918-59557 V m435 (108)

UGT73B4 ac006248 10348-11996 II mi398 (29) UGT79B7 ac006567 33440-32112 IV nga8 (26)

UGT73B5 ac006248 7415-9050 II mi398 (29) UGT79B8 ac004786 55469-56797 II mi238 (39)

UGT73C1 ac006282 63378-61903 II ve018 (69) UGT79B9 ab007644 56788-55445 V m435 (108)

UGT73C2 ac006282 65904-64414 II ve018 (69) UGT83A1 ac011664 40497-38853 III mi74b (5)

UGT73C3 ac006282 70480-68990 II ve018 (69) UGT84A1 z97339 19099-17645 IV mi260 (55)

UGT73C4 ac006282 68089-66599 II ve018 (69) UGT84A2 ab019232 33583-35073 III mi142 (29)

UGT73C5 ac006282 76352-74865 II ve018 (69) UGT84A3 z97339 22948-21509 IV mi260 (55)

UGT73C6 ac006282 73198-71711 II ve018 (69) UGT84A4 z97339 27167-25740 IV mi260 (55)

UGT73C7 al132958 28025-26553 III AFC1 (74) UGT84B1 ac002391 50021-51391 II mi238 (39)

UGT73D1 al132958 23327-21804 III AFC1 (74) UGT84B2 ac002391 52276-53628 II mi238 (39)

UGT74B1 ac002396 6322-4859 I mi163 (35) UGT84B3P ac002391 68040-69427 II mi238 (39)

UGT74C1 ac006533 68169-66271 II m283C (61) UGT85A1 ac006551 101428-104183 I m235 (31)

UGT74D1 ac006533 89126-86568 II m283C (61) UGT85A2 ab016819 32-1475 NA

UGT74E1 ac007153 84239-82737 I m488 (5) UGT85A3* ac006551 105703-107200 I m235 (31)

UGT74E2 ac007153 86163-84720 I m488 (5) UGT85A4 ac013430 74211-75743 I g17311 (123)

UGT74F1 ac002333 16201-14716 II m336 (79) UGT86A1 ac006922 90229-91891 II ve016 (67)

UGT74F2 ac002333 21603-20167 II m336 (79) UGT86A2 ac005851 61998-64451 II B68 (50)

UGT75B1 ac005106 79156-77747 I m488 (5) UGT87A1 ac004165 30982-29518 II m283C (61)

UGT75B2 ac005106 69936-68569 I m488 (5) UGT87A2 ac004165 33440-31949 II m283C (61)

UGT75C1 z97335 81696-80326 IV mi279 (51) UGT88A1 ap000373 58404-56926 III mi289 (22)

UGT75D1 u81293 26-1589 NA UGT89A1P ac006085 40400-41711 I m280 (85)

UGT75D1 z97339 47946-46523 IV mi260 (55) UGT89B1 ac016662 88835-86614 I EIL3 (113)

UGT76B1* ac008153 1-684 III MS2 (18) UGT90A1 ac005167 24146-26230 II mi398 (29)

UGT76C1 ab017060 571-2065 V ASA1 (18) UGT90A2 ac005489 96717-95228 I mi443 (9)

UGT76C2* ab005237 86286-97834 V ASA1 (18) UGT90A3P ac005167 27000-29200 II mi398 (29)

UGT76C3 ab017060 7401-9269 V ASA1 (18) UGT91A1 ac006340 3513-4925 II mi238 (39)

UGT76C4 ab017060 2536-4403 V ASA1 (18) UGT91B1 ab026639 16113-14713 V g2368 (125)

UGT76C5 ab017060 5455-6900 V ASA1 (18) UGT91C1 ab025613 28355-26973 V nga129 (105)

Table S1 (continued)
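For readers who wish to process Table S1 programmatically, the following Python sketch parses a single table entry (one half of a printed row) into its fields. The names `Entry`, `parse_entry`, `orf_length` and `is_descending` are hypothetical and introduced only for illustration. The sketch assumes whitespace-separated fields and that mapping information is either given as chromosome, marker and position, or as the single token NA. That a descending ORF range (start greater than end) reflects the opposite orientation in the BAC clone is an assumption, not a statement from the table.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    accession: str
    start: int                  # first ORF coordinate as printed
    end: int                    # second ORF coordinate as printed
    chromosome: Optional[str]   # Roman numeral, or None when "NA"
    marker: Optional[str]       # nearest RI marker, or None when "NA"
    position_cm: Optional[int]  # marker position in cM, or None when "NA"


def parse_entry(text: str) -> Entry:
    """Parse one Table S1 entry, e.g. 'UGT71B1 ab025634 21447-22868 III mi142 (29)'."""
    tokens = text.split()
    name, accession, orf = tokens[:3]
    start, end = (int(part) for part in orf.split("-"))
    rest = tokens[3:]
    if rest == ["NA"]:
        # No mapping information is given for this entry.
        return Entry(name, accession, start, end, None, None, None)
    chromosome, marker, cm = rest
    return Entry(name, accession, start, end, chromosome, marker, int(cm.strip("()")))


def orf_length(entry: Entry) -> int:
    """Span of the ORF in the BAC clone sequence, in base pairs (inclusive)."""
    return abs(entry.end - entry.start) + 1


def is_descending(entry: Entry) -> bool:
    """True when coordinates run high to low (assumed to mean reverse orientation)."""
    return entry.start > entry.end
```

Because each printed row holds two entries, a row would first have to be split into its left and right halves before calling `parse_entry`.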

TABLE S2

Corrections to the database annotations of UGT sequences

UGTs Corrections

73B1, 73B2, 73B3 three independent genes instead of a single gene

72B1, 75D1 no intron structures predicted

73B4, 74F1, 76E3, 90A2 different intron positions predicted

72E1, 75B1, 84B2, 87A1 different start codons predicted

74E1 a mistake in translation of the C-terminus

71B4P, 72D2P, 76E10P, 84B3P, 89A1P interpreted as pseudogenes

Table S2
