Regulatory Divergence among Beta-Keratin Genes during Bird Evolution

Maloyjo Joyraj Bhattacharjee,1 Chun-Ping Yu,1 Jinn-Jy Lin,1,2,3 Chen Siang Ng,1,3 Tzi-Yuan Wang,1 Hsin-Hung Lin,1 and Wen-Hsiung Li*,1,4,5

1 Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
2 Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Sciences, Academia Sinica, Nankang, Taipei, Taiwan
3 Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, Taiwan
4 Center for the Integrative and Evolutionary Galliformes Genomics (iEGG Center), National Chung Hsing University, Taichung, Taiwan
5 Department of Ecology and Evolution, University of Chicago, Chicago, Illinois

*Corresponding author: E-mail: whli@sinica.edu.tw.
Associate editor: Gunter Wagner

The RNA-seq data used in this study are available at National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under Bioproject: PRJNA245063.

Article, Fast Track. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Mol. Biol. Evol. 33(11):2769–2780 doi:10.1093/molbev/msw165 Advance Access publication August 8, 2016

Abstract

Feathers, which are mainly composed of α- and β-keratins, are highly diversified, largely owing to duplication and diversification of β-keratin genes during bird evolution. However, little is known about the regulatory changes that contributed to the expressional diversification of β-keratin genes. To address this issue, we studied transcriptomes from five different parts of chicken contour and flight feathers. From these transcriptomes we inferred β-keratin enriched co-expression modules of genes and predicted transcription factors (TFs) of β-keratin genes. In total, we predicted 262 TF–target gene relationships in which 56 TFs regulate 91 β-keratin genes; we validated 14 of them by in vitro tests. A dual criterion of TF enrichment and "TF–target gene" expression correlation identified 26 TFs as the major regulators of β-keratin genes. According to our predictions, the ancestral scale and claw β-keratin genes have common and unique regulators, whereas most feather β-keratin genes show chromosome-wise regulation, distinct from scale and claw β-keratin genes. Thus, after expansion from the β-keratin gene on Chr7 to other chromosomes, which still shares a TF with scale and claw β-keratin genes, most feather β-keratin genes have recruited distinct or chromosome-specific regulators. Moreover, our data showed correlated gene expression profiles, positive or negative, between predicted TFs and their target genes over the five studied feather regions. Therefore, regulatory divergences among feather β-keratin genes have contributed to structural differences among different parts of feathers. Our study sheds light on how feather β-keratin genes have diverged in regulation from scale and claw β-keratin genes and among themselves.

Key words: feather evolution, keratin genes, regulatory evolution, cis elements.

Introduction

The feather was an extraordinary innovation in dinosaurs but has become highly diversified in birds, evolving into various forms for protection, camouflage, heat retention, mate attraction, and flight (Pettingill and Breckenridge 2012). Feather diversification is presumed to be key to the success of modern birds, allowing them to adapt and radiate into various ecological niches. The feather is the most complex integumentary structure found in the avian lineage of vertebrates (Prum 1999; Prum and Brush 2002, 2003; Pabisch et al. 2010; Weiss et al. 2011) and a premier example of a complex evolutionary novelty.

In vertebrates, all epidermal appendages, including the feather, are formed mostly by alpha (α) and beta (β) keratin proteins (Alibardi and Toni 2008; Alibardi et al. 2009; Wu et al. 2015). Both types of keratin genes have been expanded by gene duplication and functional diversification during vertebrate evolution. The α-keratins are associated with hair, nail, horns, and hoofs (Vandebergh and Bossuyt 2012). The β-keratins emerged in reptiles and underwent expansion in chelonians and birds to form structures such as scale, claw, and shell of turtles and scale, claw, beak, and feather of birds. In particular, the avian lineage underwent many β-keratin gene duplications, contributing to feather diversification (Dalla Valle, Nardi, Belvedere, et al. 2007; Dalla Valle, Nardi, Toffolo, et al. 2007; Greenwold and Sawyer 2010, 2013; Li et al. 2013; Zhang et al. 2014).

The β-keratins in birds are divided into scale, claw, keratinocyte, and feather based on sequence heterogeneity and tissue-specific expression (Presland et al. 1989; Li et al. 2013; Greenwold et al. 2014). It has been hypothesized that feather β-keratins evolved from reptilian scale β-keratins by losing glycine- and tyrosine-rich tail moieties (Regal 1975; Zhang and Zhou 2000; Greenwold and Sawyer 2010). This loss gave feather structures more flexibility than scales and claws in birds. It is reasonable to assume from fossil
discoveries that some ancient types of feather β-keratin genes were already present in feathered dinosaurs (Xu et al. 2012, 2015). Later, more complex feather β-keratin genes evolved in birds, leading to feather diversification for various functions such as thermoregulation and flight. Greenwold et al. (2014) studied 48 bird genomes and found that in most bird species scale, claw, and keratinocyte β-keratin genes were localized in Chr25, whereas most feather β-keratin genes were localized in Chr25 and Chr27, along with a few others in Chr1, Chr2, Chr6, Chr7, and Chr10. Ng et al. (2014) proposed that the diversity of feathers is partly due to differential expression of feather β-keratin genes in different feather structures.

While much research has been conducted on the duplication and functional diversification of β-keratin genes, little has been done on the regulatory network that controls the expression of different classes (claw, scale, feather, and keratinocyte) of β-keratin genes and on the differential expression of feather β-keratin genes in different feather parts.

A transcription factor (TF) binds to specific DNA sequences (TF binding sites, abbreviated as TFBSs) in the promoter of a gene, thereby controlling the expression timing and level of the gene. TF–TFBS pairs are therefore important elements of a gene regulatory network. The purpose of this study is to identify the TFBSs in β-keratin genes and their cognate TFs and to understand the common and unique regulation of different classes of β-keratin genes. For this purpose, we reanalyze the transcriptomes obtained previously from different parts of developing contour feather and flight feather of chicken (Ng et al. 2014). These transcriptomes of feather development are excellent materials for inferring strongly co-expressed genes with a potentially common regulatory basis and for predicting TF–TFBS pairs (Yu et al. 2015).

Results and Discussion

Gene Co-Expression Modules

Supplementary figure S1, Supplementary Material online, shows the different parts of body contour feather and flight feather from which the transcriptomes (RNA-seq data) were obtained previously (Ng et al. 2014). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). Three biological replicates were obtained for each of these five feather parts, so that in total 15 transcriptomes were obtained. A gene is considered expressed if its FPKM (fragments per kilobase of exon per million mapped reads) is ≥1 in at least 1 of the 15 transcriptomes. In total, 11,533 genes, including 130 β-keratin genes, were expressed (supplementary table S1, Supplementary Material online). In terms of the sub-classification of β-keratins (Greenwold et al. 2014), the 130 expressed β-keratin genes, which represent 87% of the β-keratin genes annotated in Ng et al. (2014), include 18 scale, 11 claw, 15 keratinocyte, and 86 feather β-keratin genes.
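The expression criterion above can be illustrated with a minimal sketch. The gene names and FPKM values below are hypothetical toy data, and the function name `expressed_genes` is introduced here for illustration only; it is not taken from the study's pipeline.

```python
def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Return genes whose FPKM reaches `threshold` in at least one sample."""
    return [
        gene
        for gene, values in fpkm_by_gene.items()
        if any(value >= threshold for value in values)
    ]


# Toy FPKM table: gene -> FPKM across samples (hypothetical numbers).
toy_fpkm = {
    "geneA": [0.0, 0.4, 2.5],   # passes: one sample is >= 1
    "geneB": [0.1, 0.2, 0.3],   # fails: never reaches 1
    "geneC": [1.0, 0.0, 0.0],   # passes: exactly at the threshold
}

kept = expressed_genes(toy_fpkm)
```

In the study the same rule would be applied across all 15 transcriptomes rather than the three toy samples used here.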

For the expressed genes (11,533), the Pearson correlation coefficient (PCC) in expression level is high (PCC > 0.97) between the biological replicates, except for LB (PCC = 0.93, 0.91, 0.76) and EF (PCC = 0.96, 0.95, 0.86) (supplementary fig. S2, Supplementary Material online, above the diagonal). This implies that the LB and EF replicates are more heterogeneous. Between the conditions, EB and EF have a high mean PCC (0.96 ± 0.02), and so do MF and LF (mean PCC = 0.97 ± 0.01). By applying the criterion of PCC ≥ 0.8 in expression level between β-keratin and other genes (supplementary table S2, Supplementary Material online), we extracted from the 11,533 expressed genes a subset of 2,314 genes (supplementary table S3, Supplementary Material online). This subset includes 128 β-keratin genes and 2,186 nonkeratin genes (only 2 expressed β-keratin genes did not satisfy PCC ≥ 0.8). The PCC pattern for the subset of 2,314 genes (supplementary fig. S2, Supplementary Material online, below the diagonal) is consistent with that for the 11,533 expressed genes.
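The PCC-based selection can be sketched as follows. This is a minimal illustration under one stated assumption: a non-keratin gene is kept when its PCC with at least one β-keratin gene is ≥ 0.8 (the text does not spell out the exact pairing rule). All profiles and names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def coexpressed_with(keratin, others, cutoff=0.8):
    """Genes in `others` with PCC >= cutoff against at least one keratin gene."""
    return sorted(
        gene
        for gene, profile in others.items()
        if any(pearson(profile, k) >= cutoff for k in keratin.values())
    )


# Hypothetical expression profiles over five samples.
keratin = {"bk1": [1.0, 2.0, 3.0, 4.0, 5.0]}
others = {
    "up": [2.0, 4.1, 6.0, 8.2, 10.0],    # rises with bk1
    "down": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "flat": [3.0, 3.0, 3.0, 3.0, 3.0],   # constant profile
}

selected = coexpressed_with(keratin, others)
```

Only the positively correlated gene survives this filter; negatively correlated genes are considered later, when TF–target relationships are scored by absolute PCC.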

By hierarchical clustering, we classified the 2,314 genes in the above subset into 47 clusters (fig. 1A; see also supplementary fig. S3 and table S4, Supplementary Material online). To increase the probability of co-expressed genes being co-regulated, we added the condition that the genes in a cluster share the same Gene Ontology (GO) term, so that they are likely involved in the same biological function. The GO functional enrichment analysis showed that 20 (ID: 1, 2, 5, 7, 9, 17, 21, 23–25, 27–36) of the 47 clusters, where β-keratin genes are mostly present, are significantly enriched in functional biological terms (fig. 1B; supplementary table S5, Supplementary Material online). Note that the majority of the enriched terms are related to keratin-based cellular structural processes such as cytoskeleton, intermediate filament, and keratin (Kreplak et al. 2004; Alibardi et al. 2009; Herrmann et al. 2009; Lee et al. 2012). Furthermore, 10 other clusters (ID: 3, 4, 6, 8, 11–14, 16, 19), which lack β-keratin genes, also showed enriched functional terms. Therefore, we consider the 30 (20 + 10) significant clusters as co-expressed modules suitable for inferring regulatory elements and annotating the regulatory network of genes, particularly for β-keratin genes.
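The clustering step can be sketched in a few lines. The linkage method and distance measure are not stated in this section, so the sketch assumes average linkage on the distance 1 − PCC and stops merging at a fixed distance, which mimics cutting the cluster tree at a given height. Gene names, profiles, and the `max_dist` value are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_profiles(profiles, max_dist=0.2):
    """Agglomerative average-linkage clustering on the distance 1 - PCC.

    Merging stops once the closest pair of clusters is farther apart than
    `max_dist`, which plays the role of cutting the tree at a fixed height.
    """
    names = sorted(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names
        for b in names
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles: two rising genes and two falling genes.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 9.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [9.0, 6.0, 4.0, 2.0],
}

modules = cluster_profiles(toy_profiles)
```

With real data, a library routine for hierarchical clustering would normally replace this quadratic toy loop; the sketch only shows how a height cutoff turns a tree into discrete clusters.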

Predicted TF–TFBS Pairs of β-Keratin Genes and Inferred TF–Target Gene Relationships

The TF–TFBS pairs of 496 TFs of chicken in the CIS-BP database (cisbp.ccbr.utoronto.ca, last accessed 14 April 2016) (Weirauch et al. 2014) served as a valuable resource to scan and select enriched TFBSs (P value < 0.0001; FDR < 0.00001) in each of the co-expression modules (supplementary fig. S4, Supplementary Material online). To increase the probability of a selected TF being true, we added the condition that the gene coding for the TF recognizing the motif and the gene in which the TFBS is overrepresented have a high PCC (|PCC| > 0.8). This condition helped us to identify the TF–target gene relationships (summarized in fig. 2A and detailed in supplementary table S6, Supplementary Material online). Overall, we identified 13,624 pTF–target gene pairs (full data not shown), in which 213 putative TFs (pTFs) regulate 1,857 genes by positive or negative interactions. In terms of β-keratin genes, we found 372 TF–target gene pairs in which 81 TFs putatively regulate 91 β-keratin genes, which may be further categorized as follows: for scale, 13 pTFs regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocytes, 18 pTFs/9 target genes; and for feather, 63 pTFs/56 target genes. Interestingly, feather β-keratins were found to be regulated by many repressors. The Venn diagram shows that each of the β-keratin sub-classes (scale, claw, keratinocyte, and feather) has common as well as unique regulators (fig. 2A). As the β-keratin genes expanded in birds by gene duplication, the common and unique regulators found herein may be useful to unveil their regulatory transition, as discussed below.
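The correlation condition on TF–target pairs can be sketched as below. The sketch assumes that a positive correlation marks a putative activator and a negative one a putative repressor, which matches the "positive or negative interactions" wording above but is an interpretation rather than the authors' stated rule. Profiles and names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def classify_regulator(tf_profile, target_profile, cutoff=0.8):
    """Label a TF-target pair by the sign of its PCC when |PCC| > cutoff."""
    r = pearson(tf_profile, target_profile)
    if r > cutoff:
        return "activator"   # assumption: positive correlation
    if r < -cutoff:
        return "repressor"   # assumption: negative correlation
    return None              # pair fails the |PCC| > cutoff condition


# Hypothetical TF and target expression profiles over five feather parts.
tf = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = {
    "target_up": [1.1, 2.0, 2.9, 4.2, 5.0],
    "target_down": [9.0, 7.5, 6.1, 4.0, 2.2],
    "target_unrelated": [3.0, 1.0, 4.0, 1.0, 3.0],
}

calls = {name: classify_regulator(tf, profile) for name, profile in targets.items()}
```

In the study this filter is applied only to pairs that already passed the TFBS enrichment test, so correlation alone is not treated as evidence of regulation.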

β-Keratin Genes in Turkey and Evolutionary Conservation of Predicted TFBS

We further consider the condition that the TFBS present in the promoter region of a gene has been conserved in the promoter region of its orthologue in turkey. For this purpose, we scanned the genome of turkey (release 5.0) using chicken β-keratin genes as queries. This approach identified 142 β-keratin genes in turkey, which include 14 scale, 12 claw, 2 keratinocyte, and 114 feather β-keratin genes (supplementary table S7 and fig. S5, Supplementary Material online). Thus, chicken and turkey have similar numbers of β-keratin genes. Unfortunately, only 70 of the annotated turkey β-keratin genes could be assessed for upstream 1,000 bp and another 10 genes for upstream 200-bp promoter sequences from the turkey genome. We found that 192 predicted TF–target gene pairs (out of 372 pairs) satisfied the conservation criterion (supplementary table S6 and fig. S5, Supplementary Material online). Importantly, the TFBS found in the promoter of a particular sub-class of β-keratin (scale, claw, keratinocyte, and feather) in chicken was conserved in the promoter of the corresponding sub-class of turkey β-keratin genes. This supported the predicted TF–TFBS relationships of β-keratin genes in chicken.
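The conservation criterion amounts to a lookup across orthologues, which the following minimal sketch illustrates. The data structures (an orthologue map and a per-gene set of TFs whose motifs occur in the turkey promoter) and all identifiers are hypothetical; the motif scanning itself is assumed to have been done beforehand.

```python
def conserved_pairs(chicken_pairs, orthologue, turkey_motifs):
    """Keep (tf, gene) pairs whose motif also occurs in the orthologue's promoter.

    chicken_pairs : list of (tf, chicken_gene) predictions
    orthologue    : dict mapping chicken gene -> turkey gene
    turkey_motifs : dict mapping turkey gene -> set of TFs with a motif hit
    """
    kept = []
    for tf, gene in chicken_pairs:
        turkey_gene = orthologue.get(gene)
        if turkey_gene is not None and tf in turkey_motifs.get(turkey_gene, set()):
            kept.append((tf, gene))
    return kept


# Hypothetical predictions and promoter scan results.
pairs = [("TF1", "ck_scale1"), ("TF2", "ck_scale1"), ("TF1", "ck_feather1")]
orthologue = {"ck_scale1": "tk_scale1"}             # ck_feather1 has no orthologue here
turkey_motifs = {"tk_scale1": {"TF1"}}              # only the TF1 motif is conserved

supported = conserved_pairs(pairs, orthologue, turkey_motifs)
```

Pairs whose turkey promoter could not be retrieved are dropped by this filter, so the count of conserved pairs is a lower bound on the true conservation.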

Regulatory Transition of β-Keratin Genes

The pTF–target gene relationships of different sub-classes of β-keratin genes are presented in figure 3A and extracted from supplementary tables S6–S8, Supplementary Material online. We also did maximum likelihood phylogenetic reconstruction of β-keratins to understand the regulatory transition associated with the evolution of β-keratin genes (fig. 3B; see supplementary fig. S6, Supplementary Material online, for an expanded phylogeny). The phylogeny is concordant with previous studies, with scale and claw β-keratins being the basal clade of feather β-keratins (Greenwold et al. 2014; Zhang et al. 2014).

Figure 3A shows that the basal scale and claw β-keratin genes have common as well as unique regulators. Specific TFs of the homeodomain, SMAD, T-box, AP-2, Sox, nuclear receptor, and C2H2 ZF families uniquely regulate scale β-keratin genes (supplementary table S8, Supplementary Material online). Similarly, specific TFs of the homeodomain family uniquely regulate claw β-keratin genes. A few TFs of the homeodomain (ENSGALG00000023738, ENSGALG00000023195) and C2H2 ZF (ENSGALG00000016016, ENSGALG00000007357) families commonly regulate scale and claw β-keratin genes. Among the above TFs, two are repressors, one being common (ENSGALG00000007357, C2H2 ZF family) and the other (ENSGALG00000003759, nuclear receptor family) unique for scale. As in Greenwold et al. (2014), the Chr7 feather β-keratin ENSGALG00000006133 (named as Chr7 in fig. 3A) is basal to all feather β-keratins (fig. 3B). Notably, this gene is regulated by two TFs of the homeodomain family (ENSGALG00000023738, ENSGALG00000023195) in common with ancestral scale and claw β-keratin genes. ENSGALG00000023738 and ENSGALG00000023195 are members of the Iroquois (IRX-6) and Distal-less (DLX-6) homeobox 6 protein families, respectively, both having important roles in pattern formation, limb development, and appendage growth in vertebrates (Ogura et al. 2001; Velinov et al. 2012). ENSGALG00000007357 and ENSGALG00000003759 are B-cell lymphoma 6 (BCL6) and RAR-related orphan receptor A (RORALPHA1) proteins, respectively. BCL6 has been reported to function as a repressor for germinal center formation and influences apoptosis (Huynh et al. 2000). RORALPHA1 also has been reported as a repressor and is involved in organogenesis, differentiation, and circadian rhythm in vertebrates (Xiao et al. 2015). This implies that the TFs taking part in the structural framework of ancestral scale and claw, as well as proto-feathers, may have been recruited from related networks regulating pattern formation, limb development, appendage growth, and organogenesis.

FIG. 1. Hierarchical clustering and heatmap of gene-expression level and enriched functional categories in coexpression clusters. (A) The clustering of genes is based on expression similarity of the 2,314 sorted genes in different feather parts. The color bars on the left side denote the clusters obtained after cutting the hierarchical cluster tree at a specific height (see supplementary fig. S2, Supplementary Material online). The heatmap represents the expression pattern of the genes in each cluster. High expression is shown in red and low expression is shown in blue (see the color bar at the top left). The entries below the heatmap denote transcriptome IDs as given in supplementary figure S1, Supplementary Material online, and are explained in the text. The clusters which have significantly enriched GO terms are numbered in the figure. (B) The entries in the table represent the enriched functions corresponding to the significant clusters. The clusters are categorized as "β-keratin related" and "others." For keratin genes, the child GO terms such as "feather," "scale," "claw," "keratinocytes," and "intermediate filament" are assigned to the parental GO term "keratin." "NA" denotes "not available."

Panel B of figure 1:

β-Keratin related clusters
Cluster ID   Enriched keratin-related GO terms   Other enriched GO terms
1            Yes                                 Wnt signaling pathway, developmental protein, differentiation
2            Yes                                 Catalytic activity, hydrolase activity, mitochondrion, protein stabilization
5            No                                  Endodermal cell differentiation, ion transport, signal transducer activity
7            No                                  Hydrolase activity acting on carbon-nitrogen bonds, transmembrane signaling
9            No                                  Circadian rhythm, response to endoplasmic reticulum stress, chemotaxis
17           Yes                                 NA
21           Yes                                 Microtubule-based process, phosphatidylinositol-mediated signaling, pigmentation
23           Yes                                 Protein stabilization, hydrolase activity acting on carbon-nitrogen bond
24           No                                  Intracellular protein transport, neuron apoptotic process, cytoplasm
25           Yes                                 NA
27           Yes                                 NA
28           Yes                                 NA
29           Yes                                 NA
30           Yes                                 NA
31           No                                  Extracellular vesicular exosome, fatty acid metabolic process, translation
32           Yes                                 NA
33           Yes                                 NA
34           Yes                                 NA
35           No                                  Poly(A) RNA binding, extracellular vesicular exosome
36           Yes                                 NA

Other clusters
Cluster ID   Enriched keratin-related GO terms   Other enriched GO terms
3            NA                                  Transmembrane transport, sensory perception of pain
4            NA                                  Proteinaceous extracellular matrix, signal transduction, transaminase activity
6            NA                                  Membrane, exocytosis, synaptic vesicle, G-protein coupled receptor activity
8            NA                                  Signal transducer activity, immune system process, locomotion
11           NA                                  Cell adhesion, nucleolus, poly(A) RNA binding, heparin binding
12           NA                                  Protein dimerization activity, organelle, nucleus
13           NA                                  Potassium channel activity, ion channel activity
14           NA                                  Immune response, determination of left/right symmetry, immune system process
16           NA                                  Calcium ion import, phosphatidylinositol-mediated signaling
19           NA                                  Cofactor metabolic process, cell death, DNA helicase activity

FIG. 2. Summary of annotated TF–target gene relationships. (A) Overall predictions and (B) predictions supported by differential expression of pTF–target gene. The figure summarizes the number of annotated TFs and their corresponding numbers of target β-keratin genes. The annotated TFs are further classified according to the sub-class (scale, claw, keratinocytes, and feather) of the β-keratin genes they regulate. The entries "A" and "R" denote "activator" and "repressor," respectively. The Venn diagram shows the numbers of common and unique regulators among the four sub-classes of β-keratins. The Iroquois-class homeodomain protein (IRX6) was found to be the sole TF that commonly regulates all four sub-classes of β-keratin genes.

Counts shown in figure 2:

Panel   Sub-class      Target genes   TFs
A       All            91             81 (51A, 28R, 2A-R)
A       Scale          17             13 (11A, 2R)
A       Claw           9              7 (6A, 1R)
A       Feather        56             63 (36A, 27R)
A       Keratinocyte   9              18 (16A, 2R)
B       All            91             56 (42A, 12R, 2A-R)
B       Scale          17             13 (11A, 2R)
B       Claw           9              7 (6A, 1R)
B       Feather        56             38 (27A, 11R)
B       Keratinocyte   9              16 (14A, 2R)

FIG. 3. Regulator–target gene relationships, phylogenetic tree of β-keratin genes, and chromosome/sub-class-wise enriched regulators of β-keratin genes. (A) Regulators (TFs) and their target genes. A specific color is used to denote a sub-class of target genes (scale, claw, feather, or keratinocytes). TFs are denoted by red color, and activators (gray arrows) and repressors (red T shapes) are shown in conventional notations. Most feather β-keratin genes have chromosome-wise specific regulators, and only the Chr7 β-keratin gene shares a regulator with scale and claw β-keratin genes. (B) Maximum likelihood phylogeny of the annotated β-keratin genes. Scale and claw β-keratin genes are basal to feather β-keratin genes, whereas the Chr7 β-keratin gene is basal to all feather β-keratin genes. The clustering pattern of β-keratin genes showed consistency with the regulation pattern. (C) Enriched regulators of each clade of β-keratin genes identified by phylogenetic analysis. The enrichment is based on finding overrepresented regulators in each clade and is done to infer the major regulators of each clade that regulate multiple β-keratin genes.

Enriched regulators in each clade, as listed in panel C of figure 3 (in the figure, repressors are shown in red and activators in blue):

Clade 1, Chr2: ENSGALG00000007056 (C2H2), ENSGALG00000011435 (Ets), ENSGALG00000016927 (C2H2), ENSGALG00000014696 (C2H2)
Clade 3, Chr27: ENSGALG00000008403 (AT hook), ENSGALG00000001577 (Forkhead), ENSGALG00000009774 (Ets), ENSGALG00000011980 (Grainyhead), ENSGALG00000019647 (C2H2 ZF)
Clade 4, Chr25: ENSGALG00000002393 (AP-2), ENSGALG00000023607 (bHLH), ENSGALG00000000233 (C2H2 ZF), ENSGALG00000010101 (C2H2 ZF), ENSGALG00000012277 (bZIP)
Clade 6, Claw: ENSGALG00000023738 (HD), ENSGALG00000013342 (HD), ENSGALG00000009129 (HD)
Clade 7, Scale: ENSGALG00000007357 (C2H2 ZF), ENSGALG00000000316 (HD), ENSGALG00000008250 (T-box)

The remaining feather β-keratin genes showed consistent genomic organization, clustering pattern, and regulation, distinct from the ancestral scale and claw β-keratin genes. For example, all Chr2 β-keratins are clustered together as clade-1 and uniquely regulated by TFs of eight families (Ets, E2F, C2H2 ZF, nuclear receptor, bZIP, bHLH, homeodomain-POU, and forkhead). The two biggest loci, the Chr27 and Chr25 β-keratin genes, formed two clades (clade-2 and clade-3) and one clade (clade-4), respectively. Both Chr27 and Chr25 β-keratin genes have common as well as unique regulators. Specific TFs of six families (bHLH, RFX, homeodomain, C2H2 ZF, bZIP, and forkhead) commonly regulate Chr25 and Chr27 β-keratin genes. TFs of five families (bHLH, homeodomain, AP-2, E2F, and C2H2 ZF) uniquely regulate Chr25 β-keratin genes. TFs of 15 families (ARID/BRIGHT, Ets, Rel, C2H2 ZF, bZIP, grainyhead, AT hook, homeodomain, bHLH, HSF, TBP, SMAD, nuclear receptor, DPE2F, and forkhead) uniquely regulate Chr27 β-keratin genes. Notably, many of the Chr27 unique regulators are repressors. All Chr10 β-keratin genes were clustered as clade-5, although our approach annotated only one Chr10 β-keratin gene as being regulated by specific TFs of two families (Ets and bHLH). Lastly, keratinocyte β-keratin genes, which together were clustered as clade-8, are found to have flexible regulation in common with scale, claw, and feather β-keratin genes. This is consistent with their diverse roles related to scale, claw, and feather (Greenwold et al. 2014). It is important to note that the Animal Transcription Factor Database annotated at least 858 chicken genes as TFs (Zhang et al. 2012), whereas the CIS-BP database (Weirauch et al. 2014) has known or predicted TF–TFBS relationships for 496 TFs of chicken, with a paucity of TF–TFBS information for the remaining TFs. Fortunately, the TF–target gene relationships we have obtained include representatives of all genomic loci (except one feather β-keratin of Chr1 and one jun-transformed β-keratin of Chr6) and sub-classes for understanding the characteristic regulation of β-keratin genes.

The typical regulation of β-keratin genes observed above can be explained by the following innovative regulatory transition events. First, the transition from scale to claw, or vice versa, involved both retention and gain/loss of TFBSs. The motifs corresponding to the common TFs are the cases of retention; notably, the common repressor motif for ENSGALG00000007357 (C2H2 ZF family) might be important to repress expression of scale and claw β-keratin genes in feather. The unique repressor motif of scale β-keratins corresponding to ENSGALG00000003759 (nuclear receptor family) might be a gain or loss event, depending on whether the scale or claw β-keratins were ancestral, and might repress scale β-keratin gene expression in claw. Second, the Chr7 β-keratin gene, which is the first locus where feather β-keratin evolved from a scale or claw β-keratin gene, retains the ancestral cis motifs for ENSGALG00000023738 and ENSGALG00000023195. Third, most of the feather β-keratin genes were expanded from the basal Chr7 feather β-keratin gene, lost motifs recognized by ancestral scale and claw TFs, and gained new motifs to perform new functions. In summary, feather β-keratin genes underwent chromosome-wise expansions and recruited chromosome-wise regulators. Similar to the evolution of claw and scale β-keratin genes, the transition of feather β-keratin genes from Chr25 to Chr27, or the reverse, involved both retention and gain/loss of TFBSs, as they showed common and unique regulation. The characteristic regulatory control of feather β-keratin genes observed here may be essential to effect their mosaic expression in diverse feather parts, as has been suggested in a previous study (Ng et al. 2014).

Major Regulators of β-Keratin Genes

By a major regulator, we mean a TF that regulates multiple β-keratin genes. Following the annotation of β-keratin regulators, we inferred the major regulators of β-keratin genes in two ways.

Major TFs Predicted by Enrichment AnalysisFirst we identified over-represented regulators in eachclade of b-keratin genes (fig 3C also see supplementarytable S9 Supplementary Material online) and found thatthe clades of Chr2 Chr27 Chr25 claw and scale b-keratingenes have significantly enriched TFs (P valuelt 0001FDRlt 00001) clades 2 (having few Chr27 b-keratins)and 5 (having few Chr10 b-keratins) being very smallshowed no over-represented TFs Clade 8 in which all ker-atinocyte b-keratin genes are clustered also lacks over-represented TFs as keratinocyte b-keratin genes are regu-lated in small clusters together with scale claw and featherb-keratin genes In details among the 15 annotated TFsthat regulate Chr2 b-keratin genes (supplementary tableS8 Supplementary Material online) 4 specific TFs (all acti-vators) of 2 families (C2H2 and Ets) were over-representedin the Chr2 clade Similarly five specific TFs (four repressorsand one activator) of five families (AT hook forkhead Etsgrainyhead and C2H2 ZF) and five specific TFs (all activa-tors) of four families (AP-2 bHLH C2H2 ZF and bZIP) wereover-represented in the Chr27 and Chr25 clades respec-tively As mentioned earlier Chr25 and Chr27 b-keratingenes have common as well as unique TFs and we foundthat the five over-represented TFs of the Chr25 clade maybe categorized as 2 unique (ENSGALG00000023607 andENSGALG00000002393) and 3 (ENSGALG00000000233ENSGALG00000010101 and ENSGALG00000012277) com-mon with Chr27 b-keratin genes This implies that thethree common over-represented TFs have more targetson Chr25 and fewer on Chr27 and therefore were notover-represented in the Chr27 clade Similarly the five

Regulatory Divergence among Beta-Keratin Genes during Bird Evolution. doi:10.1093/molbev/msw165. MBE


over-represented TFs of the Chr27 clade may be categorized as three unique (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two common with Chr25 (ENSGALG00000019647 and ENSGALG00000001577). This implies that the two over-represented TFs of the Chr27 clade that are common with Chr25 mostly regulate Chr27 β-keratin genes, along with far fewer Chr25 β-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale β-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) commonly regulates scale and claw β-keratin genes. As this repressor was over-represented in the scale clade, it evidently has more targets among scale β-keratin genes than among claw β-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).

Major TFs Predicted by Differential Expression of β-Keratin Genes and Their pTFs

We conducted differential expression analysis of both β-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs that regulate a small cluster of β-keratin genes spanning two clades or lying within a clade and that were skipped by enrichment analysis, and (3) understand the structural diversity of feather beyond the previous study (Ng et al. 2014). We found that among the five different feather structures (EB, LB, EF, LF, and MF), Chr2 β-keratin genes were up-regulated in LF and MF (log fold change (FC) ≥ 2.59, P < 0.02) owing to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 β-keratin genes were up-regulated in EB and LB (logFC ≥ 1.66, P < 0.02) and Chr27 β-keratin genes were down-regulated in LF (logFC ≤ −2.58, P < 0.02) owing to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 β-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25

FIG. 4. Heatmap showing dynamic expression of β-keratin genes and their putative regulators. The heatmap is based on the average (normalized) expression level of the genes under each condition. Low to high expression is shown on a gradient scale from blue to red (see the color bar at top). The entries at the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote β-keratin genes on a chromosome or in a subclass. The entries at the extreme right denote β-keratins (green) corresponding to each chromosome or subclass (shown in the top middle) and their corresponding TFs (black). As the expression differences of TFs and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate β-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocyte genes are considered under that subclass. About 25 of the 81 pTFs did not show differential expression and are marked by a symbol in the figure. The over-represented TFs are indicated by a separate symbol.



and Chr27 were not differentially expressed). While a few of the differentially expressed pTFs of Chr25 and Chr27 are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs of the Chr25 clade (ENSGALG00000023607 and ENSGALG00000002393) along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774 and ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that, for Chr25 β-keratin genes, two unique over-represented TFs along with other unique activator TFs effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs together with other unique repressor TFs effect down-regulation of β-keratin genes in LF, the rachis portion of the feather that lacks barbs and barbules. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7: logFC ≥ 2.48; claw: logFC ≥ 1.87; scale: logFC ≥ 1.37; keratinocytes: logFC ≥ 1.08; P < 0.02) relative to LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including the enriched regulators of claw and scale β-keratin genes, were differentially expressed together with their target genes.

Differential expression of a pTF gene together with its target gene supported 262 (out of a total of 372) predicted TF–target gene relationships involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found 13 putative TFs (pTFs) regulating 17 target genes for scale, 7 pTFs/9 target genes for claw, 16 pTFs/9 target genes for keratinocyte, and 38 pTFs/56 target genes for feather. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes satisfied |PCC| > 0.8 with their target gene but were not differentially expressed together with it. Based on this stricter rule, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided 10 additional major TFs that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade, and that were skipped by enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential expression analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.

Table 1. Inferred major regulators of β-keratin genes with putative roles. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes. A tentative role applies to the row on which it appears and to the rows below it, up to the next listed role.

TF gene ID            TF family         Activator/repressor  Enriched  Diff. expressed  No. of target genes  Tentative role
ENSGALG00000007056    C2H2 ZF           activator            Yes       Yes              5                    Uniquely regulates Chr2 β-keratins
ENSGALG00000011435    Ets               activator            Yes       Yes              4
ENSGALG00000016927    C2H2 ZF           activator            Yes       Yes              4
ENSGALG00000009774    Ets               repressor            Yes       Yes              15                   Uniquely regulates Chr27 β-keratins
ENSGALG00000011980    Grainyhead        repressor            Yes       Yes              12
ENSGALG00000000233    C2H2 ZF           activator            Yes       Yes              17                   Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101    C2H2 ZF           activator            Yes       Yes              21
ENSGALG00000012277    bZIP              activator            Yes       Yes              17
ENSGALG00000008014    bZIP              activator            No        Yes              13
ENSGALG00000006210    bHLH              activator            No        Yes              4
ENSGALG00000011053    Homeodomain       activator            No        Yes              10
ENSGALG00000000836    RFX               activator            No        Yes              6
ENSGALG00000023607    bHLH              activator            Yes       Yes              10                   Uniquely regulates Chr25 β-keratins
ENSGALG00000002393    AP-2              activator            Yes       Yes              13
ENSGALG00000003404                      activator            No        Yes              3
ENSGALG00000011055    Homeodomain       activator            No        Yes              3
ENSGALG00000001143    Ets               activator            No        Yes              1                    Uniquely regulates Chr10 β-keratins
ENSGALG00000023742    bHLH              activator            No        Yes              1
ENSGALG00000023738    Homeodomain       activator            Yes       Yes              9                    Commonly regulates scale, claw, Chr7 β-keratins
ENSGALG00000023195    Homeodomain       activator            No        Yes              6
ENSGALG00000013342    Homeodomain       activator            Yes       Yes              6                    Uniquely regulates claw β-keratins
ENSGALG00000009129    Homeodomain       activator            Yes       Yes              4
ENSGALG00000007357    C2H2 ZF           repressor            Yes       Yes              10                   Commonly regulates scale and claw β-keratins
ENSGALG00000000316    Homeodomain       activator            Yes       Yes              5                    Uniquely regulates scale β-keratins
ENSGALG00000008250    T-box             activator            Yes       Yes              6
ENSGALG00000003759    Nuclear receptor  repressor            No        Yes              3



Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the rachis portion of flight feather that lacks barbs and barbules (LF). Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we hypothesize that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, have been modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the Electrophoretic Mobility Shift Assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included together with biotin-labeled probe and excess unlabeled probe, a very faint TF–TFBS complex or none was observed, indicating that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories with a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes from the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with purified TF combined with the labeled probe (lane b). Purified TF combined with a labeled probe and a 10-fold excess of unlabeled probe (lane c) showed quantitative competition of probes for binding. Purified TF combined with a mutated probe (lane d) confirmed the binding specificity of the TF. “Target” denotes the gene with the predicted TFBS in its promoter. “PWM” denotes the sequence logo of the position weight matrix for the binding-site motifs.



originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression together with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of body contour feather and three developing conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developing conditions of feather, each condition having 3 biological replicates (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using Tophat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing from the previous annotation (Ng et al. 2014) were manually added to Gallus_gallus.galGal4.77.gtf. Moreover, reads were also allowed to map to novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation, Gallus_gallus.galGal4.77.gtf. The alignment quality and distribution of the reads were estimated using SAMtools version 200 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) approach using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
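The FPKM definition and the redistribution of multiple-hit reads described above can be illustrated with a minimal Python sketch. It is not the Cufflinks implementation: the function names are hypothetical, and the equal split used when no candidate gene has uniquely mapped reads is an assumption made only so the example is complete.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # fragments / (length / 1e3) / (library size / 1e6)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def multihit_weights(unique_counts, candidate_genes):
    """Share of one multiple-hit read given to each candidate gene,
    proportional to the genes' uniquely mapped read counts."""
    total = sum(unique_counts[g] for g in candidate_genes)
    if total == 0:
        # Assumption: fall back to an equal split when no unique evidence exists.
        return {g: 1.0 / len(candidate_genes) for g in candidate_genes}
    return {g: unique_counts[g] / total for g in candidate_genes}
```

For instance, a read that maps equally well to two paralogous β-keratin genes with 30 and 10 uniquely mapped reads would be credited 0.75 and 0.25 to them, respectively.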

Genes with FPKM ≥ 1 in at least one of the conditions were considered “expressed” and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
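As a rough illustration of the filtering and normalization step, the sketch below scales every sample so that its upper quartile matches that of a chosen reference sample (EF3 in this study). It is a simplified stand-in for the procedure of Bullard et al. (2010): the function names are hypothetical, and computing the quartile over non-zero values with Python's inclusive quantile method is an assumption.

```python
import statistics


def upper_quartile(values):
    """75th percentile of the non-zero values (assumed convention)."""
    nonzero = [v for v in values if v > 0]
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


def normalize_to_reference(samples, reference):
    """Rescale each sample so its upper quartile equals the reference's."""
    ref_uq = upper_quartile(samples[reference])
    return {
        name: [v * ref_uq / upper_quartile(vals) for v in vals]
        for name, vals in samples.items()
    }


def is_expressed(fpkm_by_condition, threshold=1.0):
    """A gene is kept if it reaches the threshold in at least one condition."""
    return any(v >= threshold for v in fpkm_by_condition)
```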

The Pearson correlation coefficient (PCC) between the expression profiles of genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset consisting of the β-keratin genes and the associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the “gplots” package of R (Warnes et al. 2015). The hierarchical tree was cut to obtain clusters/groups of co-expressed genes using the “hclust” function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with false discovery rate correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered likely related to keratin, we added the condition that its genes should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
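The first of the two steps above (selecting genes whose expression profile correlates with a β-keratin gene) can be sketched in plain Python as follows. The function names and the toy profiles are hypothetical, and returning 0 for a constant profile is an assumption; the subsequent complete-linkage clustering, done in R in this study, is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def keratin_associated(profiles, keratin_ids, cutoff=0.8):
    """Keratin genes plus every gene with PCC >= cutoff to at least one of them."""
    selected = set(keratin_ids)
    for gene, profile in profiles.items():
        if gene in selected:
            continue
        if any(pearson(profile, profiles[k]) >= cutoff for k in keratin_ids):
            selected.add(gene)
    return selected
```

A distance such as 1 − PCC could then be passed to any complete-linkage hierarchical clustering routine.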

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer TF–target gene relationships, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter region



of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
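The enrichment test in this step can be illustrated by a one-sided Fisher's exact test, which asks whether a motif occurs in more promoters of a module than expected from its frequency among all expressed genes. The sketch below is a minimal version under stated assumptions: the function names are hypothetical, the background is taken to include the module genes, and the Benjamini–Hochberg procedure is assumed for the FDR, which this section does not specify for the motif analysis.

```python
from math import comb


def fisher_enrichment_p(hits_in_module, module_size, hits_in_background, background_size):
    """One-sided P value: probability of at least `hits_in_module` motif-bearing
    promoters in a random draw of `module_size` genes from the background."""
    max_hits = min(module_size, hits_in_background)
    numerator = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, module_size - k)
        for k in range(hits_in_module, max_hits + 1)
    )
    return numerator / comb(background_size, module_size)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A motif would then be called enriched in a module when its adjusted value falls below the chosen FDR threshold.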

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is a true binding site, the gene encoding the TF protein was required to be correlated in expression (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships for β-keratin and some other genes, and putative TF–TFBS interactions.

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 10^-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-Keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within one other species rather than assessing conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as “queries” and turkey β-keratins as “subjects.” For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five turkey β-keratin genes (based on lowest E value) that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the five homologous promoters of turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and checking whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
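The final conservation criterion can be written as a short check. The sketch below assumes that each motif hit has already been reduced to a (position, strand) pair in the coordinates of the promoter alignment, with positions compared by absolute difference; these data structures and the function name are hypothetical and are not taken from the FIMO or MUSCLE output formats.

```python
def is_conserved(chicken_hit, turkey_hits_by_promoter, max_distance=200):
    """True if any homologous turkey promoter carries a hit of the same motif
    on the same strand within `max_distance` bp of the chicken hit."""
    position, strand = chicken_hit
    for hits in turkey_hits_by_promoter.values():
        for t_position, t_strand in hits:
            if t_strand == strand and abs(t_position - position) <= max_distance:
                return True
    return False
```

For example, a chicken hit at position 350 on the plus strand is counted as conserved if any of the five turkey promoters has a plus-strand hit between positions 150 and 550.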

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes that have common specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA-7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA-7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we performed differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
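The last rule can be expressed as a simple filter over the predicted pairs. In the sketch below, "at the same time" is interpreted as the pTF and its target being differentially expressed in at least one shared condition; that interpretation, the function name, and the input layout (a mapping from gene to the set of conditions in which it was called differentially expressed, e.g. from edgeR output) are assumptions made for illustration.

```python
def supported_relationships(pairs, de_conditions):
    """Keep (tf, target) pairs whose two genes are differentially expressed
    in at least one common condition."""
    kept = []
    for tf, target in pairs:
        shared = de_conditions.get(tf, set()) & de_conditions.get(target, set())
        if shared:
            kept.append((tf, target))
    return kept
```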

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and for production of the TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA assay between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are the wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized with the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). MJB was supported by a postdoctoral fellowship of Academia Sinica. MJB and WHL conceived this study. MJB, CPY, JJL, and WHL designed the experiments and performed computational analysis. MJB carried out the EMSA experiments. CSN, TYW, and HHL offered technical consultation and professional suggestions. MJB, WHL, and CSN wrote the article. All authors read and revised the article.



References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate – a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (alpha) and beta (beta) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (beta) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S Puchegger S Kirchner HO Weiss IM Peterlik H 2010 Keratinhomogeneity in the tail feathers of Pavo cristatus and Pavo cristatusmut alba J Struct Biol 172270ndash275

Pettingill OS Breckenridge WJ 2012 Ornithology in laboratory and fieldOrlando Academic press Elsevier Science

Presland RB Whitbread LA Rogers GE 1989 Avian keratin genes IIChromosomal arrangement and close linkage of three gene familiesJ Mol Biol 209561ndash576

Prum RO 1999 Development and evolutionary origin of feathers J ExpZool 285291ndash306

Prum RO Brush AH 2002 The evolutionary origin and diversification offeathers Q Rev Biol 77261ndash295

Prum RO Brush AH 2003 Which came first the feather or the bird SciAm 28884ndash93

Regal PJ 1975 The evolutionary origin of feathers Q Rev Biol 5035ndash66Robinson MD McCarthy DJ Smyth GK 2010 edgeR a Bioconductor

package for differential expression analysis of digital gene expressiondata Bioinformatics 26139ndash140

Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin NSchwikowski B Ideker T 2003 Cytoscape a software environmentfor integrated models of biomolecular interaction networks GenomeRes 132498ndash2504

Smedley D Haider S Durinck S Pandini L Provero P Allen J Arnaiz OAwedh MH Baldock R Barbiera G et al 2015 The BioMart com-munity portal an innovative alternative to large centralized datarepositories Nucleic Acids Res 43W589ndashW598

Trapnell C Hendrickson DG Sauvageau M Goff L Rinn JL Pachter L2013 Differential analysis of gene regulation at transcript resolutionwith RNA-seq Nat Biotechnol 3146ndash53

Trapnell C Pachter L Salzberg SL 2009 TopHat discovering splice junc-tions with RNA-Seq Bioinformatics 251105ndash1111

Vandebergh W Bossuyt F 2012 Radiation and functional diversificationof alpha keratins during early vertebrate evolution Mol Biol Evol29995ndash1004

Velinov M Ahmad A Brown-Kipphut B Shafiq M Blau J Cooma R RothP Iqbal MA 2012 A 07 Mb de novo duplication at 7q213 includingthe genes DLX5 and DLX6 in a patient with split-handsplit-footmalformation Am J Med Genet A 158A3201ndash3206

Warnes GR Bolker B Bonebakker L Gentleman R Liaw WHA Lumley TMaechle M Magnusson A Moelle S Schwartz M Venables B 2015Various R programming tools for plotting data R Package Versiongplots21707

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci U S A. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the lower cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci U S A. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.

Bhattacharjee et al. doi:10.1093/molbev/msw165 MBE


discoveries that some ancient types of feather β-keratin genes were already present in feathered dinosaurs (Xu et al. 2012, 2015). Later, more complex feather β-keratin genes evolved in birds, leading to feather diversification for various functions such as thermoregulation and flight. Greenwold et al. (2014) studied 48 bird genomes and found that in most bird species, scale, claw, and keratinocyte β-keratin genes were localized in Chr25, whereas most feather β-keratin genes were localized in Chr25 and Chr27, along with a few others in Chr1, Chr2, Chr6, Chr7, and Chr10. Ng et al. (2014) proposed that the diversity of feathers is partly due to differential expression of feather β-keratin genes in different feather structures.

While much research has been conducted on the duplication and functional diversification of β-keratin genes, little has been done on the regulatory network that controls the expression of different classes (claw, scale, feather, and keratinocyte) of β-keratin genes and on the differential expression of feather β-keratin genes in different feather parts.

A transcription factor (TF) binds to specific DNA sequences (TF binding sites, abbreviated as TFBSs) in the promoter of a gene, thereby controlling the expression timing and level of the gene. TF–TFBS pairs, therefore, are important elements of a gene regulatory network. The purpose of this study is to identify the TFBSs in β-keratin genes and their cognate TFs and to understand the common and unique regulation of different classes of β-keratin genes. For this purpose, we reanalyze the transcriptomes obtained previously from different parts of developing contour feather and flight feather of chicken (Ng et al. 2014). These transcriptomes of feather development are excellent materials for inferring strongly co-expressed genes with a potentially common regulatory basis and for predicting TF–TFBS pairs (Yu et al. 2015).

Results and Discussion

Gene Co-Expression Modules

Supplementary figure S1, Supplementary Material online, shows the different parts of body contour feather and flight feather from which the transcriptomes (RNA-seq data) were obtained previously (Ng et al. 2014). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). Three biological replicates were obtained for each of these five feather parts, so that in total 15 transcriptomes were obtained. A gene is considered expressed if its FPKM (fragments per kilobase of exon per million mapped reads) is ≥1 in at least 1 of the 15 transcriptomes. In total, 11,533 genes, including 130 β-keratin genes, were expressed (supplementary table S1, Supplementary Material online). In terms of the sub-classification of β-keratins (Greenwold et al. 2014), the 130 expressed β-keratin genes, which represent 87% of the β-keratin genes annotated in Ng et al. (2014), include 18 scale, 11 claw, 15 keratinocyte, and 86 feather β-keratin genes.

For the expressed genes (11,533), the Pearson correlation coefficient (PCC) in expression level is high (PCC > 0.97) between the biological replicates, except for LB (PCC = 0.93, 0.91, 0.76) and EF (PCC = 0.96, 0.95, 0.86) (supplementary fig. S2, Supplementary Material online, above the diagonal). This implies that the LB and EF replicates are more heterogeneous. Between the conditions, EB and EF have a high mean PCC (0.96 ± 0.02), and so do MF and LF (mean PCC = 0.97 ± 0.01). By applying the criterion of PCC ≥ 0.8 in expression level between β-keratin and other genes (supplementary table S2, Supplementary Material online), we extracted from the 11,533 expressed genes a subset of 2,314 genes (supplementary table S3, Supplementary Material online). This subset includes 128 β-keratin genes and 2,186 nonkeratin genes (only 2 expressed β-keratin genes did not satisfy PCC ≥ 0.8). The PCC pattern for the subset of 2,314 genes (supplementary fig. S2, Supplementary Material online, below the diagonal) is consistent with that for the 11,533 expressed genes.
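The two filters described above (the expression cutoff and the co-expression cutoff) can be sketched in a few lines of Python. This is only an illustration on toy data, not the pipeline used in the study: the function names (pearson, expressed, coexpressed_with), the gene labels, and the reading that a gene is retained when it reaches the PCC cutoff with at least one β-keratin gene are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def expressed(fpkm, min_fpkm=1.0):
    """Keep genes whose FPKM is >= min_fpkm in at least one transcriptome."""
    return {g: v for g, v in fpkm.items() if max(v) >= min_fpkm}

def coexpressed_with(fpkm, seed_genes, min_pcc=0.8):
    """Return every gene pair member linked to a seed gene by PCC >= min_pcc.

    seed_genes plays the role of the beta-keratin genes (assumption: a gene
    is kept if it reaches the cutoff with at least one seed gene).
    """
    seeds = [g for g in seed_genes if g in fpkm]
    kept = set()
    for gene, profile in fpkm.items():
        for seed in seeds:
            if seed != gene and pearson(profile, fpkm[seed]) >= min_pcc:
                kept.update((gene, seed))
    return kept

# Hypothetical FPKM profiles over five samples (not data from the study).
toy = {
    "bk1":   [1, 2, 3, 4, 5],
    "bk2":   [5, 1, 4, 2, 3],
    "geneA": [2, 4, 6, 8, 10],
    "geneB": [0.1, 0.2, 0.1, 0.3, 0.2],   # never reaches FPKM 1
    "geneC": [5, 4, 3, 2, 1],
}
subset = coexpressed_with(expressed(toy), ["bk1", "bk2"])
print(sorted(subset))
```

With real data the same two steps would be applied to the 15-column FPKM matrix, with the annotated β-keratin genes as the seed set.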

By hierarchical clustering, we classified the 2,314 genes in the above subset into 47 clusters (fig. 1A; see also supplementary fig. S3 and table S4, Supplementary Material online). To increase the probability of co-expressed genes being co-regulated, we added the condition that the genes in a cluster share the same Gene Ontology (GO) term, so that they are likely involved in the same biological function. The GO functional enrichment analysis showed that 20 (ID: 1, 2, 5, 7, 9, 17, 21, 23–25, 27–36) of the 47 clusters, where β-keratin genes are mostly present, are significantly enriched in functional biological terms (fig. 1B; supplementary table S5, Supplementary Material online). Note that the majority of the enriched terms are related to keratin-based cellular structural processes, such as cytoskeleton, intermediate filament, and keratin (Kreplak et al. 2004; Alibardi et al. 2009; Herrmann et al. 2009; Lee et al. 2012). Furthermore, 10 other clusters (ID: 3, 4, 6, 8, 11–14, 16, 19), which lack β-keratin genes, also showed enriched functional terms. Therefore, we consider the 30 (20 + 10) significant clusters as co-expressed modules suitable for inferring regulatory elements and annotating the regulatory network of genes, particularly for β-keratin genes.

Predicted TF–TFBS Pairs of β-Keratin Genes and Inferred TF–Target Gene Relationships

The TF–TFBS pairs of 496 chicken TFs in the CIS-BP database (cisbp.ccbr.utoronto.ca, last accessed 14 April 2016) (Weirauch et al. 2014) served as a valuable resource to scan and select enriched TFBSs (P value < 0.0001, FDR < 0.00001) in each of the co-expression modules (supplementary fig. S4, Supplementary Material online). To increase the probability of a selected TF being true, we added the condition that the gene coding for the TF recognizing the motif and the gene in which the TFBS is overrepresented have a high PCC (|PCC| > 0.8). This condition helped us to identify the TF–target gene relationships (summarized in fig. 2A and detailed in supplementary table S6, Supplementary Material online). Overall, we identified 13,624 pTF–target gene pairs (full data not shown), in which 213 TFs (pTFs) putatively regulate 1,857 genes by positive or negative interactions. In terms of β-keratin genes, we found 372 TF–target gene pairs in which 81 TFs putatively regulate 91 β-keratin genes, which may be further categorized as follows: for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocytes, 18 pTFs/9 target genes; and for feather, 63 pTFs/56 target genes. Interestingly, feather β-keratins were found to be regulated by many repressors. The Venn diagram shows that the β-keratin sub-classes (scale, claw, keratinocyte, and feather) have common as well as unique regulators (fig. 2A). As the β-keratin genes expanded in birds by gene duplication, the common and unique regulators found herein may be useful to unveil their regulatory transition, as discussed below.
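The correlation condition on a TF and a gene carrying its enriched motif can be illustrated as follows. This is a minimal sketch under stated assumptions: the motif scan is taken as already done (the mapping tf_to_motif_targets is hypothetical input), and labeling a positively correlated TF as an activator and a negatively correlated TF as a repressor is an interpretation of the "positive or negative interactions" mentioned above, not a documented step of the original analysis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def predict_pairs(expr, tf_to_motif_targets, min_abs_pcc=0.8):
    """List (tf, target, pcc, mode) for pairs with |PCC| > min_abs_pcc.

    expr                 gene -> expression profile across samples
    tf_to_motif_targets  TF gene -> genes whose promoters carry its enriched
                         motif (assumed to come from a separate motif scan)
    """
    pairs = []
    for tf, targets in tf_to_motif_targets.items():
        for target in targets:
            r = pearson(expr[tf], expr[target])
            if abs(r) > min_abs_pcc:
                # Assumption: the sign of the correlation gives the mode.
                mode = "activator" if r > 0 else "repressor"
                pairs.append((tf, target, r, mode))
    return pairs

# Hypothetical profiles and motif hits, for illustration only.
expr = {
    "tfA": [1, 2, 3, 4, 5],
    "tfR": [5, 4, 3, 2, 1],
    "bk1": [2, 4, 6, 8, 10],
    "bk2": [5, 1, 4, 2, 3],
}
motif_hits = {"tfA": ["bk1", "bk2"], "tfR": ["bk1"]}
for tf, target, r, mode in predict_pairs(expr, motif_hits):
    print(tf, target, mode)
```

Pairs that carry the motif but fail the correlation cutoff (tfA and bk2 in the toy data) are simply dropped, which is what makes the second condition a filter on the motif-based predictions.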

β-Keratin Genes in Turkey and Evolutionary Conservation of Predicted TFBS

We further considered the condition that the TFBS present in the promoter region of a gene has been conserved in the promoter region of its orthologue in turkey. For this purpose, we scanned the genome of turkey (release 5.0) using chicken β-keratin genes as queries. This approach identified 142 β-keratin genes in turkey, which include 14 scale, 12 claw, 2 keratinocyte, and 114 feather β-keratin genes (supplementary table S7 and fig. S5, Supplementary Material online). Thus, chicken and turkey have similar numbers of β-keratin genes. Unfortunately, only 70 of the annotated turkey β-keratin genes could be assessed for upstream 1,000-bp, and another 10 genes for upstream 200-bp, promoter sequences from the turkey genome. We found that 192 predicted TF–target gene pairs (out of 372 pairs) satisfied the conservation criterion (supplementary table S6 and fig. S5, Supplementary Material online). Importantly, the TFBS found in the promoter of a particular sub-class of β-keratin (scale, claw, keratinocyte, or feather) in chicken was conserved in the promoter of the corresponding sub-class in turkey β-keratin genes. This supports the predicted TF–TFBS relationships of β-keratin genes in chicken.

Regulatory Transition of β-Keratin Genes

The pTF–target gene relationships of different sub-classes of β-keratin genes are presented in figure 3A and were sorted from supplementary tables S6–S8, Supplementary Material online. We also performed maximum likelihood phylogenetic reconstruction of β-keratins to understand the regulatory transition associated with the evolution of β-keratin genes (fig. 3B; see supplementary fig. S6, Supplementary Material online, for an expanded phylogeny). The phylogeny is concordant with previous studies, with scale and claw β-keratins being basal to feather β-keratins (Greenwold et al. 2014; Zhang et al. 2014).

Figure 3A shows that the basal scale and claw β-keratin genes have common as well as unique regulators. Specific TFs of the homeodomain, SMAD, T-box, AP-2, Sox, nuclear receptor, and C2H2 ZF families uniquely regulate scale β-keratin genes (supplementary table S8, Supplementary Material

[Fig. 1A: heatmap of the 2,314 genes (rows, grouped into numbered clusters) across the 15 transcriptomes (columns: EB1–EB3, LB1–LB3, EF1–EF3, MF1–MF3, LF1–LF3).]

[Fig. 1B]
Cluster ID | Enriched keratin-related GO terms | Other enriched GO terms
β-keratin related clusters
1 | Yes | Wnt signaling pathway; developmental protein; differentiation
2 | Yes | Catalytic activity; hydrolase activity; mitochondrion; protein stabilization
5 | No | Endodermal cell differentiation; ion transport; signal transducer activity
7 | No | Hydrolase activity, acting on carbon-nitrogen bonds; transmembrane signaling
9 | No | Circadian rhythm; response to endoplasmic reticulum stress; chemotaxis
17 | Yes | NA
21 | Yes | Microtubule-based process; phosphatidylinositol-mediated signaling; pigmentation
23 | Yes | Protein stabilization; hydrolase activity, acting on carbon-nitrogen bond
24 | No | Intracellular protein transport; neuron apoptotic process; cytoplasm
25 | Yes | NA
27 | Yes | NA
28 | Yes | NA
29 | Yes | NA
30 | Yes | NA
31 | No | Extracellular vesicular exosome; fatty acid metabolic process; translation
32 | Yes | NA
33 | Yes | NA
34 | Yes | NA
35 | No | Poly(A) RNA binding; extracellular vesicular exosome
36 | Yes | NA
Other clusters
3 | NA | Transmembrane transport; sensory perception of pain
4 | NA | Proteinaceous extracellular matrix; signal transduction; transaminase activity
6 | NA | Membrane; exocytosis; synaptic vesicle; G-protein coupled receptor activity
8 | NA | Signal transducer activity; immune system process; locomotion
11 | NA | Cell adhesion; nucleolus; poly(A) RNA binding; heparin binding
12 | NA | Protein dimerization activity; organelle; nucleus
13 | NA | Potassium channel activity; ion channel activity
14 | NA | Immune response; determination of left/right symmetry; immune system process
16 | NA | Calcium ion import; phosphatidylinositol-mediated signaling
19 | NA | Cofactor metabolic process; cell death; DNA helicase activity

FIG. 1. Hierarchical clustering and heatmap of gene-expression level and enriched functional categories in coexpression clusters. (A) The clustering of genes is based on expression similarity of the 2,314 sorted genes in different feather parts. The color bars on the left side denote the clusters obtained after cutting the hierarchical cluster tree at a specific height (see supplementary fig. S2, Supplementary Material online). The heatmap represents the expression pattern of the genes in each cluster. High expression is shown in red and low expression is shown in blue (see the color bar at the left top). The entries below the heatmap denote transcriptome IDs as given in figure S1, Supplementary Material online, and are explained in the text. The clusters that have significantly enriched GO terms are numbered in the figure. (B) The entries in the table represent the enriched functions corresponding to the significant clusters. The clusters are categorized as “β-keratin related” and “others.” For keratin genes, the child GO terms such as “feather,” “scale,” “claw,” “keratinocytes,” and “intermediate filament” are assigned to the parental GO term “keratin.” “NA” denotes “not available.”


online). Similarly, specific TFs of the homeodomain family uniquely regulate claw β-keratin genes. A few TFs of the homeodomain (ENSGALG00000023738, ENSGALG00000023195) and C2H2 ZF (ENSGALG00000016016, ENSGALG00000007357) families commonly regulate scale and claw β-keratin genes. Among the above TFs, two are repressors, one being common (ENSGALG00000007357, C2H2 ZF family) and the other (ENSGALG00000003759, nuclear receptor family) unique for scale. As in Greenwold et al. (2014), the Chr7 feather β-keratin ENSGALG00000006133 (named Chr7 in fig. 3A) is basal to all feather β-keratins (fig. 3B). Notably, this gene is regulated by two TFs of the homeodomain family (ENSGALG00000023738, ENSGALG00000023195) in common with ancestral scale and claw β-keratin genes. ENSGALG00000023738 and ENSGALG00000023195 are members of the Iroquois (IRX-6) and Distal-less (DLX-6) homeobox 6 protein families, respectively, both having important roles in pattern formation, limb development, and appendage growth in vertebrates (Ogura et al. 2001; Velinov et al. 2012). ENSGALG00000007357 and ENSGALG00000003759 are B-cell lymphoma 6 (BCL6) and RAR-related orphan receptor A (RORALPHA1) protein

[Fig. 2: Venn diagrams of TFs that regulate β-keratin genes; IRX6 (Iroquois-class homeodomain protein) is labeled in both panels.]

Panel | Sub-class | Target genes | TFs
A | All β-keratin genes | 91 | 81 (51 A, 28 R, 2 A-R)
A | Scale | 17 | 13 (11 A, 2 R)
A | Claw | 9 | 7 (6 A, 1 R)
A | Feather | 56 | 63 (36 A, 27 R)
A | Keratinocyte | 9 | 18 (16 A, 2 R)
B | All β-keratin genes | 91 | 56 (42 A, 12 R, 2 A-R)
B | Scale | 17 | 13 (11 A, 2 R)
B | Claw | 9 | 7 (6 A, 1 R)
B | Feather | 56 | 38 (27 A, 11 R)
B | Keratinocyte | 9 | 16 (14 A, 2 R)

FIG. 2. Summary of annotated TF–target gene relationships. (A) Overall predictions and (B) predictions supported by differential expression of pTF–target gene pairs. The figure summarizes the number of annotated TFs and their corresponding numbers of target β-keratin genes. The annotated TFs are further classified according to the sub-class (scale, claw, keratinocyte, or feather) of the β-keratin genes they regulate. The entries “A” and “R” denote “activator” and “repressor,” respectively. The Venn diagram shows the numbers of common and unique regulators among the four sub-classes of β-keratins. The Iroquois-class homeodomain protein (IRX6) was found to be the sole TF that commonly regulates all four sub-classes of β-keratin genes.

[Fig. 3A and 3B: network of TFs (activators and repressors) and their target genes, colored by sub-class (claw, keratinocyte, scale, and feather), and phylogenetic tree of β-keratin genes with clades labeled Chr2, Chr27, Chr10, and Chr7.]

[Fig. 3C] Enriched regulators in each clade (red: repressor; blue: activator)
Clade 1 (Chr2): ENSGALG00000007056 (C2H2), ENSGALG00000011435 (Ets), ENSGALG00000016927 (C2H2), ENSGALG00000014696 (C2H2)
Clade 3 (Chr27): ENSGALG00000008403 (AT hook), ENSGALG00000001577 (Forkhead), ENSGALG00000009774 (Ets), ENSGALG00000011980 (Grainyhead), ENSGALG00000019647 (C2H2 ZF)
Clade 4 (Chr25): ENSGALG00000002393 (AP-2), ENSGALG00000023607 (bHLH), ENSGALG00000000233 (C2H2 ZF), ENSGALG00000010101 (C2H2 ZF), ENSGALG00000012277 (bZIP)
Clade 6 (Claw): ENSGALG00000023738 (HD), ENSGALG00000013342 (HD), ENSGALG00000009129 (HD)
Clade 7 (Scale): ENSGALG00000007357 (C2H2 ZF), ENSGALG00000000316 (HD), ENSGALG00000008250 (T-box)

FIG. 3. Regulator–target gene relationships, phylogenetic tree of β-keratin genes, and chromosome/sub-class-wise enriched regulators of β-keratin genes. (A) Regulators (TFs) and their target genes. A specific color is used to denote a sub-class of target genes (scale, claw, feather, or keratinocytes). TFs are denoted by red color, and activators (gray arrows) and repressors (red T shapes) are shown in conventional notations. Most feather β-keratin genes have chromosome-wise specific regulators, and only the Chr7 β-keratin gene shares a regulator with scale and claw β-keratin genes. (B) Maximum likelihood phylogeny of the annotated β-keratin genes. Scale and claw β-keratin genes are basal to feather β-keratin genes, whereas the Chr7 β-keratin gene is basal to all feather β-keratin genes. The clustering pattern of β-keratin genes showed consistency with the regulation pattern. (C) Enriched regulators of each clade of β-keratin genes identified by phylogenetic analysis. The enrichment is based on finding overrepresented regulators in each clade and is done to infer the major regulators of each clade that regulate multiple β-keratin genes.


respectively. BCL6 has been reported to function as a repressor for germinal center formation and to influence apoptosis (Huynh et al. 2000). RORALPHA1 has also been reported as a repressor and is involved in organogenesis, differentiation, and circadian rhythm in vertebrates (Xiao et al. 2015). This implies that the TFs taking part in the structural framework of ancestral scale and claw, as well as proto-feathers, may have been recruited from related networks regulating pattern formation, limb development, appendage growth, and organogenesis.

The remaining feather β-keratin genes showed consistent genomic organization, clustering pattern, and regulation, distinct from the ancestral scale and claw β-keratin genes. For example, all Chr2 β-keratins are clustered together as clade-1 and uniquely regulated by TFs of eight families (Ets, E2F, C2H2 ZF, nuclear receptor, bZIP, bHLH, homeodomain-POU, and forkhead). The two biggest loci, the Chr27 and Chr25 β-keratin genes, formed two clades (clade-2 and clade-3) and one clade (clade-4), respectively. Both Chr27 and Chr25 β-keratin genes have common as well as unique regulators. Specific TFs of six families (bHLH, RFX, homeodomain, C2H2 ZF, bZIP, and forkhead) commonly regulate Chr25 and Chr27 β-keratin genes. TFs of five families (bHLH, homeodomain, AP-2, E2F, and C2H2 ZF) uniquely regulate Chr25 β-keratin genes. TFs of 15 families (ARID/BRIGHT, Ets, Rel, C2H2 ZF, bZIP, grainyhead, AT hook, homeodomain, bHLH, HSF, TBP, SMAD, nuclear receptor, DP/E2F, and forkhead) uniquely regulate Chr27 β-keratin genes. Notably, many of the Chr27 unique regulators are repressors. All Chr10 β-keratin genes were clustered as clade-5, although our approach annotated only one Chr10 β-keratin gene as being regulated by specific TFs of two families (Ets and bHLH). Lastly, keratinocyte β-keratin genes, which were clustered together as clade-8, were found to be regulated in common with scale, claw, and feather β-keratin genes. This is consistent with their diverse roles related to scale, claw, and feather (Greenwold et al. 2014). It is important to note that the Animal Transcription Factor Database annotated at least 858 chicken genes as TFs (Zhang et al. 2012), whereas the CIS-BP database (Weirauch et al. 2014) has known or predicted TF–TFBS relationships for 496 chicken TFs, with a paucity of TF–TFBS information for the remaining TFs. Fortunately, the TF–target gene relationships we have obtained include representatives of all genomic loci (except one feather β-keratin of Chr1 and one jun-transformed β-keratin of Chr6) and sub-classes for understanding the characteristic regulation of β-keratin genes.

The typical regulation of β-keratin genes observed above can be explained by the following innovative regulatory transition events. First, the transition from scale to claw, or vice versa, involved both retention and gain/loss of TFBSs. The motifs corresponding to the common TFs are cases of retention; notably, the common repressor motif for ENSGALG00000007357 (C2H2 ZF family) might be important for repressing expression of scale and claw β-keratin genes in feather. The unique repressor motif of scale β-keratins corresponding to ENSGALG00000003759 (nuclear receptor family) might be a gain or loss event, depending on whether the scale or claw β-keratins were ancestral, and might repress scale β-keratin gene expression in claw. Second, the Chr7 β-keratin gene, which is the first locus where feather β-keratin evolved from a scale or claw β-keratin gene, retains the ancestral cis motifs for ENSGALG00000023738 and ENSGALG00000023195. Third, most of the feather β-keratin genes were expanded from the basal Chr7 feather β-keratin gene, lost motifs recognized by ancestral scale and claw TFs, and gained new motifs to perform new functions. In summary, feather β-keratin genes underwent chromosome-wise expansions and recruited chromosome-wise regulators. Similar to the evolution of claw and scale β-keratin genes, the transition of feather β-keratin genes from Chr25 to Chr27, or the reverse, involved both retention and gain/loss of TFBSs, as they showed common and unique regulation. The characteristic regulatory control of feather β-keratin genes observed here may be essential to effect their mosaic expression in diverse feather parts, as has been suggested in a previous study (Ng et al. 2014).

Major Regulators of β-Keratin Genes

By a major regulator, we mean a TF that regulates multiple β-keratin genes. Following the annotation of β-keratin regulators, we inferred the major regulators of β-keratin genes in two ways.

Major TFs Predicted by Enrichment Analysis

First, we identified over-represented regulators in each clade of β-keratin genes (fig. 3C; also see supplementary table S9, Supplementary Material online) and found that the clades of Chr2, Chr27, Chr25, claw, and scale β-keratin genes have significantly enriched TFs (P value < 0.001, FDR < 0.0001); clades 2 (having few Chr27 β-keratins) and 5 (having few Chr10 β-keratins), being very small, showed no over-represented TFs. Clade 8, in which all keratinocyte β-keratin genes are clustered, also lacks over-represented TFs, as keratinocyte β-keratin genes are regulated in small clusters together with scale, claw, and feather β-keratin genes. In detail, among the 15 annotated TFs that regulate Chr2 β-keratin genes (supplementary table S8, Supplementary Material online), 4 specific TFs (all activators) of 2 families (C2H2 and Ets) were over-represented in the Chr2 clade. Similarly, five specific TFs (four repressors and one activator) of five families (AT hook, forkhead, Ets, grainyhead, and C2H2 ZF) and five specific TFs (all activators) of four families (AP-2, bHLH, C2H2 ZF, and bZIP) were over-represented in the Chr27 and Chr25 clades, respectively. As mentioned earlier, Chr25 and Chr27 β-keratin genes have common as well as unique TFs, and we found that the five over-represented TFs of the Chr25 clade may be categorized as two unique (ENSGALG00000023607 and ENSGALG00000002393) and three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) common with Chr27 β-keratin genes. This implies that the three common over-represented TFs have more targets on Chr25 and fewer on Chr27 and therefore were not over-represented in the Chr27 clade. Similarly, the five over-represented TFs of the Chr27 clade may be categorized as three unique (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two common (ENSGALG00000019647 and ENSGALG00000001577) with Chr25. This implies that the two over-represented TFs of the Chr27 clade common with Chr25 mostly regulate Chr27 β-keratin genes, along with far fewer Chr25 β-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale β-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) commonly regulates scale and claw β-keratin genes. As this repressor TF was over-represented in the scale clade, it implies that the repressor has more targets among scale β-keratin genes than among claw β-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).
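Over-representation of a TF's targets within a clade is commonly assessed with a one-sided hypergeometric test. The text above does not state which test was used, so the following is only an illustrative sketch of that common choice; the function name and all counts in the example are hypothetical, and multiple-testing correction (the FDR step) is omitted.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N  total number of genes considered (e.g., all annotated targets)
    K  genes among the N that are targets of the TF
    n  genes in the clade being tested
    k  clade genes that are targets of the TF
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical counts: a TF with 12 targets among 91 genes,
# 8 of which fall in a clade of 10 genes.
p_value = hypergeom_upper_tail(k=8, K=12, n=10, N=91)
print(f"P(X >= 8) = {p_value:.3g}")
```

A TF whose targets are spread over two clades can fail this test in the clade where it has fewer targets, which is the situation described above for the regulators shared by Chr25 and Chr27.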

Major TFs Predicted by Differential Expression of β-Keratin Genes and Their pTFs

We conducted differential expression analysis of both β-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs regulating a small cluster of β-keratin genes of two clades, or within a clade, skipped by enrichment analysis, and (3) understand the structural diversity of feather beyond the previous study (Ng et al. 2014). We found that among the five different feather structures (EB, LB, EF, LF, and MF), Chr2 β-keratin genes were up-regulated in LF and MF (log fold change [logFC] ≥ 2.59, P < 0.02) due to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 β-keratin genes were up-regulated in EB and LB (logFC ≥ 1.66, P < 0.02), and Chr27 β-keratin genes were down-regulated in LF (logFC ≤ −2.58, P < 0.02), due to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 β-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25 and Chr27 were not differentially expressed). While a few of the Chr25 and Chr27 differentially expressed pTFs are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774, ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that for Chr25 β-keratin genes, two unique over-represented TFs, along with other unique activator TFs, effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs, together with other unique repressor TFs, effect down-regulation of β-keratin genes in LF, which is the barb- and barbule-less rachis portion of the feather. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7: logFC ≥ 2.48; claw: logFC ≥ 1.87; scale: logFC ≥ 1.37; keratinocytes: logFC ≥ 1.08; P < 0.02) relative to those in LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including enriched regulators of claw and scale β-keratin genes, were differentially expressed with their target genes.

FIG. 4. Heatmap showing dynamic expression of β-keratin genes and their putative regulators. The heatmap is based on the average expression level (normalized) of the genes under each condition. Low to high expression is shown in a gradient scale from blue to red (see the color bar at top). The entries on the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote β-keratin genes on a chromosome or in a subclass. The extreme right entries denote β-keratins (green) corresponding to each chromosome or subclass (shown in top middle) and their corresponding TFs (black). As the expression differences between TF and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate β-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocytes are considered under the particular subclass. About 25 of the 81 pTFs did not show differential expression and are marked by the “” symbol. The over-represented TFs are indicated by “”.

Differential expression of a pTF gene with the target gene supported 262 (out of a total of 372) predicted TF–target gene relationships involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found the following: for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocyte, 16 pTFs/9 target genes; and for feather, 38 pTFs/56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes, although they satisfied |PCC| > 0.8 with their target gene, did not satisfy differential gene expression with their target gene. Based on this robust rule, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided 10 additional major TFs that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade skipped by enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.
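The idea of requiring that a pTF and its target both change between feather parts can be sketched as follows. This is a toy illustration, not the differential expression method used in the study: the log base (2), the pseudocount, the cutoff, and the function names are assumptions, no significance test (the P < 0.02 condition) is performed, and the sketch does not impose any direction rule for activators versus repressors.

```python
from math import log2

def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression in group_a over group_b.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return log2((mean_a + pseudocount) / (mean_b + pseudocount))

def both_differential(tf_lfc, target_lfc, min_abs_lfc=1.0):
    """True if TF and target both change by at least min_abs_lfc (absolute)."""
    return abs(tf_lfc) >= min_abs_lfc and abs(target_lfc) >= min_abs_lfc

# Hypothetical replicate FPKM values in two feather parts.
target_lfc = log2_fold_change([30, 34, 32], [6, 8, 7])
tf_lfc = log2_fold_change([12, 15, 14], [3, 2, 4])
print(both_differential(tf_lfc, target_lfc))
```

A pair that passes the correlation filter but fails this check would correspond to the predictions that were not supported by differential expression.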

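Table 1. Inferred Major Regulators of β-Keratin Genes with Putative Roles.

NOTE. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID | TF family | Activator/repressor | Enriched | Diff. expressed | No. of target genes | Tentative role
ENSGALG00000007056 | C2H2 ZF | activator | Yes | Yes | 5 | Uniquely regulates Chr2 β-keratins
ENSGALG00000011435 | Ets | activator | Yes | Yes | 4 | Uniquely regulates Chr2 β-keratins
ENSGALG00000016927 | C2H2 ZF | activator | Yes | Yes | 4 | Uniquely regulates Chr2 β-keratins
ENSGALG00000009774 | Ets | repressor | Yes | Yes | 15 | Uniquely regulates Chr27 β-keratins
ENSGALG00000011980 | Grainyhead | repressor | Yes | Yes | 12 | Uniquely regulates Chr27 β-keratins
ENSGALG00000000233 | C2H2 ZF | activator | Yes | Yes | 17 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101 | C2H2 ZF | activator | Yes | Yes | 21 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000012277 | bZIP | activator | Yes | Yes | 17 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000008014 | bZIP | activator | No | Yes | 13 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000006210 | bHLH | activator | No | Yes | 4 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000011053 | Homeodomain | activator | No | Yes | 10 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000000836 | RFX | activator | No | Yes | 6 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000023607 | bHLH | activator | Yes | Yes | 10 | Uniquely regulates Chr25 β-keratins
ENSGALG00000002393 | AP-2 | activator | Yes | Yes | 13 | Uniquely regulates Chr25 β-keratins
ENSGALG00000003404 | | activator | No | Yes | 3 | Uniquely regulates Chr25 β-keratins
ENSGALG00000011055 | Homeodomain | activator | No | Yes | 3 | Uniquely regulates Chr25 β-keratins
ENSGALG00000001143 | Ets | activator | No | Yes | 1 | Uniquely regulates Chr10 β-keratins
ENSGALG00000023742 | bHLH | activator | No | Yes | 1 | Uniquely regulates Chr10 β-keratins
ENSGALG00000023738 | Homeodomain | activator | Yes | Yes | 9 | Commonly regulates scale, claw, Chr7 β-keratins
ENSGALG00000023195 | Homeodomain | activator | No | Yes | 6 | Commonly regulates scale, claw, Chr7 β-keratins
ENSGALG00000013342 | Homeodomain | activator | Yes | Yes | 6 | Uniquely regulates claw β-keratins
ENSGALG00000009129 | Homeodomain | activator | Yes | Yes | 4 | Uniquely regulates claw β-keratins
ENSGALG00000007357 | C2H2 ZF | repressor | Yes | Yes | 10 | Commonly regulates scale and claw β-keratins
ENSGALG00000000316 | Homeodomain | activator | Yes | Yes | 5 | Uniquely regulates scale β-keratins
ENSGALG00000008250 | T-box | activator | Yes | Yes | 6 | Uniquely regulates scale β-keratins
ENSGALG00000003759 | Nuclear receptor | repressor | No | Yes | 3 | Uniquely regulates scale β-keratins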

Regulatory Divergence among Beta-Keratin Genes during Bird Evolution. doi:10.1093/molbev/msw165. MBE


Feather β-keratin genes on a chromosome tend to be recent tandem duplicates or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we propose that the flight feather is a modified contour feather in which the distal portions (i.e., MF and LF) have been modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the Electrophoretic Mobility Shift Assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included together with the biotin-labeled probe and an excess of unlabeled probe, a very faint TF–TFBS complex or none at all was observed, indicating that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each predicted category with a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes from the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with combined purified TF and the labeled probe (lane b). Combined purified TF with a labeled probe and a 10-fold excess of an unlabeled probe (lane c) showed quantitative competition of probes for binding. Combined purified TF and a mutated probe (lane d) confirmed the binding specificity of the TF. "Target" denotes the gene with the predicted TFBS in its promoter. "PWM" denotes the sequence logo of the position weight matrix for the binding-site motifs.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developmental conditions of body contour feather and three developmental conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developmental conditions of feather; each condition has 3 biological replicates (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing from the previous annotation (Ng et al. 2014) were manually added to Gallus_gallus.galGal4.77.gtf. Moreover, mapping of reads was also allowed in novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference annotation based transcript (RABT) assembly method using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes in proportion to the genes' relative abundance of uniquely mapped reads.

Genes with FPKM ≥ 1 in at least one of the conditions were considered "expressed" and selected for further analysis. To compare the expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
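The filtering and normalization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the dictionary layout (gene ID mapped to a list of per-sample FPKM values), and the choice to compute the upper quartile over non-zero values with linear interpolation are all assumptions made for the example. The sketch also applies the FPKM threshold per sample column, which is a simplification of "at least one of the conditions".

```python
def upper_quartile(values):
    """75th percentile (linear interpolation) of the non-zero values."""
    nonzero = sorted(v for v in values if v > 0)
    if not nonzero:
        raise ValueError("sample has no non-zero expression values")
    pos = 0.75 * (len(nonzero) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(nonzero) - 1)
    return nonzero[lo] + (nonzero[hi] - nonzero[lo]) * (pos - lo)


def filter_expressed(fpkm, min_fpkm=1.0):
    """Keep genes whose FPKM reaches min_fpkm in at least one sample."""
    return {gene: row for gene, row in fpkm.items() if max(row) >= min_fpkm}


def normalize_to_reference(fpkm, samples, reference):
    """Rescale each sample so its upper quartile matches the reference sample.

    fpkm: {gene_id: [value per sample]}; samples: column names in order;
    reference: name of the reference column (EF3 in the text above).
    """
    columns = {s: [row[i] for row in fpkm.values()] for i, s in enumerate(samples)}
    ref_uq = upper_quartile(columns[reference])
    factors = [ref_uq / upper_quartile(columns[s]) for s in samples]
    return {gene: [v * f for v, f in zip(row, factors)] for gene, row in fpkm.items()}
```

With this layout, one would first call `filter_expressed` on the raw table and then pass the result to `normalize_to_reference` with `"EF3"` as the reference column.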

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset consisting of β-keratin genes and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the "gplots" package of R (Warnes et al. 2015). The hierarchical cluster tree was cut to obtain clusters/groups of co-expressed genes using the "hclust" function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed on April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with false discovery rate (FDR) correction (Benjamini and Hochberg 1995) to identify the functional categories enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered keratin-related, we added the condition that the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) should have the lowest P value and FDR in Fisher's exact test among the terms enriched in that cluster. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
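As an illustration of the enrichment step, the sketch below computes a one-sided Fisher's exact test for over-representation of a GO term in a cluster and then applies the Benjamini-Hochberg adjustment. The text does not state whether a one- or two-sided test was used, so the one-sided form is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def fisher_exact_greater(k, n, K, N):
    """One-sided P(X >= k) under the hypergeometric null.

    k: genes in the cluster annotated with the term
    n: cluster size
    K: genes in the background annotated with the term
    N: background size
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be called enriched in a cluster when both its raw and adjusted p-values fall below 0.05, following the thresholds given above.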

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer the TF–target gene relationship, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed on April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
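The idea of scanning a promoter with a position weight matrix can be sketched as below. This is not FIMO: FIMO converts scores to p-values, whereas this simplified example uses a plain log-odds score cutoff (`min_score`), a uniform background, and a small pseudocount, all of which are assumptions chosen for illustration. The PWM layout (a list of per-position base-probability dictionaries) and the function names are likewise hypothetical.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def log_odds_matrix(pwm, background=0.25, pseudocount=0.01):
    """Convert per-position base probabilities into log2-odds scores."""
    norm = 1 + 4 * pseudocount
    return [
        {b: math.log2((col[b] + pseudocount) / norm / background) for b in BASES}
        for col in pwm
    ]


def scan_promoter(promoter, pwm, min_score):
    """Return (start, strand, score) for windows scoring >= min_score.

    start is always reported in forward-strand coordinates of the promoter.
    """
    lod = log_odds_matrix(pwm)
    width = len(lod)
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            if any(b not in BASES for b in window):
                continue  # skip windows containing N or other symbols
            score = sum(lod[j][b] for j, b in enumerate(window))
            if score >= min_score:
                start = i if strand == "+" else len(seq) - width - i
                hits.append((start, strand, score))
    return hits
```

In practice the promoter string would be the 1,000 bp upstream of the TSS, and the cutoff would be replaced by a p-value threshold as computed by a dedicated tool.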

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is genuine, the gene coding for the TF protein was required to be correlated in expression (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships of β-keratin and some other genes, and putative TF–TFBS interactions.
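The correlation criterion in this second step can be illustrated with a short sketch. The names and data layout are hypothetical, and labeling a positively correlated TF as an "activator" and a negatively correlated one as a "repressor" is an assumption about how the sign of the PCC maps onto the roles reported in the Results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def classify_pairs(candidates, expression, threshold=0.8):
    """Keep (tf, target) pairs with |PCC| above the threshold.

    candidates: iterable of (tf_id, target_id) pairs from the motif scan
    expression: {gene_id: [expression value per sample]}
    Returns (tf, target, role, pcc) tuples.
    """
    kept = []
    for tf, target in candidates:
        r = pearson(expression[tf], expression[target])
        if abs(r) > threshold:
            role = "activator" if r > 0 else "repressor"
            kept.append((tf, target, role, r))
    return kept
```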

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 1e-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region together with 100 bp upstream and downstream of it were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-Keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within another species, rather than by assessing conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as "queries" and turkey β-keratins as "subjects". For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five turkey β-keratin genes (those with the lowest E values) that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the promoters of its five homologous turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and determining whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
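The final conservation rule (same strand, within 200 bp, in at least one of the homologous promoters) can be written down compactly. The sketch below assumes that motif positions in the chicken and turkey promoters have already been placed in a shared coordinate system (for example, columns of the multiple alignment or offsets from the TSS); that assumption, along with the function name and the tuple layout, is illustrative rather than taken from the authors' code.

```python
def is_conserved(chicken_hit, homolog_hits, max_distance=200):
    """Decide whether a chicken TFBS hit is conserved in any homolog.

    chicken_hit: (position, strand) of the motif in the chicken promoter
    homolog_hits: one list of (position, strand) hits per homologous
                  turkey promoter (up to five lists in the text above)
    """
    position, strand = chicken_hit
    return any(
        hit_strand == strand and abs(hit_position - position) <= max_distance
        for hits in homolog_hits
        for hit_position, hit_strand in hits
    )
```

For example, a chicken hit at (-350, "+") would count as conserved if any turkey homolog carried a "+" strand hit between -550 and -150.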

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes that share specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we performed differential expression analysis of the pTF genes and their target genes across the different developmental conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
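The "both differentially expressed at the same time" rule can be sketched as a set intersection over contrasts. The sketch assumes that per-gene results (log fold change and FDR for each pairwise comparison of feather regions) have already been exported from a differential expression tool; the FDR cutoff of 0.05, the contrast labels, and the function names are assumptions for illustration and are not stated in the text.

```python
def de_contrasts(results, max_fdr=0.05):
    """Map each gene to the set of contrasts in which it is called DE.

    results: {gene_id: {contrast_name: (log_fold_change, fdr)}}
    """
    return {
        gene: {name for name, (_, fdr) in by_contrast.items() if fdr < max_fdr}
        for gene, by_contrast in results.items()
    }


def supported_pairs(pairs, calls):
    """Keep (tf, target) pairs that are DE in at least one shared contrast.

    Returns (tf, target, shared_contrasts) tuples; genes absent from
    `calls` are treated as never differentially expressed.
    """
    kept = []
    for tf, target in pairs:
        shared = calls.get(tf, set()) & calls.get(target, set())
        if shared:
            kept.append((tf, target, sorted(shared)))
    return kept
```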

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and for production of the TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized by the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed the computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy Team. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechler M, Magnusson A, Moeller S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R Package Version gplots21707.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci USA. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the lower cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci USA. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.



regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocytes, 18 pTFs/9 target genes; and for feather, 63 pTFs/56 target genes. Interestingly, feather β-keratins were found to be regulated by many repressors. The Venn diagram shows that each of the β-keratin sub-classes (scale, claw, keratinocyte, and feather) has common as well as unique regulators (fig. 2A). As the β-keratin genes expanded in birds by gene duplication, the common and unique regulators found herein may help to reveal their regulatory transitions, as discussed below.

β-Keratin Genes in Turkey and Evolutionary Conservation of Predicted TFBS

We further considered the condition that the TFBS present in the promoter region of a gene has been conserved in the promoter region of its orthologue in turkey. For this purpose, we scanned the genome of turkey (release 5.0) using chicken β-keratin genes as queries. This approach identified 142 β-keratin genes in turkey, which include 14 scale, 12 claw, 2 keratinocyte, and 114 feather β-keratin genes (supplementary table S7 and fig. S5, Supplementary Material online). Thus, chicken and turkey have similar numbers of β-keratin genes. Unfortunately, upstream 1,000-bp promoter sequences could be retrieved from the turkey genome for only 70 of the annotated turkey β-keratin genes, and upstream 200-bp promoter sequences for another 10 genes. We found that 192 predicted TF–target gene pairs (out of 372 pairs) satisfied the conservation criterion (supplementary table S6 and fig. S5, Supplementary Material online). Importantly, a TFBS found in the promoter of a particular sub-class of β-keratin (scale, claw, keratinocyte, or feather) in chicken was conserved in the promoters of the corresponding sub-class of turkey β-keratin genes. This supported the predicted TF–TFBS relationships of β-keratin genes in chicken.

Regulatory Transition of β-Keratin Genes

The pTF–target gene relationships of different sub-classes of β-keratin genes are presented in figure 3A, compiled from supplementary tables S6–S8, Supplementary Material online. We also performed a maximum likelihood phylogenetic reconstruction of β-keratins to understand the regulatory transitions associated with the evolution of β-keratin genes (fig. 3B; see supplementary fig. S6, Supplementary Material online, for an expanded phylogeny). The phylogeny is concordant with previous studies, with scale and claw β-keratins forming the clade basal to feather β-keratins (Greenwold et al. 2014; Zhang et al. 2014).

Figure 3A shows that the basal scale and claw b-keratingenes have common as well as unique regulators SpecificTFs of homeodomain SMAD T-box AP-2 Sox nuclear re-ceptor and C2H2 ZF family uniquely regulate scale b-keratingenes (supplementary table S8 Supplementary Material

[Figure 1A: heatmap with hierarchical clustering. Columns, left to right: LF1, LF2, LF3, EB1, EB2, MF2, MF3, MF1, EF3, EF1, EF2, EB3, LB1, LB2, LB3. Numbered row clusters: 1–9, 11–14, 16, 17, 19, 21, 23–25, 27–36.]

[Figure 1B: table of enriched functions]

Group | Cluster ID | Enriched keratin-related GO terms | Other enriched GO terms
--- | --- | --- | ---
β-Keratin related clusters | 1 | Yes | Wnt signaling pathway, developmental protein, differentiation
β-Keratin related clusters | 2 | Yes | Catalytic activity, hydrolase activity, mitochondrion, protein stabilization
β-Keratin related clusters | 5 | No | Endodermal cell differentiation, ion transport, signal transducer activity
β-Keratin related clusters | 7 | No | Hydrolase activity acting on carbon-nitrogen bonds, transmembrane signaling
β-Keratin related clusters | 9 | No | Circadian rhythm, response to endoplasmic reticulum stress, chemotaxis
β-Keratin related clusters | 17 | Yes | NA
β-Keratin related clusters | 21 | Yes | Microtubule-based process, phosphatidylinositol-mediated signaling, pigmentation
β-Keratin related clusters | 23 | Yes | Protein stabilization, hydrolase activity acting on carbon-nitrogen bond
β-Keratin related clusters | 24 | No | Intracellular protein transport, neuron apoptotic process, cytoplasm
β-Keratin related clusters | 25 | Yes | NA
β-Keratin related clusters | 27 | Yes | NA
β-Keratin related clusters | 28 | Yes | NA
β-Keratin related clusters | 29 | Yes | NA
β-Keratin related clusters | 30 | Yes | NA
β-Keratin related clusters | 31 | No | Extracellular vesicular exosome, fatty acid metabolic process, translation
β-Keratin related clusters | 32 | Yes | NA
β-Keratin related clusters | 33 | Yes | NA
β-Keratin related clusters | 34 | Yes | NA
β-Keratin related clusters | 35 | No | Poly(A) RNA binding, extracellular vesicular exosome
β-Keratin related clusters | 36 | Yes | NA
Other clusters | 3 | NA | Transmembrane transport, sensory perception of pain
Other clusters | 4 | NA | Proteinaceous extracellular matrix, signal transduction, transaminase activity
Other clusters | 6 | NA | Membrane, exocytosis, synaptic vesicle, G-protein coupled receptor activity
Other clusters | 8 | NA | Signal transducer activity, immune system process, locomotion
Other clusters | 11 | NA | Cell adhesion, nucleolus, poly(A) RNA binding, heparin binding
Other clusters | 12 | NA | Protein dimerization activity, organelle, nucleus
Other clusters | 13 | NA | Potassium channel activity, ion channel activity
Other clusters | 14 | NA | Immune response, determination of left/right symmetry, immune system process
Other clusters | 16 | NA | Calcium ion import, phosphatidylinositol-mediated signaling
Other clusters | 19 | NA | Cofactor metabolic process, cell death, DNA helicase activity

FIG. 1. Hierarchical clustering and heatmap of gene-expression level and enriched functional categories in coexpression clusters. (A) The clustering of genes is based on expression similarity of the 2,314 sorted genes in different feather parts. The color bars on the left side denote the clusters obtained after cutting the hierarchical cluster tree at a specific height (see supplementary fig. S2, Supplementary Material online). The heatmap represents the expression pattern of the genes in each cluster. High expression is shown in red and low expression in blue (see the color bar at the top left). The entries below the heatmap denote transcriptome IDs as given in supplementary figure S1, Supplementary Material online, and are explained in the text. The clusters that have significantly enriched GO terms are numbered in the figure. (B) The entries in the table represent the enriched functions corresponding to the significant clusters. The clusters are categorized as “β-keratin related” and “others.” For keratin genes, child GO terms such as “feather,” “scale,” “claw,” “keratinocytes,” and “intermediate filament” are assigned to their parental GO term “keratin.” “NA” denotes “not available.”


online). Similarly, specific TFs of the homeodomain family uniquely regulate claw β-keratin genes. A few TFs of the homeodomain (ENSGALG00000023738, ENSGALG00000023195) and C2H2 ZF (ENSGALG00000016016, ENSGALG00000007357) families commonly regulate scale and claw β-keratin genes. Among the above TFs, two are repressors, one common to scale and claw (ENSGALG00000007357; C2H2 ZF family) and the other (ENSGALG00000003759; nuclear receptor family) unique to scale. As in Greenwold et al. (2014), the Chr7 feather β-keratin ENSGALG00000006133 (named Chr7 in fig. 3A) is basal to all feather β-keratins (fig. 3B). Notably, this gene is regulated by two TFs of the homeodomain family (ENSGALG00000023738, ENSGALG00000023195) that it shares with ancestral scale and claw β-keratin genes. ENSGALG00000023738 and ENSGALG00000023195 are members of the Iroquois (IRX-6) and Distal-less (DLX-6) homeobox 6 protein families, respectively, both having important roles in pattern formation, limb development, and appendage growth in vertebrates (Ogura et al. 2001; Velinov et al. 2012). ENSGALG00000007357 and ENSGALG00000003759 are B-cell lymphoma 6 (BCL6) and RAR-related orphan receptor A (RORALPHA1) proteins,

[Figure 2: Venn diagrams of common and unique regulators among the four sub-classes of β-keratin genes; the central label is IRX6 (Iroquois-class homeodomain protein). The counts shown in the two panels are tabulated below.]

Sub-class | Target genes | TFs, overall predictions (A) | TFs, supported by differential expression (B)
--- | --- | --- | ---
All β-keratin genes | 91 | 81 (51A, 28R, 2A-R) | 56 (42A, 12R, 2A-R)
Scale | 17 | 13 (11A, 2R) | 13 (11A, 2R)
Claw | 9 | 7 (6A, 1R) | 7 (6A, 1R)
Feather | 56 | 63 (36A, 27R) | 38 (27A, 11R)
Keratinocyte | 9 | 18 (16A, 2R) | 16 (14A, 2R)

FIG. 2. Summary of annotated TF–target gene relationships. (A) Overall predictions and (B) predictions supported by differential expression of pTF–target gene pairs. The figure summarizes the number of annotated TFs and their corresponding numbers of target β-keratin genes. The annotated TFs are further classified according to the sub-class (scale, claw, keratinocyte, and feather) of the β-keratin genes they regulate. The entries “A” and “R” denote “activator” and “repressor,” respectively. The Venn diagram shows the numbers of common and unique regulators among the four sub-classes of β-keratins. The Iroquois-class homeodomain protein (IRX6) was found to be the sole TF that commonly regulates all four sub-classes of β-keratin genes.

[Figure 3A, B: regulator–target network and phylogenetic tree. Recoverable labels: TF, activator, repressor; claw, keratinocyte, scale, feather; Chr25, 27; Chr2; Chr27; Chr10; Chr7.]

[Figure 3C: enriched regulators in each clade. In the original figure, red denotes a repressor and blue an activator.]

Clade 1 (Chr2): ENSGALG00000007056 (C2H2), ENSGALG00000011435 (Ets), ENSGALG00000016927 (C2H2), ENSGALG00000014696 (C2H2)
Clade 3 (Chr27): ENSGALG00000008403 (AT hook), ENSGALG00000001577 (Forkhead), ENSGALG00000009774 (Ets), ENSGALG00000011980 (Grainyhead), ENSGALG00000019647 (C2H2 ZF)
Clade 4 (Chr25): ENSGALG00000002393 (AP-2), ENSGALG00000023607 (bHLH), ENSGALG00000000233 (C2H2 ZF), ENSGALG00000010101 (C2H2 ZF), ENSGALG00000012277 (bZIP)
Clade 6 (Claw): ENSGALG00000023738 (HD), ENSGALG00000013342 (HD), ENSGALG00000009129 (HD)
Clade 7 (Scale): ENSGALG00000007357 (C2H2 ZF), ENSGALG00000000316 (HD), ENSGALG00000008250 (T-box)

FIG. 3. Regulator–target gene relationships, phylogenetic tree of β-keratin genes, and chromosome/sub-class-wise enriched regulators of β-keratin genes. (A) Regulators (TFs) and their target genes. A specific color is used to denote each sub-class of target genes (scale, claw, feather, or keratinocyte). TFs are denoted in red, and activators (gray arrows) and repressors (red T shapes) are shown in conventional notation. Most feather β-keratin genes have chromosome-specific regulators, and only the Chr7 β-keratin gene shares a regulator with scale and claw β-keratin genes. (B) Maximum likelihood phylogeny of the annotated β-keratin genes. Scale and claw β-keratin genes are basal to feather β-keratin genes, whereas the Chr7 β-keratin gene is basal to all feather β-keratin genes. The clustering pattern of β-keratin genes is consistent with the regulation pattern. (C) Enriched regulators of each clade of β-keratin genes identified by phylogenetic analysis. The enrichment is based on finding over-represented regulators in each clade and was done to infer the major regulators of each clade, that is, those that regulate multiple β-keratin genes.


respectively. BCL6 has been reported to function as a repressor for germinal center formation and to influence apoptosis (Huynh et al. 2000). RORALPHA1 has also been reported as a repressor and is involved in organogenesis, differentiation, and circadian rhythm in vertebrates (Xiao et al. 2015). This implies that the TFs taking part in the structural framework of ancestral scale and claw, as well as proto-feathers, may have been recruited from related networks regulating pattern formation, limb development, appendage growth, and organogenesis.

The remaining feather β-keratin genes showed consistent genomic organization, clustering pattern, and regulation, distinct from the ancestral scale and claw β-keratin genes. For example, all Chr2 β-keratins are clustered together as clade 1 and are uniquely regulated by TFs of eight families (Ets, E2F, C2H2 ZF, nuclear receptor, bZIP, bHLH, homeodomain-POU, and forkhead). The β-keratin genes of the two biggest loci, Chr27 and Chr25, formed two clades (clades 2 and 3) and one clade (clade 4), respectively. Chr27 and Chr25 β-keratin genes have common as well as unique regulators. Specific TFs of six families (bHLH, RFX, homeodomain, C2H2 ZF, bZIP, and forkhead) commonly regulate Chr25 and Chr27 β-keratin genes. TFs of five families (bHLH, homeodomain, AP-2, E2F, and C2H2 ZF) uniquely regulate Chr25 β-keratin genes. TFs of 15 families (ARID/BRIGHT, Ets, Rel, C2H2 ZF, bZIP, grainyhead, AT hook, homeodomain, bHLH, HSF, TBP, SMAD, nuclear receptor, DP/E2F, and forkhead) uniquely regulate Chr27 β-keratin genes. Notably, many of the Chr27 unique regulators are repressors. All Chr10 β-keratin genes were clustered as clade 5, although our approach annotated only one Chr10 β-keratin gene, which is regulated by specific TFs of two families (Ets and bHLH). Lastly, keratinocyte β-keratin genes, which were clustered together as clade 8, were found to have springy regulation, sharing regulators with scale, claw, and feather β-keratin genes. This is consistent with their diverse roles related to scale, claw, and feather (Greenwold et al. 2014). It is important to note that the Animal Transcription Factor Database annotates at least 858 chicken genes as TFs (Zhang et al. 2012), whereas the CIS-BP database (Weirauch et al. 2014) has known or predicted TF–TFBS relationships for only 496 chicken TFs; TF–TFBS information is lacking for the remaining TFs. Fortunately, the TF–target gene relationships we obtained include representatives of all genomic loci (except one feather β-keratin on Chr1 and one jun-transformed β-keratin on Chr6) and of all sub-classes, which allows us to understand the characteristic regulation of β-keratin genes.

The typical regulation of β-keratin genes observed above can be explained by the following regulatory transition events. First, the transition from scale to claw, or vice versa, involved both retention and gain/loss of TFBSs. The motifs corresponding to the common TFs are cases of retention; notably, the common repressor motif for ENSGALG00000007357 (C2H2 ZF family) might be important for repressing the expression of scale and claw β-keratin genes in feather. The unique repressor motif of scale β-keratins, corresponding to ENSGALG00000003759 (nuclear receptor family), might be a gain or a loss event, depending on whether the scale or the claw β-keratins were ancestral, and might repress scale β-keratin gene expression in claw. Second, the Chr7 β-keratin gene, which is the first locus where a feather β-keratin evolved from a scale or claw β-keratin gene, retains the ancestral cis motifs for ENSGALG00000023738 and ENSGALG00000023195. Third, most feather β-keratin genes were expanded from the basal Chr7 feather β-keratin gene, lost motifs recognized by ancestral scale and claw TFs, and gained new motifs to perform new functions. In summary, feather β-keratin genes underwent chromosome-wise expansions and recruited chromosome-wise regulators. Similar to the evolution of claw and scale β-keratin genes, the transition of feather β-keratin genes from Chr25 to Chr27, or the reverse, involved both retention and gain/loss of TFBSs, as these genes show both common and unique regulation. The characteristic regulatory control of feather β-keratin genes observed here may be essential to effect their mosaic expression in diverse feather parts, as has been suggested in a previous study (Ng et al. 2014).

Major Regulators of β-Keratin Genes

By a major regulator we mean a TF that regulates multiple β-keratin genes. Following the annotation of β-keratin regulators, we inferred the major regulators of β-keratin genes in two ways.

Major TFs Predicted by Enrichment Analysis

First, we identified over-represented regulators in each clade of β-keratin genes (fig. 3C; also see supplementary table S9, Supplementary Material online) and found that the clades of Chr2, Chr27, Chr25, claw, and scale β-keratin genes have significantly enriched TFs (P value < 0.001, FDR < 0.0001); clades 2 (having few Chr27 β-keratins) and 5 (having few Chr10 β-keratins), being very small, showed no over-represented TFs. Clade 8, in which all keratinocyte β-keratin genes are clustered, also lacks over-represented TFs, as keratinocyte β-keratin genes are regulated in small clusters together with scale, claw, and feather β-keratin genes. In detail, among the 15 annotated TFs that regulate Chr2 β-keratin genes (supplementary table S8, Supplementary Material online), 4 specific TFs (all activators) of 2 families (C2H2 and Ets) were over-represented in the Chr2 clade. Similarly, five specific TFs (four repressors and one activator) of five families (AT hook, forkhead, Ets, grainyhead, and C2H2 ZF) and five specific TFs (all activators) of four families (AP-2, bHLH, C2H2 ZF, and bZIP) were over-represented in the Chr27 and Chr25 clades, respectively. As mentioned earlier, Chr25 and Chr27 β-keratin genes have common as well as unique TFs, and we found that the five over-represented TFs of the Chr25 clade may be categorized as two unique (ENSGALG00000023607 and ENSGALG00000002393) and three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) common with Chr27 β-keratin genes. This implies that the three common over-represented TFs have more targets on Chr25 and fewer on Chr27 and therefore were not over-represented in the Chr27 clade. Similarly, the five


over-represented TFs of the Chr27 clade may be categorized as three unique (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two common (ENSGALG00000019647 and ENSGALG00000001577) with Chr25. This implies that the two over-represented TFs of the Chr27 clade that are common with Chr25 mostly regulate Chr27 β-keratin genes, along with far fewer Chr25 β-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale β-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) commonly regulates scale and claw β-keratin genes. As this repressor was over-represented in the scale clade, it has more targets among scale β-keratin genes than among claw β-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).

Major TFs Predicted by Differential Expression of β-Keratin Genes and Their pTFs

We conducted differential expression analysis of both β-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs that regulate a small cluster of β-keratin genes of two clades or within a clade and were skipped by enrichment analysis, and (3) understand the structural diversity of feather beyond the previous study (Ng et al. 2014). We found that among the five different feather structures (EB, LB, EF, LF, and MF), Chr2 β-keratin genes were up-regulated in LF and MF (log fold change (FC) ≥ 2.59, P < 0.02) owing to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 β-keratin genes were up-regulated in EB and LB (logFC ≥ 1.66, P < 0.02) and Chr27 β-keratin genes were down-regulated in LF (logFC ≤ −2.58, P < 0.02) owing to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 β-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25

FIG. 4. Heatmap showing dynamic expression of β-keratin genes and their putative regulators. The heatmap is based on the average (normalized) expression level of the genes under each condition. Low to high expression is shown on a gradient scale from blue to red (see the color bar at the top). The entries at the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in supplementary figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote β-keratin genes on a chromosome or in a subclass. The entries at the extreme right denote β-keratins (green) corresponding to each chromosome or subclass (shown in the top middle) and their corresponding TFs (black). As the expression differences of TFs and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate β-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocyte β-keratin genes are considered under that subclass. About 25 of the 81 pTFs did not show differential expression and are marked by a symbol. The over-represented TFs are also indicated by a symbol.


and Chr27 were not differentially expressed). While a few of the differentially expressed pTFs of Chr25 and Chr27 are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774 and ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that, for Chr25 β-keratin genes, two unique over-represented TFs along with other unique activator TFs effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs together with other unique repressor TFs effect down-regulation of β-keratin genes in LF, which is the barb- and barbule-less rachis portion of the feather. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7, logFC ≥ 2.48; claw, logFC ≥ 1.87; scale, logFC ≥ 1.37; keratinocytes, logFC ≥ 1.08; P < 0.02) relative to LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including the enriched regulators of claw and scale β-keratin genes, were differentially expressed together with their target genes.

Differential expression of a pTF gene together with its target gene supported 262 (out of a total of 372) predicted TF–target gene relationships, involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found that for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocyte, 16 pTFs/9 target genes; and for feather, 38 pTFs/56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes satisfied |PCC| > 0.8 with their target genes but were not differentially expressed together with their target genes. Based on this more stringent rule, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided 10 additional major TFs that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade and were skipped by enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential expression analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted here are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.

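Table 1. Inferred Major Regulators of β-Keratin Genes with Putative Roles. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID | TF family | Activator/repressor | Enriched | Diff. expressed | No. of target genes | Tentative role
--- | --- | --- | --- | --- | --- | ---
ENSGALG00000007056 | C2H2 ZF | activator | Yes | Yes | 5 | Uniquely regulates Chr2 β-keratins
ENSGALG00000011435 | Ets | activator | Yes | Yes | 4 | Uniquely regulates Chr2 β-keratins
ENSGALG00000016927 | C2H2 ZF | activator | Yes | Yes | 4 | Uniquely regulates Chr2 β-keratins
ENSGALG00000009774 | Ets | repressor | Yes | Yes | 15 | Uniquely regulates Chr27 β-keratins
ENSGALG00000011980 | Grainyhead | repressor | Yes | Yes | 12 | Uniquely regulates Chr27 β-keratins
ENSGALG00000000233 | C2H2 ZF | activator | Yes | Yes | 17 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101 | C2H2 ZF | activator | Yes | Yes | 21 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000012277 | bZIP | activator | Yes | Yes | 17 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000008014 | bZIP | activator | No | Yes | 13 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000006210 | bHLH | activator | No | Yes | 4 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000011053 | Homeodomain | activator | No | Yes | 10 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000000836 | RFX | activator | No | Yes | 6 | Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000023607 | bHLH | activator | Yes | Yes | 10 | Uniquely regulates Chr25 β-keratins
ENSGALG00000002393 | AP-2 | activator | Yes | Yes | 13 | Uniquely regulates Chr25 β-keratins
ENSGALG00000003404 | - | activator | No | Yes | 3 | Uniquely regulates Chr25 β-keratins
ENSGALG00000011055 | Homeodomain | activator | No | Yes | 3 | Uniquely regulates Chr25 β-keratins
ENSGALG00000001143 | Ets | activator | No | Yes | 1 | Uniquely regulates Chr10 β-keratins
ENSGALG00000023742 | bHLH | activator | No | Yes | 1 | Uniquely regulates Chr10 β-keratins
ENSGALG00000023738 | Homeodomain | activator | Yes | Yes | 9 | Commonly regulates scale, claw, and Chr7 β-keratins
ENSGALG00000023195 | Homeodomain | activator | No | Yes | 6 | Commonly regulates scale, claw, and Chr7 β-keratins
ENSGALG00000013342 | Homeodomain | activator | Yes | Yes | 6 | Uniquely regulates claw β-keratins
ENSGALG00000009129 | Homeodomain | activator | Yes | Yes | 4 | Uniquely regulates claw β-keratins
ENSGALG00000007357 | C2H2 ZF | repressor | Yes | Yes | 10 | Commonly regulates scale and claw β-keratins
ENSGALG00000000316 | Homeodomain | activator | Yes | Yes | 5 | Uniquely regulates scale β-keratins
ENSGALG00000008250 | T-box | activator | Yes | Yes | 6 | Uniquely regulates scale β-keratins
ENSGALG00000003759 | Nuclear receptor | repressor | No | Yes | 3 | Uniquely regulates scale β-keratins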


Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we hypothesize that the flight feather is a modified contour feather in which the distal portions (i.e., MF and LF) were modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the Electrophoretic Mobility Shift Assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included in the binding reaction along with the biotin-labeled probe and an excess of unlabeled probe, a very faint TF–TFBS complex or none was observed, indicating that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included in the binding reaction along with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories having a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes from the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with purified TF combined with the labeled probe (lane b). Purified TF combined with a labeled probe and an excess (10-fold) of unlabeled probe (lane c) showed quantitative competition of the probes for binding. Purified TF combined with a mutated probe (lane d) confirmed the binding specificity of the TF. “Target” denotes the gene with the predicted TFBS in its promoter. “PWM” denotes the sequence logo of the position weight matrix for the binding-site motifs.


originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs that regulate 91 β-keratin genes. Among the 81 pTFs, 56 showed differential expression together with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developmental conditions of body contour feather and three developmental conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developmental conditions of feather; each condition has 3 biological replicates (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus, galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing in the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, mapping of reads was also allowed in novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation Gallus_gallus.galGal4.77.gtf. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) approach using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
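The FPKM calculation and the redistribution of multiple-hit reads described above can be illustrated with a minimal sketch. It is not the Cufflinks implementation; the function names, the input layout, and the equal split used when none of the candidate genes has uniquely mapped reads are assumptions made for illustration.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million mapped fragments)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def redistribute_multihits(unique_counts, multihit_groups):
    """Split each multi-hit read among its candidate genes in proportion
    to the uniquely mapped reads of those genes.

    unique_counts: dict gene -> number of uniquely mapped reads
    multihit_groups: iterable of gene tuples, one tuple per multi-hit read
    """
    counts = {gene: float(n) for gene, n in unique_counts.items()}
    for genes in multihit_groups:
        total = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            if total > 0:
                share = unique_counts.get(g, 0) / total
            else:
                # Assumption: equal split when no gene has unique evidence.
                share = 1.0 / len(genes)
            counts[g] = counts.get(g, 0.0) + share
    return counts
```

For example, a read that maps to both gene A (30 unique reads) and gene B (10 unique reads) contributes 0.75 of a read to A and 0.25 to B.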

Genes with FPKM ≥ 1 in at least one of the conditions were considered “expressed” and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
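As a rough illustration of the expression filter and the upper quartile normalization, the sketch below scales every sample so that its upper quartile matches that of a reference sample (EF3 in this study). Computing the quartile over nonzero values with linear interpolation, and applying the FPKM threshold per sample rather than per condition, are simplifying assumptions; all function names are hypothetical.

```python
def upper_quartile(values):
    """75th percentile of the nonzero values (linear interpolation)."""
    nonzero = sorted(v for v in values if v > 0)
    if not nonzero:
        raise ValueError("no nonzero expression values")
    pos = 0.75 * (len(nonzero) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(nonzero) - 1)
    return nonzero[lo] + (pos - lo) * (nonzero[hi] - nonzero[lo])


def normalize_to_reference(samples, reference):
    """Rescale each sample so its upper quartile equals the reference's.

    samples: dict sample name -> dict gene -> FPKM
    reference: name of the reference sample (e.g. "EF3")
    """
    ref_uq = upper_quartile(samples[reference].values())
    normalized = {}
    for name, profile in samples.items():
        scale = ref_uq / upper_quartile(profile.values())
        normalized[name] = {gene: value * scale for gene, value in profile.items()}
    return normalized


def expressed_genes(samples, threshold=1.0):
    """Genes reaching the FPKM threshold in at least one sample."""
    genes = set()
    for profile in samples.values():
        genes.update(g for g, v in profile.items() if v >= threshold)
    return genes
```

With this scaling, a sample sequenced twice as deeply as the reference but otherwise identical is mapped back onto the reference profile.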

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset of β-keratin and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the “gplots” package of R (Warnes et al. 2015). The hierarchical cluster tree was cut to obtain clusters/groups of co-expressed genes using the “hclust” function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher’s exact test with false discovery rate (FDR) correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered likely related to keratin, we added the condition that the genes in the cluster should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher’s exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
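The FDR values quoted alongside the P values follow the Benjamini and Hochberg (1995) procedure. A minimal sketch of the standard step-up adjustment is given below; it is a generic implementation, not the authors' script, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A GO term would then be reported as enriched in a cluster when both its raw P value and its adjusted value fall below the chosen cutoffs (0.05 for both in the analysis above).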

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer TF–target gene relationships, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter region


of a set of strongly coexpressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each coexpressed module, Fisher’s exact test was used (FDR < 1e-5).
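The motif enrichment test can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The choice of background (all genes with scanned promoters) and the argument names below are assumptions; the text does not specify how the contingency table was built.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test P value, P(X >= k).

    k: module genes whose promoter contains the motif
    n: genes in the module
    K: background genes whose promoter contains the motif
    N: genes in the background (assumed to include the module)
    """
    if not (0 <= k <= n <= N and k <= K <= N):
        raise ValueError("inconsistent counts")
    denom = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / denom
```

The resulting P values, one per motif and module, would then be corrected for multiple testing before applying the FDR cutoff.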

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is a true binding site, the gene encoding the TF protein was required to be correlated in expression (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships for β-keratin and some other genes, as well as putative TF–TFBS interactions.
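The correlation criterion can be sketched as follows. The Pearson coefficient is standard; labeling a strongly positive correlation as "activator" and a strongly negative one as "repressor" is an assumption made here for illustration, since this section states only the |PCC| > 0.8 threshold. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant profile: correlation undefined")
    return cov / sqrt(var_x * var_y)


def classify_pair(tf_profile, target_profile, threshold=0.8):
    """Label a TF-target pair by the sign of a strong correlation.

    Returns "activator", "repressor", or None when |PCC| <= threshold.
    The sign-to-label mapping is an illustrative assumption.
    """
    r = pearson(tf_profile, target_profile)
    if r > threshold:
        return "activator"
    if r < -threshold:
        return "repressor"
    return None
```

In practice the profiles would be the normalized expression values of the TF gene and the candidate target gene across the 15 transcriptomes.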

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken b-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the b-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed b-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 10^-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. b-keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property is useful for assessing the conservation of a b-keratin motif by finding a group of orthologous (homologous) b-keratin genes within another species, rather than assessing the conservation among several species. Therefore, we found homologous relationships of chicken b-keratin genes in turkey using BLASTP, with chicken b-keratins as "queries" and turkey b-keratins as "subjects." For each chicken b-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey b-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on lowest E value) turkey b-keratin genes that showed similarity with a chicken b-keratin gene. The conservation of a TFBS was tested by aligning the chicken b-keratin promoter and the promoters of its five homologous turkey b-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and checking whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
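The final conservation rule can be expressed as a small check. This sketch assumes that motif hits have already been obtained (for example from a FIMO scan) and mapped to alignment coordinates; the function name, the tuple layout `(strand, position)`, and treating exactly 200 bp as "within" are all assumptions.

```python
def is_conserved(chicken_hit, homolog_hits, max_distance=200):
    """Return True if at least one homologous promoter has a hit for the
    same motif on the same strand within `max_distance` bp of the chicken hit.

    chicken_hit  -- (strand, position) of the motif in the chicken promoter
    homolog_hits -- one list of (strand, position) hits per turkey promoter
    """
    strand, position = chicken_hit
    for hits in homolog_hits:
        for other_strand, other_position in hits:
            same_strand = other_strand == strand
            close_enough = abs(other_position - position) <= max_distance
            if same_strand and close_enough:
                return True
    return False

# Toy hits in five homologous promoters (invented positions).
conserved = is_conserved(
    ("+", 350),
    [[("-", 340)], [("+", 520)], [], [], []],
)
not_conserved = is_conserved(
    ("+", 350),
    [[("-", 340)], [("+", 600)], [], [], []],
)
```

In the first toy call the second homolog carries a same-strand hit 170 bp away, so the motif counts as conserved; in the second call the nearest same-strand hit is 250 bp away.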

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes having common specific regulators distinct from those of other groups.

The phylogenetic analysis of the b-keratin genes was done using the maximum likelihood method in MEGA-7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA-7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of b-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we did differential expression analysis of the pTF genes and their target genes under different developmental conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol in a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and production of TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA assay between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is a biotin-labeled mutated probe designed by modifying conserved nucleotides, using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized by the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate – a practical and powerful approach to multiple testing. J R Stat Soc B Methodol 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (alpha) and beta (beta) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (beta) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R Package Version gplots21707.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci U S A 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the Lower Cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci U S A 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res 40:D144–D149.


online). Similarly, specific TFs of the homeodomain family uniquely regulate claw b-keratin genes. A few TFs of the homeodomain (ENSGALG00000023738, ENSGALG00000023195) and C2H2 ZF (ENSGALG00000016016, ENSGALG00000007357) families commonly regulate scale and claw b-keratin genes. Among the above TFs, two are repressors, one being common (ENSGALG00000007357, C2H2 ZF family) and the other (ENSGALG00000003759, nuclear receptor family) unique to scale. As in Greenwold et al. (2014), the Chr7 feather b-keratin gene ENSGALG00000006133 (named Chr7 in fig. 3A) is basal to all feather b-keratins (fig. 3B). Notably, this gene is regulated by two TFs of the homeodomain family (ENSGALG00000023738, ENSGALG00000023195) in common with ancestral scale and claw b-keratin genes. ENSGALG00000023738 and ENSGALG00000023195 are members of the Iroquois (IRX-6) and Distal-less (DLX-6) homeobox 6 protein families, respectively, both having important roles in pattern formation, limb development, and appendage growth in vertebrates (Ogura et al. 2001; Velinov et al. 2012). ENSGALG00000007357 and ENSGALG00000003759 are B-cell lymphoma 6 (BCL6) and RAR-related orphan receptor A (RORALPHA1) proteins, respectively. BCL6 has been reported to function as a repressor for germinal center formation and to influence apoptosis (Huynh et al. 2000). RORALPHA1 has also been reported as a repressor and is involved in organogenesis, differentiation, and circadian rhythm in vertebrates (Xiao et al. 2015). This implies that the TFs taking part in the structural framework of ancestral scale and claw, as well as proto-feathers, may have been recruited from related networks regulating pattern formation, limb development, appendage growth, and organogenesis.

FIG. 2. Summary of annotated TF–target gene relationships. (A) Overall predictions and (B) predictions supported by differential expression of pTF–target genes. The figure summarizes the number of annotated TFs and their corresponding numbers of target b-keratin genes. The annotated TFs are further classified according to the sub-class (scale, claw, keratinocyte, and feather) of the b-keratin genes they regulate. The entries "A" and "R" denote "activator" and "repressor," respectively. The Venn diagram shows the numbers of common and unique regulators among the four sub-classes of b-keratins. The Iroquois-class homeodomain protein (IRX6) was found to be the sole TF that commonly regulates all four sub-classes of b-keratin genes.

Panel values. (A) All b-keratin genes: 91 genes, 81 TFs (51A, 28R, 2A-R); scale: 17 genes, 13 TFs (11A, 2R); claw: 9 genes, 7 TFs (6A, 1R); feather: 56 genes, 63 TFs (36A, 27R); keratinocyte: 9 genes, 18 TFs (16A, 2R). (B) All b-keratin genes: 91 genes, 56 TFs (42A, 12R, 2A-R); scale: 17 genes, 13 TFs (11A, 2R); claw: 9 genes, 7 TFs (6A, 1R); feather: 56 genes, 38 TFs (27A, 11R); keratinocyte: 9 genes, 16 TFs (14A, 2R).

FIG. 3. Regulator–target gene relationships, phylogenetic tree of b-keratin genes, and chromosome/sub-class-wise enriched regulators of b-keratin genes. (A) Regulators (TFs) and their target genes. A specific color is used to denote a sub-class of target genes (scale, claw, feather, or keratinocyte). TFs are denoted by red color, and activators (gray arrows) and repressors (red T shapes) are shown in conventional notation. Most feather b-keratin genes have chromosome-wise specific regulators, and only the Chr7 b-keratin gene shares a regulator with scale and claw b-keratin genes. (B) Maximum likelihood phylogeny of the annotated b-keratin genes. Scale and claw b-keratin genes are basal to feather b-keratin genes, whereas the Chr7 b-keratin gene is basal to all feather b-keratin genes. The clustering pattern of b-keratin genes is consistent with the regulation pattern. (C) Enriched regulators of each clade of b-keratin genes identified by phylogenetic analysis (red, repressor; blue, activator). The enrichment is based on finding over-represented regulators in each clade and is done to infer the major regulators of each clade, i.e., those that regulate multiple b-keratin genes.

Panel C entries. Clade 1 (Chr2): ENSGALG00000007056 (C2H2), ENSGALG00000011435 (Ets), ENSGALG00000016927 (C2H2), ENSGALG00000014696 (C2H2). Clade 3 (Chr27): ENSGALG00000008403 (AT hook), ENSGALG00000001577 (Forkhead), ENSGALG00000009774 (Ets), ENSGALG00000011980 (Grainyhead), ENSGALG00000019647 (C2H2 ZF). Clade 4 (Chr25): ENSGALG00000002393 (AP-2), ENSGALG00000023607 (bHLH), ENSGALG00000000233 (C2H2 ZF), ENSGALG00000010101 (C2H2 ZF), ENSGALG00000012277 (bZIP). Clade 6 (Claw): ENSGALG00000023738 (HD), ENSGALG00000013342 (HD), ENSGALG00000009129 (HD). Clade 7 (Scale): ENSGALG00000007357 (C2H2 ZF), ENSGALG00000000316 (HD), ENSGALG00000008250 (T-box).

The remaining feather b-keratin genes showed a consistent genomic organization, clustering pattern, and regulation, distinct from the ancestral scale and claw b-keratin genes. For example, all Chr2 b-keratins are clustered together as clade-1 and are uniquely regulated by TFs of eight families (Ets, E2F, C2H2 ZF, nuclear receptor, bZIP, bHLH, homeodomain-POU, and forkhead). The b-keratin genes of the two biggest loci, Chr27 and Chr25, formed two clades (clade-2 and clade-3) and one clade (clade-4), respectively. Chr27 and Chr25 b-keratin genes have common as well as unique regulators. Specific TFs of six families (bHLH, RFX, homeodomain, C2H2 ZF, bZIP, and forkhead) commonly regulate Chr25 and Chr27 b-keratin genes. TFs of five families (bHLH, homeodomain, AP-2, E2F, and C2H2 ZF) uniquely regulate Chr25 b-keratin genes. TFs of 15 families (ARID/BRIGHT, Ets, Rel, C2H2 ZF, bZIP, grainyhead, AT hook, homeodomain, bHLH, HSF, TBP, SMAD, nuclear receptor, DP/E2F, and forkhead) uniquely regulate Chr27 b-keratin genes. Notably, many of the Chr27 unique regulators are repressors. All Chr10 b-keratin genes were clustered as clade-5, although our approach annotated only one Chr10 b-keratin gene as being regulated by specific TFs of two families (Ets and bHLH). Lastly, keratinocyte b-keratin genes, which were clustered together as clade-8, were found to share scattered regulation with scale, claw, and feather b-keratin genes. This is consistent with their diverse roles related to scale, claw, and feather (Greenwold et al. 2014). It is important to note that the Animal Transcription Factor Database annotated at least 858 chicken genes as TFs (Zhang et al. 2012), whereas the CIS-BP database (Weirauch et al. 2014) has known or predicted TF–TFBS relationships for only 496 chicken TFs, with a paucity of TF–TFBS information for the remaining TFs. Fortunately, the TF–target gene relationships we have obtained include representatives of all genomic loci (except one feather b-keratin of Chr1 and one jun-transformed b-keratin of Chr6) and sub-classes, allowing us to examine the characteristic regulation of b-keratin genes.

The typical regulation of b-keratin genes observed above can be explained by the following innovative regulatory transition events. First, the transition from scale to claw, or vice versa, involved both retention and gain/loss of TFBSs. The motifs corresponding to the common TFs are cases of retention; notably, the common repressor motif for ENSGALG00000007357 (C2H2 ZF family) might be important for repressing the expression of scale and claw b-keratin genes in feather. The unique repressor motif of scale b-keratins, corresponding to ENSGALG00000003759 (nuclear receptor family), might be a gain or a loss event, depending on whether the scale or the claw b-keratins were ancestral, and might repress scale b-keratin gene expression in claw. Second, the Chr7 b-keratin gene, which is the first locus where a feather b-keratin evolved from a scale or claw b-keratin gene, retains the ancestral cis motifs for ENSGALG00000023738 and ENSGALG00000023195. Third, most of the feather b-keratin genes were expanded from the basal Chr7 feather b-keratin gene, lost motifs recognized by ancestral scale and claw TFs, and gained new motifs to perform new functions. In summary, feather b-keratin genes underwent chromosome-wise expansions and recruited chromosome-wise regulators. Similar to the evolution of claw and scale b-keratin genes, the transition of feather b-keratin genes from Chr25 to Chr27, or the reverse, involved both retention and gain/loss of TFBSs, as they showed common and unique regulation. The characteristic regulatory control of feather b-keratin genes observed here may be essential for their mosaic expression in diverse feather parts, as has been suggested in a previous study (Ng et al. 2014).

Major Regulators of b-Keratin Genes

By a major regulator, we mean a TF that regulates multiple b-keratin genes. Following the annotation of b-keratin regulators, we inferred the major regulators of b-keratin genes in two ways.

Major TFs Predicted by Enrichment Analysis

First, we identified over-represented regulators in each clade of b-keratin genes (fig. 3C; also see supplementary table S9, Supplementary Material online) and found that the clades of Chr2, Chr27, Chr25, claw, and scale b-keratin genes have significantly enriched TFs (P value < 0.001, FDR < 0.0001); clades 2 (having few Chr27 b-keratins) and 5 (having few Chr10 b-keratins), being very small, showed no over-represented TFs. Clade 8, in which all keratinocyte b-keratin genes are clustered, also lacks over-represented TFs, as keratinocyte b-keratin genes are regulated in small clusters together with scale, claw, and feather b-keratin genes. In detail, among the 15 annotated TFs that regulate Chr2 b-keratin genes (supplementary table S8, Supplementary Material online), 4 specific TFs (all activators) of 2 families (C2H2 and Ets) were over-represented in the Chr2 clade. Similarly, five specific TFs (four repressors and one activator) of five families (AT hook, forkhead, Ets, grainyhead, and C2H2 ZF) and five specific TFs (all activators) of four families (AP-2, bHLH, C2H2 ZF, and bZIP) were over-represented in the Chr27 and Chr25 clades, respectively. As mentioned earlier, Chr25 and Chr27 b-keratin genes have common as well as unique TFs, and we found that the five over-represented TFs of the Chr25 clade may be categorized as two unique (ENSGALG00000023607 and ENSGALG00000002393) and three common with Chr27 b-keratin genes (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277). This implies that the three common over-represented TFs have more targets on Chr25 and fewer on Chr27 and therefore were not over-represented in the Chr27 clade. Similarly, the five over-represented TFs of the Chr27 clade may be categorized as three unique (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two common with Chr25 (ENSGALG00000019647 and ENSGALG00000001577). This implies that the two over-represented TFs of the Chr27 clade that are common with Chr25 mostly regulate Chr27 b-keratin genes, along with far fewer Chr25 b-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale b-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) commonly regulates scale and claw b-keratin genes. As this repressor was over-represented in the scale clade, it likely has more targets among scale b-keratin genes than among claw b-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).

Major TFs Predicted by Differential Expression of b-Keratin Genes and Their pTFs

We conducted differential expression analysis of both b-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs that regulate a small cluster of b-keratin genes of two clades or within a clade and were skipped by enrichment analysis, and (3) understand the structural diversity of feather beyond the previous study (Ng et al. 2014). We found that, among the five different feather structures (EB, LB, EF, LF, and MF), Chr2 b-keratin genes were up-regulated in LF and MF (log fold change [logFC] 2.59, P < 0.02) due to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 b-keratin genes were up-regulated in EB and LB (logFC 1.66, P < 0.02) and Chr27 b-keratin genes were down-regulated in LF (logFC −2.58, P < 0.02), due to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 b-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25 and Chr27 were not differentially expressed). While a few of the differentially expressed pTFs of Chr25 and Chr27 are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 b-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 b-keratin genes are mostly repressors, including two (ENSGALG00000009774, ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 b-keratin genes. This implies that, for Chr25 b-keratin genes, two unique over-represented TFs along with other unique activator TFs effect up-regulation in EB and LB. For Chr27 b-keratin genes, two unique over-represented repressor TFs together with other unique repressor TFs effect down-regulation of b-keratin genes in LF, which is the barb- and barbule-less rachis portion of the feather. Interestingly, the Chr7 b-keratin gene, together with scale, claw, and a few keratinocyte b-keratin genes, showed up-regulation in EB and EF (Chr7, logFC 2.48; claw, logFC 1.87; scale, logFC 1.37; keratinocytes, logFC 1.08; P < 0.02) relative to those in LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 b-keratin genes, including the enriched regulators of claw and scale b-keratin genes, were differentially expressed together with their target genes.

FIG. 4. Heatmap showing dynamic expression of b-keratin genes and their putative regulators. The heatmap is based on the average (normalized) expression level of the genes under each condition. Low to high expression is shown on a gradient scale from blue to red (see the color bar at top). The entries on the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in supplementary figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote b-keratin genes on a chromosome or in a subclass. The entries on the extreme right denote b-keratins (green) corresponding to each chromosome or subclass (shown in the top middle) and their corresponding TFs (black). As the expression differences of TFs and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate b-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocyte genes are considered under that subclass. About 25 of the 81 pTFs did not show differential expression and are marked by a dedicated symbol. The over-represented TFs are indicated by a separate symbol.

Differential expression of a pTF gene with its target gene supported 262 (out of a total of 372) predicted TF–target gene relationships, involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found that for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs regulate 9 target genes; for keratinocyte, 16 pTFs regulate 9 target genes; and for feather, 38 pTFs regulate 56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather b-keratin genes, although they satisfied |PCC| > 0.8 with their target genes, did not satisfy the criterion of differential expression together with their target genes. Based on this robust rule, we identified 56 TFs that regulate 91 b-keratin genes. We found that b-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided an additional 10 major TFs that either regulate two clades/sub-classes or a sub-clade of b-keratin genes within a clade and were skipped by enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential expression analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of b-keratin genes.

Table 1. Inferred Major Regulators of b-Keratin Genes with Putative Roles. The listed TFs are the key regulators of b-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID          TF family         Activator/repressor  Enriched  Diff. expressed  No. of target genes  Tentative role
ENSGALG00000007056  C2H2 ZF           activator            Yes       Yes              5                    Uniquely regulates Chr2 b-keratins
ENSGALG00000011435  Ets               activator            Yes       Yes              4
ENSGALG00000016927  C2H2 ZF           activator            Yes       Yes              4
ENSGALG00000009774  Ets               repressor            Yes       Yes              15                   Uniquely regulates Chr27 b-keratins
ENSGALG00000011980  Grainyhead        repressor            Yes       Yes              12
ENSGALG00000000233  C2H2 ZF           activator            Yes       Yes              17                   Commonly regulates Chr25 and Chr27 b-keratins
ENSGALG00000010101  C2H2 ZF           activator            Yes       Yes              21
ENSGALG00000012277  bZIP              activator            Yes       Yes              17
ENSGALG00000008014  bZIP              activator            No        Yes              13
ENSGALG00000006210  bHLH              activator            No        Yes              4
ENSGALG00000011053  Homeodomain       activator            No        Yes              10
ENSGALG00000000836  RFX               activator            No        Yes              6
ENSGALG00000023607  bHLH              activator            Yes       Yes              10                   Uniquely regulates Chr25 b-keratins
ENSGALG00000002393  AP-2              activator            Yes       Yes              13
ENSGALG00000003404                    activator            No        Yes              3
ENSGALG00000011055  Homeodomain       activator            No        Yes              3
ENSGALG00000001143  Ets               activator            No        Yes              1                    Uniquely regulates Chr10 b-keratins
ENSGALG00000023742  bHLH              activator            No        Yes              1
ENSGALG00000023738  Homeodomain       activator            Yes       Yes              9                    Commonly regulates scale, claw, and Chr7 b-keratins
ENSGALG00000023195  Homeodomain       activator            No        Yes              6
ENSGALG00000013342  Homeodomain       activator            Yes       Yes              6                    Uniquely regulates claw b-keratins
ENSGALG00000009129  Homeodomain       activator            Yes       Yes              4
ENSGALG00000007357  C2H2 ZF           repressor            Yes       Yes              10                   Commonly regulates scale and claw b-keratins
ENSGALG00000000316  Homeodomain       activator            Yes       Yes              5                    Uniquely regulates scale b-keratins
ENSGALG00000008250  T-box             activator            Yes       Yes              6
ENSGALG00000003759  Nuclear receptor  repressor            No        Yes              3

Regulatory Divergence among Beta-Keratin Genes during Bird Evolution. doi:10.1093/molbev/msw165. MBE


Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of the structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent a feather structure that evolved early. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we assume that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, were modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the electrophoretic mobility shift assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included together with the biotin-labeled probe and an excess of unlabeled probe, a very faint or no TF–TFBS complex was observed, implying that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories with a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes in the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with the purified TF combined with the labeled probe (lane b). The purified TF combined with a labeled probe and an excess (10-fold) of unlabeled probe (lane c) showed quantitative competition of the probes for binding. The purified TF combined with a mutated probe (lane d) confirmed the binding specificity of the TF. "Target" denotes the gene with the predicted TFBS in its promoter. "PWM" denotes the sequence logo of the position weight matrix for the binding-site motifs.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developmental conditions of body contour feather and three developmental conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 conditions, with 3 biological replicates per condition (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing from the previous annotation (Ng et al. 2014) were manually added to Gallus_gallus.galGal4.77.gtf. Moreover, reads were also allowed to map to novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation. The alignment quality and the distribution of the reads were estimated using SAMtools version 200 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) method using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
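The redistribution of multiple-hit reads and the FPKM calculation described above can be sketched as follows. This is only an illustration under stated assumptions, not the Cufflinks implementation: the function names, the input layout, and the even split used when none of the candidate genes has uniquely mapped reads are all hypothetical choices.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def redistribute_multihits(unique_counts, multihit_groups):
    """Share each multi-hit read among its candidate genes.

    unique_counts: gene -> number of uniquely mapped reads.
    multihit_groups: one tuple of candidate genes per multi-hit read.
    Each read is split in proportion to the unique counts of its candidates.
    """
    counts = {gene: float(n) for gene, n in unique_counts.items()}
    for genes in multihit_groups:
        total = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            # Assumption: split evenly when no candidate has unique reads.
            share = unique_counts.get(g, 0) / total if total else 1.0 / len(genes)
            counts[g] = counts.get(g, 0.0) + share
    return counts


unique = {"KRT_A": 30, "KRT_B": 10}
counts = redistribute_multihits(unique, [("KRT_A", "KRT_B")] * 4)
print(counts)  # KRT_A receives three quarters of each shared read
print(fpkm(counts["KRT_A"], exon_length_bp=1000, total_mapped_fragments=1_000_000))
```

Proportional sharing matters for β-keratin genes in particular, because recent tandem duplicates are similar enough in sequence that many reads map to more than one copy.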

Genes with FPKM ≥ 1 in at least one of the conditions were considered "expressed" and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
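A minimal sketch of the expression filter and of an upper-quartile rescaling toward a reference sample is given below. It is an illustration only: the helper names are hypothetical, the quartile is taken over non-zero values with the default method of Python's statistics.quantiles, and the exact variant of the procedure used in the study may differ.

```python
from statistics import quantiles


def expressed_genes(fpkm, threshold=1.0):
    """Keep genes whose FPKM reaches the threshold in at least one sample."""
    return {gene: values for gene, values in fpkm.items() if max(values) >= threshold}


def upper_quartile(values):
    """75th percentile of the non-zero values (default 'exclusive' method)."""
    nonzero = sorted(v for v in values if v > 0)
    return quantiles(nonzero, n=4)[2]


def normalize_to_reference(fpkm, ref_index):
    """Rescale every sample so that its upper quartile matches the reference sample."""
    n_samples = len(next(iter(fpkm.values())))
    uq = [upper_quartile([values[i] for values in fpkm.values()])
          for i in range(n_samples)]
    factors = [uq[ref_index] / q for q in uq]
    return {gene: [v * f for v, f in zip(values, factors)]
            for gene, values in fpkm.items()}


# Toy data: two samples, the second sequenced twice as deeply as the first.
raw = {"g1": [4.0, 8.0], "g2": [10.0, 20.0], "g3": [0.2, 0.4], "g4": [2.0, 4.0]}
kept = expressed_genes(raw)
print(normalize_to_reference(kept, ref_index=0))
```

In the study the reference column would be EF3; here the first toy sample plays that role.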

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset consisting of the β-keratin genes and the associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the "gplots" package of R (Warnes et al. 2015). The hierarchical cluster was cut to obtain clusters/groups of co-expressed genes using the "hclust" function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed on April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered likely related to keratin, we added the condition that the genes in the cluster should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
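The two clustering steps above can be illustrated with a small pure-Python sketch. It is not the R workflow used in the study: the distance 1 − PCC, the cut threshold `max_dist`, and all function names are assumptions made for illustration, and profiles are assumed to be non-constant.

```python
from math import sqrt


def pcc(x, y):
    """Pearson correlation coefficient of two equally long, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def keratin_subset(profiles, keratin_ids, min_pcc=0.8):
    """Step 1: keratin genes plus every gene with PCC >= min_pcc to any of them."""
    keep = set(keratin_ids)
    for gene, profile in profiles.items():
        if any(pcc(profile, profiles[k]) >= min_pcc for k in keratin_ids):
            keep.add(gene)
    return keep


def complete_linkage(profiles, genes, max_dist):
    """Step 2: agglomerative clustering, merging while the farthest pair stays close."""
    clusters = [[g] for g in sorted(genes)]

    def dist(c1, c2):
        return max(1 - pcc(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min((dist(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_dist:
            break
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


profiles = {
    "K1": [1, 2, 3, 4, 5], "K2": [5, 4, 3, 2, 1],
    "G1": [2, 4, 6, 8, 11], "G2": [9, 8, 6, 4, 1], "G3": [1, 5, 1, 5, 1],
}
subset = keratin_subset(profiles, ["K1", "K2"])
print(complete_linkage(profiles, subset, max_dist=0.2))
```

With five conditions, as in the toy profiles, a PCC cutoff of 0.8 is demanding but can still be met by chance, which is why the functional (GO) condition is applied afterwards.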

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the 1,000 bp upstream of the transcription start site (TSS) or, if the TSS was unavailable, of the translation start site, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer the TF–target gene relationship, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).
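The promoter definition above can be sketched as a strand-aware extraction. The coordinate convention (1-based TSS on a 0-based Python string) and the function names are assumptions made for this illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def promoter(chrom_seq, tss, strand, upstream=1000):
    """Return up to `upstream` bp immediately 5' of the start site.

    tss is the 1-based genomic position of the transcription (or translation)
    start site; the result is reported in the orientation of the gene.
    """
    if strand == "+":
        start = max(0, tss - 1 - upstream)
        return chrom_seq[start:tss - 1]
    # Minus strand: upstream lies at higher genomic coordinates.
    end = min(len(chrom_seq), tss + upstream)
    return reverse_complement(chrom_seq[tss:end])


chrom = "ACGTTGCAAT"
print(promoter(chrom, tss=5, strand="+", upstream=3))  # bases 2-4 of the toy sequence
print(promoter(chrom, tss=5, strand="-", upstream=3))  # bases 6-8, reverse-complemented
```

Reporting minus-strand promoters in gene orientation keeps motif strand and distance to the start site comparable across genes.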

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed on April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
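The idea of scanning a promoter with a position weight matrix can be illustrated with a simple log-odds scan. This sketch is not FIMO: FIMO reports P values for each match, whereas the fragment below applies a raw score cutoff, scans the forward strand only, and uses a uniform background and a pseudocount that are assumptions chosen for illustration.

```python
from math import log2


def log_odds(pwm, background=0.25, pseudo=0.01):
    """Convert per-position base probabilities into log-odds scores."""
    return [{b: log2(((col[b] + pseudo) / (1 + 4 * pseudo)) / background)
             for b in "ACGT"} for col in pwm]


def scan(seq, pwm, min_score):
    """Return (position, site, score) for forward-strand windows scoring >= min_score."""
    matrix = log_odds(pwm)
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[j][b] for j, b in enumerate(window))
        if score >= min_score:
            hits.append((i, window, score))
    return hits


# Toy motif with consensus TGAC (hypothetical, not taken from CIS-BP).
toy_pwm = [
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
]
for position, site, score in scan("AATGACTT", toy_pwm, min_score=5.0):
    print(position, site, round(score, 2))
```

A real analysis would also scan the reverse strand and convert scores to P values, as FIMO does, before applying the 1e-4 threshold.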

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To accept an over-represented motif found in the promoter of a gene as genuine, the gene encoding the TF protein was required to be correlated in expression (|PCC| > 0.8) with the gene whose promoter carries the TFBS of that TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships for β-keratin and some other genes, as well as putative TF–TFBS interactions.
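The correlation filter of the second step can be sketched as follows. Labeling a positively correlated TF as an activator and a negatively correlated one as a repressor is an assumption made here for illustration; the names and toy profiles are hypothetical.

```python
from math import sqrt


def pcc(x, y):
    """Pearson correlation coefficient of two equally long, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def putative_pairs(motif_hits, expression, cutoff=0.8):
    """Keep (TF, target) motif hits whose expression profiles satisfy |PCC| > cutoff."""
    pairs = []
    for tf, target in motif_hits:
        r = pcc(expression[tf], expression[target])
        if abs(r) > cutoff:
            # Assumption: the sign of the correlation suggests the mode of regulation.
            pairs.append((tf, target, "activator" if r > 0 else "repressor"))
    return pairs


expression = {
    "TF1": [1, 2, 3, 4, 5], "TF2": [5, 4, 3, 2, 1],
    "TF3": [1, 5, 1, 5, 1], "KRT": [2, 4, 6, 8, 11],
}
hits = [("TF1", "KRT"), ("TF2", "KRT"), ("TF3", "KRT")]
print(putative_pairs(hits, expression))
```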

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 1e-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-Keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within one other species rather than by assessing conservation among several species. Therefore, we determined the homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as "queries" and turkey β-keratins as "subjects." For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of the turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on the lowest E value) turkey β-keratin genes that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the five homologous promoters of turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and determining whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
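The final conservation rule (same strand, within 200 bp, in at least one of the homologous turkey promoters) can be written as a small predicate. This sketch assumes that motif positions have already been mapped to a common coordinate system, for example alignment columns; the data layout and names are hypothetical.

```python
def is_conserved(chicken_hit, turkey_hits_by_gene, max_distance=200):
    """True if any turkey promoter has the motif on the same strand within max_distance.

    chicken_hit: (position, strand) of the motif in the chicken promoter.
    turkey_hits_by_gene: turkey gene -> list of (position, strand) motif hits.
    """
    position, strand = chicken_hit
    return any(
        hit_strand == strand and abs(hit_position - position) <= max_distance
        for hits in turkey_hits_by_gene.values()
        for hit_position, hit_strand in hits
    )


turkey_hits = {"turkey_krt_1": [(500, "-")], "turkey_krt_2": [(330, "+")]}
print(is_conserved((200, "+"), turkey_hits))  # 130 bp away on the same strand
print(is_conserved((200, "-"), turkey_hits))  # only hit on "-" is 300 bp away
```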

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes that share specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we performed differential expression analysis of the pTF genes and their target genes across the different developmental conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
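The last criterion can be illustrated as a set intersection over conditions. The sketch assumes that the differential expression calls (for example, from edgeR) are already available as the set of feather regions in which each gene is differentially expressed; this input format and the function names are assumptions, not part of the described pipeline.

```python
def supported_relationships(pairs, de_conditions):
    """Keep (TF, target) pairs that are differentially expressed in a shared condition."""
    return [(tf, target) for tf, target in pairs
            if de_conditions.get(tf, set()) & de_conditions.get(target, set())]


def targets_per_tf(pairs):
    """Count the supported target genes of each TF."""
    counts = {}
    for tf, _ in pairs:
        counts[tf] = counts.get(tf, 0) + 1
    return counts


pairs = [("TF1", "K1"), ("TF1", "K2"), ("TF2", "K3")]
de_conditions = {
    "TF1": {"LF", "MF"}, "K1": {"LF"}, "K2": {"EB"},
    "TF2": {"EB"}, "K3": {"EB", "LB"},
}
supported = supported_relationships(pairs, de_conditions)
print(supported)
print(targets_per_tf(supported))
```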

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and for the production of TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized with the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed the computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R Package Version gplots21707.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci U S A. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the Lower Cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci U S A. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.


respectively. BCL6 has been reported to function as a repressor in germinal center formation and to influence apoptosis (Huynh et al. 2000). RORALPHA1 has also been reported to be a repressor and is involved in organogenesis, differentiation, and circadian rhythm in vertebrates (Xiao et al. 2015). This implies that the TFs taking part in the structural framework of ancestral scale and claw, as well as of proto-feathers, may have been recruited from related networks regulating pattern formation, limb development, appendage growth, and organogenesis.

The remaining feather β-keratin genes showed consistent genomic organization, clustering pattern, and regulation, distinct from those of the ancestral scale and claw β-keratin genes. For example, all Chr2 β-keratins are clustered together as clade-1 and are uniquely regulated by TFs of eight families (Ets, E2F, C2H2 ZF, nuclear receptor, bZIP, bHLH, homeodomain-POU, and forkhead). The β-keratin genes of the two biggest loci, Chr27 and Chr25, formed two clades (clade-2 and clade-3) and one clade (clade-4), respectively. Chr27 and Chr25 β-keratin genes have common as well as unique regulators. Specific TFs of six families (bHLH, RFX, homeodomain, C2H2 ZF, bZIP, and forkhead) commonly regulate Chr25 and Chr27 β-keratin genes. TFs of five families (bHLH, homeodomain, AP-2, E2F, and C2H2 ZF) uniquely regulate Chr25 β-keratin genes. TFs of 15 families (ARID/BRIGHT, Ets, Rel, C2H2 ZF, bZIP, grainyhead, AT hook, homeodomain, bHLH, HSF, TBP, SMAD, nuclear receptor, DP/E2F, and forkhead) uniquely regulate Chr27 β-keratin genes. Notably, many of the Chr27 unique regulators are repressors. All Chr10 β-keratin genes were clustered as clade-5, although our approach annotated only one Chr10 β-keratin gene as being regulated by specific TFs of two families (Ets and bHLH). Lastly, the keratinocyte β-keratin genes, which were clustered together as clade-8, were found to share regulation with scale, claw, and feather β-keratin genes. This is consistent with their diverse roles related to scale, claw, and feather (Greenwold et al. 2014). It is important to note that the Animal Transcription Factor Database annotated at least 858 chicken genes as TFs (Zhang et al. 2012), whereas the CIS-BP database (Weirauch et al. 2014) has known or predicted TF–TFBS relationships for only 496 chicken TFs, with a paucity of TF–TFBS information for the remaining TFs. Fortunately, the TF–target gene relationships we have obtained include representatives of all genomic loci (except one feather β-keratin of Chr1 and one jun-transformed β-keratin of Chr6) and sub-classes, which allows the characteristic regulation of β-keratin genes to be understood.

The typical regulation of β-keratin genes observed above can be explained by the following innovative regulatory transition events. First, the transition from scale to claw, or vice versa, involved both retention and gain/loss of TFBSs. The motifs corresponding to the common TFs are cases of retention; notably, the common repressor motif for ENSGALG00000007357 (C2H2 ZF family) might be important for repressing the expression of scale and claw β-keratin genes in feather. The unique repressor motif of scale β-keratins corresponding to ENSGALG00000003759 (nuclear receptor family) might be a gain or a loss event, depending on whether the scale or the claw β-keratins were ancestral, and might repress scale β-keratin gene expression in claw. Second, the Chr7 β-keratin gene, which is the first locus where a feather β-keratin evolved from a scale or claw β-keratin gene, retains the ancestral cis motifs for ENSGALG00000023738 and ENSGALG00000023195. Third, most of the feather β-keratin genes were expanded from the basal Chr7 feather β-keratin gene, lost the motifs recognized by ancestral scale and claw TFs, and gained new motifs to perform new functions. In summary, feather β-keratin genes underwent chromosome-wise expansions and recruited chromosome-wise regulators. Similar to the evolution of claw and scale β-keratin genes, the transition of feather β-keratin genes from Chr25 to Chr27, or the reverse, involved both retention and gain/loss of TFBSs, as these genes showed both common and unique regulation. The characteristic regulatory control of feather β-keratin genes observed here may be essential for their mosaic expression in diverse feather parts, as has been suggested in a previous study (Ng et al. 2014).

Major Regulators of β-Keratin Genes

By a major regulator, we mean a TF that regulates multiple β-keratin genes. Following the annotation of β-keratin regulators, we inferred the major regulators of β-keratin genes in two ways.

Major TFs Predicted by Enrichment Analysis

First, we identified over-represented regulators in each clade of β-keratin genes (fig. 3C; also see supplementary table S9, Supplementary Material online) and found that the clades of Chr2, Chr27, Chr25, claw, and scale β-keratin genes have significantly enriched TFs (P value < 0.001, FDR < 0.0001). Clades 2 (having few Chr27 β-keratins) and 5 (having few Chr10 β-keratins), being very small, showed no over-represented TFs. Clade 8, in which all keratinocyte β-keratin genes are clustered, also lacks over-represented TFs, as keratinocyte β-keratin genes are regulated in small clusters together with scale, claw, and feather β-keratin genes. In detail, among the 15 annotated TFs that regulate Chr2 β-keratin genes (supplementary table S8, Supplementary Material online), 4 specific TFs (all activators) of 2 families (C2H2 and Ets) were over-represented in the Chr2 clade. Similarly, five specific TFs (four repressors and one activator) of five families (AT hook, forkhead, Ets, grainyhead, and C2H2 ZF) and five specific TFs (all activators) of four families (AP-2, bHLH, C2H2 ZF, and bZIP) were over-represented in the Chr27 and Chr25 clades, respectively. As mentioned earlier, Chr25 and Chr27 β-keratin genes have common as well as unique TFs, and we found that the five over-represented TFs of the Chr25 clade may be categorized as two unique TFs (ENSGALG00000023607 and ENSGALG00000002393) and three TFs (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) common with Chr27 β-keratin genes. This implies that the three common over-represented TFs have more targets on Chr25 and fewer on Chr27 and therefore were not over-represented in the Chr27 clade. Similarly, the five over-represented TFs of the Chr27 clade may be categorized as three unique TFs (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two TFs (ENSGALG00000019647 and ENSGALG00000001577) common with Chr25. This implies that the two over-represented TFs of the Chr27 clade that are common with Chr25 mostly regulate Chr27 β-keratin genes, along with far fewer Chr25 β-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale β-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) commonly regulates scale and claw β-keratin genes. As this repressor was over-represented in the scale clade, it evidently has more targets among scale β-keratin genes than among claw β-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).

Major TFs Predicted by Differential Expression of β-Keratin Genes and Their pTFs

We conducted differential expression analysis of both β-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs regulating a small cluster of β-keratin genes of two clades, or within a clade, skipped by enrichment analysis, and (3) understand the structural diversity of feathers beyond the previous study (Ng et al. 2014). We found that among the five different feather structures (EB, LB, EF, LF, and MF), Chr2 β-keratin genes were up-regulated in LF and MF (log fold change [FC] 2.59, P < 0.02) due to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 β-keratin genes were up-regulated in EB and LB (logFC 1.66, P < 0.02) and Chr27 β-keratin genes were down-regulated in LF (logFC −2.58, P < 0.02) due to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 β-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25 and Chr27 were not differentially expressed). While a few of the Chr25 and Chr27 differentially expressed pTFs are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774 and ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that, for Chr25 β-keratin genes, two unique over-represented TFs along with other unique activator TFs effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs together with other unique repressor TFs effect down-regulation of β-keratin genes in LF, which is the barb- and barbule-less rachis portion of the feather. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7, logFC 2.48; claw, logFC 1.87; scale, logFC 1.37; keratinocytes, logFC 1.08; P < 0.02) relative to those in LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including enriched regulators of claw and scale β-keratin genes, were differentially expressed with their target genes.

FIG. 4. Heatmap showing dynamic expression of β-keratin genes and their putative regulators. The heatmap is based on an average expression level (normalized) of the genes under each condition. Low to high expression is shown in a gradient scale from blue to red (see the color bar at top). The entries on the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote β-keratin genes on a chromosome or in a subclass. The extreme right entries denote β-keratins (green) corresponding to each chromosome or subclass (shown in top middle) and their corresponding TFs (black). As the expression differences between TFs and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate β-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocyte β-keratin genes are considered under that subclass. About 25 of the 81 pTFs did not show differential expression and are marked by a symbol. The over-represented TFs are indicated by a separate symbol.

Differential expression of a pTF gene with its target gene supported 262 (out of a total of 372) predicted TF–target gene relationships involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found that for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs/9 target genes; for keratinocyte, 16 pTFs/9 target genes; and for feather, 38 pTFs/56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes satisfied |PCC| > 0.8 with their target genes but were not differentially expressed together with them. Applying both criteria, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided an additional 10 major TFs that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade skipped by enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.

Table 1. Inferred major regulators of β-keratin genes with putative roles. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID          TF family         Activator/repressor  Enriched  Diff. expressed  No. of target genes  Tentative role
ENSGALG00000007056  C2H2 ZF           activator            Yes       Yes              5                    Uniquely regulates Chr2 β-keratins
ENSGALG00000011435  Ets               activator            Yes       Yes              4
ENSGALG00000016927  C2H2 ZF           activator            Yes       Yes              4
ENSGALG00000009774  Ets               repressor            Yes       Yes              15                   Uniquely regulates Chr27 β-keratins
ENSGALG00000011980  Grainyhead        repressor            Yes       Yes              12
ENSGALG00000000233  C2H2 ZF           activator            Yes       Yes              17                   Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101  C2H2 ZF           activator            Yes       Yes              21
ENSGALG00000012277  bZIP              activator            Yes       Yes              17
ENSGALG00000008014  bZIP              activator            No        Yes              13
ENSGALG00000006210  bHLH              activator            No        Yes              4
ENSGALG00000011053  Homeodomain       activator            No        Yes              10
ENSGALG00000000836  RFX               activator            No        Yes              6
ENSGALG00000023607  bHLH              activator            Yes       Yes              10                   Uniquely regulates Chr25 β-keratins
ENSGALG00000002393  AP-2              activator            Yes       Yes              13
ENSGALG00000003404                    activator            No        Yes              3
ENSGALG00000011055  Homeodomain       activator            No        Yes              3
ENSGALG00000001143  Ets               activator            No        Yes              1                    Uniquely regulates Chr10 β-keratins
ENSGALG00000023742  bHLH              activator            No        Yes              1
ENSGALG00000023738  Homeodomain       activator            Yes       Yes              9                    Commonly regulates scale, claw, and Chr7 β-keratins
ENSGALG00000023195  Homeodomain       activator            No        Yes              6
ENSGALG00000013342  Homeodomain       activator            Yes       Yes              6                    Uniquely regulates claw β-keratins
ENSGALG00000009129  Homeodomain       activator            Yes       Yes              4
ENSGALG00000007357  C2H2 ZF           repressor            Yes       Yes              10                   Commonly regulates scale and claw β-keratins
ENSGALG00000000316  Homeodomain       activator            Yes       Yes              5                    Uniquely regulates scale β-keratins
ENSGALG00000008250  T-box             activator            Yes       Yes              6
ENSGALG00000003759  Nuclear receptor  repressor            No        Yes              3

Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we assume that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, were modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the Electrophoretic Mobility Shift Assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included together with the biotin-labeled probe and an excess of un-labeled probe, a very faint or no TF–TFBS complex was observed, indicating that the excess un-labeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The selected seven TFs and their cognate TFBSs were representatives of each of the predicted categories having a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes in the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with combined purified TF and the labeled probe (lane b). Combined purified TF with a labeled probe and an excess (10-fold) of an un-labeled probe (lane c) showed quantitative competition of probes for binding. Combined purified TF and a mutated probe (lane d) confirmed the binding specificity of the TF. "Target" denotes the gene with the predicted TFBS in its promoter. "PWM" denotes the sequence logo of the position weight matrix for the binding-site motifs.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of body contour feather and three developing conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developing conditions of feather, with 3 biological replicates per condition (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The missing β-keratin genes in the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, mapping of reads was also allowed in novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation Gallus_gallus.galGal4.77.gtf. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference annotation-based transcript (RABT) assembly method using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
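The FPKM calculation and the redistribution of multiple-hit reads can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cufflinks implementation: the function names are hypothetical, "relative abundance" is approximated by raw counts of uniquely mapped reads, and the even split used when no candidate gene has unique reads is our own choice.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def redistribute_multihits(unique_counts, multihit_groups):
    """Share multi-hit reads among candidate genes in proportion to the
    genes' uniquely mapped read counts.

    unique_counts:   dict gene -> number of uniquely mapped reads
    multihit_groups: list of (candidate_genes, n_reads) pairs
    """
    totals = {gene: float(count) for gene, count in unique_counts.items()}
    for genes, n_reads in multihit_groups:
        weight_sum = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            if weight_sum > 0:
                share = unique_counts.get(g, 0) / weight_sum
            else:
                # Assumption: no unique evidence, so split evenly.
                share = 1.0 / len(genes)
            totals[g] = totals.get(g, 0.0) + n_reads * share
    return totals


counts = redistribute_multihits({"A": 30, "B": 10}, [(["A", "B"], 8)])
expression_a = fpkm(counts["A"], exon_length_bp=1000,
                    total_mapped_fragments=1_000_000)
```

In this toy case, gene A holds three quarters of the unique reads and therefore receives six of the eight shared reads.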

Genes with FPKM ≥ 1 in at least one of the conditions were considered "expressed" and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
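The expression filter and the upper quartile normalization against a reference sample can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the upper quartile is taken over the non-zero values of each sample, and the inclusive quantile method is an assumption, since the text does not specify these details.

```python
import statistics


def filter_expressed(fpkm_by_gene, threshold=1.0):
    """Keep genes whose FPKM reaches the threshold in at least one condition."""
    return {gene: values for gene, values in fpkm_by_gene.items()
            if max(values) >= threshold}


def upper_quartile(values):
    """75th percentile of the non-zero values of one sample."""
    nonzero = [v for v in values if v > 0]
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


def normalize_to_reference(fpkm_by_sample, reference):
    """Rescale every sample so that its upper quartile equals that of the
    reference sample (EF3 in the text)."""
    ref_uq = upper_quartile(fpkm_by_sample[reference])
    normalized = {}
    for sample, values in fpkm_by_sample.items():
        factor = ref_uq / upper_quartile(values)
        normalized[sample] = [v * factor for v in values]
    return normalized


samples = {"EF3": [1, 2, 3, 4, 5], "EB1": [2, 4, 6, 8, 10]}
normalized = normalize_to_reference(samples, reference="EF3")
```

After normalization, the toy sample EB1 is halved so that its upper quartile matches that of EF3, while EF3 itself is unchanged.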

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a sub-set of β-keratin and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the sub-set were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the "gplots" package of R (Warnes et al. 2015). The hierarchical cluster was cleaved to get clusters/groups of co-expressed genes using the "hclust" function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with false discovery rate (FDR) correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster more likely to be related to keratin, we added the condition that the genes in the cluster should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
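The first step above, extracting genes correlated with β-keratin genes, can be sketched as follows. The function names and the toy profiles are hypothetical, and the threshold of 0.8 is applied as "at least 0.8" by assumption; the subsequent hierarchical clustering in R is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def keratin_associated_subset(profiles, keratin_genes, min_pcc=0.8):
    """Return the keratin genes plus every gene whose profile reaches
    min_pcc with at least one keratin gene."""
    subset = set(keratin_genes)
    for gene, profile in profiles.items():
        if gene in subset:
            continue
        if any(pearson(profile, profiles[k]) >= min_pcc for k in keratin_genes):
            subset.add(gene)
    return subset


profiles = {
    "bk1": [1, 2, 3, 4, 5],      # hypothetical beta-keratin gene
    "geneA": [2, 4, 6, 8, 10.5],  # rises with bk1
    "geneB": [5, 4, 3, 2, 1],     # falls as bk1 rises
    "geneC": [3, 1, 4, 1, 5],     # weakly related
}
subset = keratin_associated_subset(profiles, ["bk1"])
```

Only geneA joins bk1 in the subset; the anti-correlated and weakly correlated genes are left out at this step.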

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS), or if unavailable, the translation start site, to 1,000 bp upstream, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer the TF–target gene relationship, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).
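The promoter definition can be made concrete with a short sketch. The function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and excluding the start site itself from the promoter is our assumption; the text does not state these conventions.

```python
def choose_start_site(tss, translation_start):
    """Use the TSS when annotated, otherwise the translation start site."""
    return tss if tss is not None else translation_start


def promoter_interval(start_site, strand, upstream=1000, chrom_length=None):
    """Return the (start, end) interval covering `upstream` bp 5' of the
    start site, taking the gene's strand into account."""
    if strand == "+":
        return max(1, start_site - upstream), start_site - 1
    if strand == "-":
        end = start_site + upstream
        if chrom_length is not None:
            end = min(end, chrom_length)  # clip at the chromosome end
        return start_site + 1, end
    raise ValueError("strand must be '+' or '-'")


site = choose_start_site(tss=None, translation_start=5000)
forward = promoter_interval(site, "+")
reverse = promoter_interval(site, "-")
```

For a gene on the reverse strand, "upstream" lies at higher genomic coordinates, which is why the two intervals fall on opposite sides of position 5000.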

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
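A one-sided Fisher's exact test for motif over-representation in a module can be sketched with the hypergeometric distribution. The function name is hypothetical, and the choice of background (here, promoters outside the module) is an assumption, since the text does not specify it; multiple-testing correction is not shown.

```python
from math import comb


def fisher_enrichment_p(module_hits, module_size, background_hits, background_size):
    """P value for observing at least `module_hits` motif-containing
    promoters in the module, given the totals (one-sided test)."""
    total = module_size + background_size
    hits = module_hits + background_hits
    denominator = comb(total, module_size)
    upper = min(module_size, hits)
    numerator = sum(
        comb(hits, k) * comb(total - hits, module_size - k)
        for k in range(module_hits, upper + 1)
    )
    return numerator / denominator


# Toy example: the motif occurs in all 5 module promoters and in none of
# the 5 background promoters.
p_value = fisher_enrichment_p(5, 5, 0, 5)
```

With exact integer arithmetic from `math.comb`, the sketch avoids floating-point underflow for larger modules until the final division.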

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is true, the gene coding for the TF protein was required to be correlated (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships of β-keratin and some other genes, and putative TF–TFBS interactions.

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 1e-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region together with 100 bp upstream and downstream of it were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property is useful for assessing conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within another species, rather than assessing the conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP with chicken β-keratins as "queries" and turkey β-keratins as "subjects." For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five turkey β-keratin genes (based on lowest E value) that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and its five homologous promoters of turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and finding whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
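The final conservation criterion can be sketched as a simple check over motif hits. The function name and data layout are hypothetical; positions are assumed to be expressed in a common (aligned) promoter coordinate system, and "within 200 bp" is taken to include a distance of exactly 200 bp.

```python
def is_conserved(chicken_site, turkey_sites_by_promoter, max_distance=200):
    """Return True if any homologous turkey promoter carries the motif on
    the same strand within max_distance bp of the chicken site.

    chicken_site:             (position, strand)
    turkey_sites_by_promoter: dict promoter_id -> list of (position, strand)
    """
    position, strand = chicken_site
    for sites in turkey_sites_by_promoter.values():
        for turkey_position, turkey_strand in sites:
            same_strand = turkey_strand == strand
            close_enough = abs(turkey_position - position) <= max_distance
            if same_strand and close_enough:
                return True
    return False


turkey_hits = {
    "turkey_bk1": [(900, "+")],              # same strand, too far away
    "turkey_bk2": [(500, "-"), (610, "+")],  # second hit satisfies both rules
}
conserved = is_conserved((420, "+"), turkey_hits)
```

One qualifying hit among the five homologous promoters is enough, which mirrors the "at least one of them" wording of the criterion.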

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes having common specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we did differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
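The last criterion can be sketched as a filter over predicted pairs. The function name and data layout are hypothetical, and "at the same time" is interpreted here as differential expression in at least one shared feather condition, which is an assumption; the edgeR analysis itself is not reproduced.

```python
def supported_pairs(pairs, de_conditions):
    """Keep the (TF, target) pairs in which both genes are differentially
    expressed in at least one common condition.

    pairs:         iterable of (tf_gene, target_gene)
    de_conditions: dict gene -> set of conditions where the gene is
                   differentially expressed
    """
    kept = []
    for tf_gene, target_gene in pairs:
        shared = de_conditions.get(tf_gene, set()) & de_conditions.get(target_gene, set())
        if shared:
            kept.append((tf_gene, target_gene))
    return kept


predicted = [("TF1", "bk1"), ("TF2", "bk2"), ("TF3", "bk3")]
de_conditions = {
    "TF1": {"EB", "LB"}, "bk1": {"EB"},  # both change in EB
    "TF2": {"LF"}, "bk2": {"MF"},        # change in different regions
    "bk3": {"EB"},                       # TF3 is not differentially expressed
}
supported = supported_pairs(predicted, de_conditions)
```

Only the first toy pair survives, because it is the only one in which the TF and its target change in the same feather region.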

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and production of TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA assay between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are the wild-type biotin-labeled and un-labeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized with the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). MJB was supported by a postdoctoral fellowship of Academia Sinica. MJB and WHL conceived this study. MJB, CPY, JJL, and WHL designed the experiments and performed computational analysis. MJB carried out the EMSA experiments. CSN, TYW, and HHL offered technical consultation and professional suggestions. MJB, WHL, and CSN wrote the article. All authors read and revised the article.

References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate – a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechler M, Magnusson A, Moeller S, Schwartz M, Venables B. 2015. gplots: various R programming tools for plotting data. R package version 2.17.0.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci USA. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the Lower Cretaceous of China. Nature. 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature. 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature. 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature. 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci USA. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science. 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science. 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.


over-represented TFs of the Chr27 clade may be categorized as three unique (ENSGALG00000009774, ENSGALG00000011980, and ENSGALG00000008403) and two common (ENSGALG00000019647 and ENSGALG00000001577) with Chr25. This implies that the two over-represented TFs of the Chr27 clade that are shared with Chr25 mostly regulate Chr27 β-keratin genes, along with far fewer Chr25 β-keratin genes, and therefore were not over-represented in the Chr25 clade. Three TFs (all activators) of one family (HD) and three TFs (one repressor and two activators) of three families (C2H2 ZF, HD, and T-box) were over-represented in the claw and scale β-keratin clades, respectively. Among them, the repressor TF (ENSGALG00000007357) regulates both scale and claw β-keratin genes. As this repressor TF was over-represented in the scale clade, it evidently has more targets among scale β-keratin genes than among claw β-keratin genes. Overall, the enrichment analysis provided a set of major TFs, narrowing down the relatively large network (supplementary table S8, Supplementary Material online) to a few TFs (fig. 3C).

Major TFs Predicted by Differential Expression of β-Keratin Genes and Their pTFs

We conducted differential expression analysis of both β-keratin genes and their pTF genes (fig. 4; also see supplementary tables S10 and S11, Supplementary Material online) to (1) cross-check the overall predicted TF–target gene relationships, (2) find additional major TFs that regulate a small cluster of β-keratin genes spanning two clades or lying within one clade and that were missed by the enrichment analysis, and (3) understand the structural diversity of feathers beyond the previous study (Ng et al. 2014). We found that among the five feather regions (EB, LB, EF, LF, and MF), Chr2 β-keratin genes were up-regulated in LF and MF (log fold change [logFC] ≥ 2.59, P < 0.02) due to the up-regulation of 12 of the 15 pTFs of Chr2 (including 3 of the 4 over-represented TFs of the Chr2 clade). Similarly, Chr25 β-keratin genes were up-regulated in EB and LB (logFC ≥ 1.66, P < 0.02), and Chr27 β-keratin genes were down-regulated in LF (logFC ≤ −2.58, P < 0.02), due to up-regulation of some pTFs of Chr25 and down-regulation of some pTFs of Chr27 β-keratin genes, respectively (4 of 9 unique pTFs of Chr25, 15 of 25 unique pTFs of Chr27, and 3 of 10 common pTFs of Chr25 and Chr27 were not differentially expressed). While a few of the differentially expressed pTFs of Chr25 and Chr27 are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that regulate both Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade, but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774 and ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that, for Chr25 β-keratin genes, two unique over-represented TFs along with other unique activator TFs effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs together with other unique repressor TFs effect down-regulation of β-keratin genes in LF, which is the rachis portion of the feather lacking barbs and barbules. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7: logFC ≥ 2.48; claw: logFC ≥ 1.87; scale: logFC ≥ 1.37; keratinocyte: logFC ≥ 1.08; P < 0.02) relative to those in LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including the enriched regulators of claw and scale β-keratin genes, were differentially expressed together with their target genes.

FIG. 4. Heatmap showing dynamic expression of β-keratin genes and their putative regulators. The heatmap is based on the average (normalized) expression level of the genes under each condition. Low to high expression is shown on a gradient scale from blue to red (see the color bar at top). The entries on the top of the heatmap denote the conditions from which the transcriptomes were obtained, as detailed in figure S1, Supplementary Material online. The entries in the top middle that interrupt the heatmap denote β-keratin genes on a chromosome or in a subclass. The entries at the extreme right denote β-keratins (green) corresponding to each chromosome or subclass (shown in top middle) and their corresponding TFs (black). As the expression differences between TFs and target genes differ in magnitude, we used independent scales for them for better resolution of synchronous expression. TFs that regulate β-keratin genes of a particular subclass (scale, claw, or feather) along with a few keratinocyte genes are placed under that subclass. About 25 of the 81 pTFs did not show differential expression and are marked by a symbol; the over-represented TFs are indicated by a separate symbol.

Differential expression of a pTF gene together with its target genes supported 262 (out of a total of 372) predicted TF–target gene relationships involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found that for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs regulate 9 target genes; for keratinocyte, 16 pTFs regulate 9 target genes; and for feather, 38 pTFs regulate 56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes satisfied |PCC| > 0.8 with their target genes but were not differentially expressed together with them. Based on this rule, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided an additional 10 major TFs that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade and that were missed by the enrichment analysis. The major TFs predicted by enrichment analysis (for clades) and further supported by differential expression analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.

Table 1. Inferred major regulators of β-keratin genes with putative roles. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID          TF family         Activator/repressor  Enriched  Diff. expressed  No. of target genes  Tentative role
ENSGALG00000007056  C2H2 ZF           activator            Yes       Yes              5                    Uniquely regulates Chr2 β-keratins
ENSGALG00000011435  Ets               activator            Yes       Yes              4
ENSGALG00000016927  C2H2 ZF           activator            Yes       Yes              4
ENSGALG00000009774  Ets               repressor            Yes       Yes              15                   Uniquely regulates Chr27 β-keratins
ENSGALG00000011980  Grainyhead        repressor            Yes       Yes              12
ENSGALG00000000233  C2H2 ZF           activator            Yes       Yes              17                   Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101  C2H2 ZF           activator            Yes       Yes              21
ENSGALG00000012277  bZIP              activator            Yes       Yes              17
ENSGALG00000008014  bZIP              activator            No        Yes              13
ENSGALG00000006210  bHLH              activator            No        Yes              4
ENSGALG00000011053  Homeodomain       activator            No        Yes              10
ENSGALG00000000836  RFX               activator            No        Yes              6
ENSGALG00000023607  bHLH              activator            Yes       Yes              10                   Uniquely regulates Chr25 β-keratins
ENSGALG00000002393  AP-2              activator            Yes       Yes              13
ENSGALG00000003404  -                 activator            No        Yes              3
ENSGALG00000011055  Homeodomain       activator            No        Yes              3
ENSGALG00000001143  Ets               activator            No        Yes              1                    Uniquely regulates Chr10 β-keratins
ENSGALG00000023742  bHLH              activator            No        Yes              1
ENSGALG00000023738  Homeodomain       activator            Yes       Yes              9                    Commonly regulates scale, claw, and Chr7 β-keratins
ENSGALG00000023195  Homeodomain       activator            No        Yes              6
ENSGALG00000013342  Homeodomain       activator            Yes       Yes              6                    Uniquely regulates claw β-keratins
ENSGALG00000009129  Homeodomain       activator            Yes       Yes              4
ENSGALG00000007357  C2H2 ZF           repressor            Yes       Yes              10                   Commonly regulates scale and claw β-keratins
ENSGALG00000000316  Homeodomain       activator            Yes       Yes              5                    Uniquely regulates scale β-keratins
ENSGALG00000008250  T-box             activator            Yes       Yes              6
ENSGALG00000003759  Nuclear receptor  repressor            No        Yes              3


Feather β-keratin genes on a chromosome tend to be recent tandem duplicates or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of the structural differences among different parts of feathers. The up-regulation of Chr2 β-keratin genes in MF and LF of the flight feather and of Chr25 β-keratin genes in EB and LB of the contour feather was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the rachis portion of the flight feather that lacks barbs and barbules (LF). Most interestingly, we found that the ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we assume that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, have been modified, whereas EF remains more similar to the EB of the contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the electrophoretic mobility shift assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When the purified TF protein was included together with the biotin-labeled probe and an excess of unlabeled probe, only a very faint TF–TFBS complex or none was observed, indicating that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories with a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests support the reliability of our annotated β-keratin gene regulatory network.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression together with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes from the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with the purified TF combined with the labeled probe (lane b). The purified TF combined with a labeled probe and an excess (10-fold) of unlabeled probe (lane c) showed quantitative competition between probes for binding. The purified TF combined with a mutated probe (lane d) confirmed the binding specificity of the TF. “Target” denotes the gene with the predicted TFBS in its promoter. “PWM” denotes the sequence logo of the position weight matrix for the binding-site motifs.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of the body contour feather and three developing conditions of the flight feather (Ng et al. 2014). In total, 15 transcriptome data sets were available for the 5 developing conditions of the feather, each condition with 3 biological replicates (supplementary fig. S1, Supplementary Material online). The two parts of the contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of the flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus, galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing from the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, mapping of reads was also allowed in novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation Gallus_gallus.galGal4.77.gtf. The alignment quality and distribution of the reads were estimated using SAMtools version 200 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) policy using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
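To make the FPKM definition and the two-pass handling of multi-hit reads concrete, here is a minimal Python sketch. The function names and the count-proportional split are illustrative assumptions; this is not the Cufflinks implementation used in the study.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def redistribute_multi_hits(unique_counts, multi_hits):
    """Split each multi-hit read among its candidate genes in proportion
    to the uniquely mapped read counts of those genes.

    unique_counts: dict gene -> uniquely mapped read count
    multi_hits: list of tuples, each holding the genes one read maps to
    """
    counts = {gene: float(c) for gene, c in unique_counts.items()}
    for genes in multi_hits:
        weights = [unique_counts.get(g, 0) for g in genes]
        total = sum(weights)
        for gene, weight in zip(genes, weights):
            # Assumption: with no unique evidence at all, split evenly.
            share = weight / total if total > 0 else 1.0 / len(genes)
            counts[gene] = counts.get(gene, 0.0) + share
    return counts
```

For example, a read that maps to two genes with 30 and 10 unique reads contributes 0.75 and 0.25 of a read to them, respectively.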

Genes with FPKM ≥ 1 in at least one of the conditions were considered “expressed” and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
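The expression filter and the upper quartile normalization can be sketched as follows. This is a simplified illustration, assuming that the quartile is taken over non-zero values and that the threshold comparison is "at least 1"; the helper names are hypothetical and the authors' exact procedure may differ.

```python
def upper_quartile(values):
    """75th percentile of the non-zero values (linear interpolation)."""
    data = sorted(v for v in values if v > 0)
    if not data:
        raise ValueError("no non-zero values")
    pos = 0.75 * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def normalize_to_reference(samples, reference):
    """Scale every sample so its upper quartile matches the reference.

    samples: dict sample name -> list of FPKM values (same gene order)
    reference: name of the reference sample (EF3 in the text)
    """
    ref_uq = upper_quartile(samples[reference])
    return {
        name: [v * ref_uq / upper_quartile(vals) for v in vals]
        for name, vals in samples.items()
    }


def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Keep genes that reach the threshold in at least one condition."""
    return [g for g, vals in fpkm_by_gene.items() if max(vals) >= threshold]
```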

The Pearson correlation coefficient (PCC) between the expression profiles of genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset consisting of β-keratin genes and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered on the basis of their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the “gplots” package of R (Warnes et al. 2015). The hierarchical cluster was cleaved to obtain clusters/groups of co-expressed genes using the “hclust” function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with false discovery rate (FDR) correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered keratin-related, we added the condition that its genes should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
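The first step, selecting genes whose expression profile correlates with that of a β-keratin gene, might look like the sketch below. The function names, the treatment of constant profiles, and the "at least 0.8" comparison are assumptions made for illustration; clustering and GO enrichment are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return sxy / math.sqrt(sxx * syy)


def keratin_associated(profiles, keratin_ids, cutoff=0.8):
    """Return the beta-keratin genes plus every gene whose profile has
    PCC >= cutoff with at least one beta-keratin gene.

    profiles: dict gene -> list of expression values across samples
    """
    selected = set(keratin_ids)
    for gene, profile in profiles.items():
        if gene in selected:
            continue
        if any(pearson(profile, profiles[k]) >= cutoff for k in keratin_ids):
            selected.add(gene)
    return selected
```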

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer TF–target gene relationships, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).
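The promoter definition depends on strand, which a short sketch can make explicit. The coordinate convention (1-based, inclusive, start site itself excluded) and the function name are assumptions for illustration only.

```python
def promoter_interval(start_site, strand, chrom_length, upstream=1000):
    """Return the 1-based inclusive (start, end) of the putative promoter,
    or None if no upstream sequence is available.

    start_site: TSS, or the translation start site if no TSS is annotated
    strand: "+" or "-"
    """
    if strand == "+":
        # Upstream means smaller coordinates on the forward strand.
        start = max(1, start_site - upstream)
        end = start_site - 1
    elif strand == "-":
        # Upstream means larger coordinates on the reverse strand.
        start = start_site + 1
        end = min(chrom_length, start_site + upstream)
    else:
        raise ValueError("strand must be '+' or '-'")
    if start > end:
        return None
    return (start, end)
```

Intervals are clipped at the chromosome ends, so a gene close to a chromosome boundary yields a promoter shorter than 1,000 bp.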

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
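As an illustration of the enrichment step, the sketch below computes a one-sided Fisher's exact test (the hypergeometric upper tail) for a motif in a module and applies the Benjamini-Hochberg adjustment across motifs. The choice of a one-sided test and of the background set is an assumption; the text does not specify either.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    k: module genes whose promoter carries the motif
    n: genes in the module
    K: background genes whose promoter carries the motif
    N: genes in the background (module included)
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A motif would then be called enriched in a module when its adjusted value falls below the chosen FDR cutoff (1e-5 in the text).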

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene was genuine, the expression of the gene encoding the TF protein was required to be correlated (|PCC| > 0.8) with that of the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF–TFBS pair a putative TF–TFBS pair (pTF–pTFBS). In summary, the first two steps provided putative TF–target relationships for β-keratin and some other genes, and putative TF–TFBS interactions.
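The correlation filter in this second step can be sketched as follows. Labeling a positively correlated TF as an activator and a negatively correlated one as a repressor is an interpretive assumption of this sketch, as are the function names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return sxy / math.sqrt(sxx * syy)


def classify_tf_target(tf_profile, target_profile, cutoff=0.8):
    """Return 'activator', 'repressor', or None for one TF-target pair.

    The pair is kept only if |PCC| exceeds the cutoff; the sign of the
    correlation decides the putative mode of regulation.
    """
    r = pearson(tf_profile, target_profile)
    if r > cutoff:
        return "activator"
    if r < -cutoff:
        return "repressor"
    return None
```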

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 10^-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-Keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within one other species, rather than assessing conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as “queries” and turkey β-keratins as “subjects.” For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on lowest E value) turkey β-keratin genes that showed similarity to a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the promoters of its five homologous turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and determining whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
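The final conservation rule (same strand, within 200 bp, in at least one homologous promoter) can be expressed compactly. The sketch assumes that motif hits have already been called and mapped to a shared alignment coordinate system; the data layout and function name are hypothetical.

```python
def is_conserved(chicken_hit, turkey_hits, max_distance=200):
    """Check whether a chicken TFBS has a matching hit in any homolog.

    chicken_hit: (position, strand) of the TFBS in the chicken promoter
    turkey_hits: dict promoter id -> list of (position, strand) hits
    A hit matches if it lies on the same strand and no more than
    max_distance bp away (assumed here to be an inclusive bound).
    """
    pos, strand = chicken_hit
    return any(
        s == strand and abs(p - pos) <= max_distance
        for hits in turkey_hits.values()
        for p, s in hits
    )
```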

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each defined by a set of genes sharing specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA-7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA-7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we did differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
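The last rule can be illustrated with a small sketch. It reads "at the same time" as "in the same feather condition", which is an interpretation rather than something the text states explicitly, and it takes as input only the changes already called significant (for instance by edgeR); all names are hypothetical.

```python
def shared_de_conditions(tf_logfc, target_logfc):
    """Conditions in which both the pTF gene and its target gene are
    differentially expressed.

    Each argument maps condition -> logFC, listing significant changes only.
    """
    return sorted(set(tf_logfc) & set(target_logfc))


def relationship_supported(tf_logfc, target_logfc):
    """True if the pair is differentially expressed in a shared condition."""
    return bool(shared_de_conditions(tf_logfc, target_logfc))


def inferred_mode(tf_logfc, target_logfc):
    """'activator' if the logFC signs agree in every shared condition,
    'repressor' if they disagree in every shared condition, else None."""
    shared = shared_de_conditions(tf_logfc, target_logfc)
    if not shared:
        return None
    same_sign = [tf_logfc[c] * target_logfc[c] > 0 for c in shared]
    if all(same_sign):
        return "activator"
    if not any(same_sign):
        return "repressor"
    return None
```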

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of recombinant TF proteins and the subsequent EMSA experiments, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and for production of the TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized with the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed the computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics. 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature. 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R package gplots, version 2.17.0.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci U S A. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the lower cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci U S A. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.


and Chr27 were not differentially expressed). While a few of the Chr25 and Chr27 differentially expressed pTFs are unique (supplementary tables S8 and S11, Supplementary Material online), others are common. The differentially expressed pTFs that commonly regulate Chr25 and Chr27 β-keratin genes include three (ENSGALG00000000233, ENSGALG00000010101, and ENSGALG00000012277) of the five over-represented TFs of the Chr25 clade, but none of the Chr27 clade. The unique differentially expressed TFs of the Chr25 clade include two over-represented TFs (ENSGALG00000023607 and ENSGALG00000002393) of the Chr25 clade, along with three other activators (ENSGALG00000003404, ENSGALG00000011055, and ENSGALG00000023420), whereas the differentially expressed unique TFs of Chr27 β-keratin genes are mostly repressors, including two (ENSGALG00000009774 and ENSGALG00000011980) of the four over-represented repressor TFs of Chr27 β-keratin genes. This implies that, for Chr25 β-keratin genes, two unique over-represented TFs, along with other unique activator TFs, effect up-regulation in EB and LB. For Chr27 β-keratin genes, two unique over-represented repressor TFs, together with other unique repressor TFs, effect down-regulation of β-keratin genes in LF, which is the barb- and barbule-less rachis portion of the feather. Interestingly, the Chr7 β-keratin gene, together with scale, claw, and a few keratinocyte β-keratin genes, showed up-regulation in EB and EF (Chr7: logFC ≥ 2.48; claw: logFC ≥ 1.87; scale: logFC ≥ 1.37; keratinocytes: logFC ≥ 1.08; P < 0.02) relative to those in LB, MF, and LF. Most pTFs of the scale, claw, keratinocyte, and Chr7 β-keratin genes, including enriched regulators of claw and scale β-keratin genes, were differentially expressed with their target genes.

Differential expression of a pTF gene with the target gene supported 262 (out of a total of 372) predicted TF–target gene relationships involving 56 of the 81 pTFs (figs. 2B and 4). Thus, compared with the overall prediction (fig. 2A), we found that, for scale, 13 putative TFs (pTFs) regulate 17 target genes; for claw, 7 pTFs regulate 9 target genes; for keratinocyte, 16 pTFs regulate 9 target genes; and for feather, 38 pTFs regulate 56 target genes. Twenty-five TFs (9 activators and 16 repressors) that mostly regulate feather β-keratin genes satisfied |PCC| > 0.8 with their target gene but were not differentially expressed together with their target gene. Based on this robust rule, we identified 56 TFs that regulate 91 β-keratin genes. We found that β-keratin genes on a chromosome or in a sub-class that formed clades have over-represented TFs. Differential expression analysis further supported 16 of the 20 predicted over-represented TFs and provided an additional 10 major TFs, skipped by the enrichment analysis, that regulate either two clades/sub-classes or a sub-clade of β-keratin genes within a clade. The major TFs predicted by enrichment analysis (for clades) and further supported by differential analysis are summarized in table 1, which includes 22 activators but only four repressors. The major regulators predicted herein are described with their unique and common tentative roles and are important candidates for functional characterization of β-keratin genes.

Table 1. Inferred major regulators of β-keratin genes with putative roles. The listed TFs are the key regulators of β-keratin genes predicted by a combined approach of clade-wise enrichment and synchronous differential expression of TF–target genes.

TF gene ID          TF family         Activator/repressor  Enriched  Diff. expressed  No. of target genes  Tentative role
ENSGALG00000007056  C2H2 ZF           activator            Yes       Yes              5                    Uniquely regulates Chr2 β-keratins
ENSGALG00000011435  Ets               activator            Yes       Yes              4
ENSGALG00000016927  C2H2 ZF           activator            Yes       Yes              4
ENSGALG00000009774  Ets               repressor            Yes       Yes              15                   Uniquely regulates Chr27 β-keratins
ENSGALG00000011980  Grainyhead        repressor            Yes       Yes              12
ENSGALG00000000233  C2H2 ZF           activator            Yes       Yes              17                   Commonly regulates Chr25 and Chr27 β-keratins
ENSGALG00000010101  C2H2 ZF           activator            Yes       Yes              21
ENSGALG00000012277  bZIP              activator            Yes       Yes              17
ENSGALG00000008014  bZIP              activator            No        Yes              13
ENSGALG00000006210  bHLH              activator            No        Yes              4
ENSGALG00000011053  Homeodomain       activator            No        Yes              10
ENSGALG00000000836  RFX               activator            No        Yes              6
ENSGALG00000023607  bHLH              activator            Yes       Yes              10                   Uniquely regulates Chr25 β-keratins
ENSGALG00000002393  AP-2              activator            Yes       Yes              13
ENSGALG00000003404  -                 activator            No        Yes              3
ENSGALG00000011055  Homeodomain       activator            No        Yes              3
ENSGALG00000001143  Ets               activator            No        Yes              1                    Uniquely regulates Chr10 β-keratins
ENSGALG00000023742  bHLH              activator            No        Yes              1
ENSGALG00000023738  Homeodomain       activator            Yes       Yes              9                    Commonly regulates scale, claw, and Chr7 β-keratins
ENSGALG00000023195  Homeodomain       activator            No        Yes              6
ENSGALG00000013342  Homeodomain       activator            Yes       Yes              6                    Uniquely regulates claw β-keratins
ENSGALG00000009129  Homeodomain       activator            Yes       Yes              4
ENSGALG00000007357  C2H2 ZF           repressor            Yes       Yes              10                   Commonly regulates scale and claw β-keratins
ENSGALG00000000316  Homeodomain       activator            Yes       Yes              5                    Uniquely regulates scale β-keratins
ENSGALG00000008250  T-box             activator            Yes       Yes              6
ENSGALG00000003759  Nuclear receptor  repressor            No        Yes              3


Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might represent an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we assume that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, were modified, whereas EF remains more similar to the EB of contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the Electrophoretic Mobility Shift Assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein, along with biotin-labeled and excess unlabeled probe, was included in the binding reaction, only a very faint TF–TFBS complex or none at all was observed, implying that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF along with a biotin-labeled mutated probe (mutated at core TFBS positions) was included in the binding reaction, no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories having a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests demonstrated the reliability of our annotated β-keratin gene regulatory network.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes in the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with combined purified TF and the labeled probe (lane b). Combined purified TF with a labeled probe and an excess (10-fold) of unlabeled probe (lane c) showed quantitative competition of probes for binding. Combined purified TF and a mutated probe (lane d) confirmed the binding specificity of the TF. "Target" denotes the gene with the predicted TFBS in its promoter. "PWM" denotes the sequence logo of the position weight matrix for the binding-site motifs.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of body contour feather and three developing conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developing conditions of feather, with 3 biological replicates per condition (supplementary fig. S1, Supplementary Material online). The two parts of contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus, galGal4; Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The missing β-keratin genes in the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, reads were also allowed to map to novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) method using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
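The FPKM calculation and the redistribution of multiple-hit reads can be illustrated with a minimal Python sketch. It is not the Cufflinks implementation; the function names, the data layout, and the equal split used when no candidate gene has uniquely mapped reads are assumptions made only for illustration.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def redistribute_multihits(unique_counts, multihit_groups):
    """Share multiple-hit reads among candidate genes.

    unique_counts: dict mapping gene -> number of uniquely mapped reads.
    multihit_groups: list of (candidate_genes, n_reads) tuples.
    Each group is split in proportion to the unique counts of its genes.
    """
    totals = {gene: float(count) for gene, count in unique_counts.items()}
    for genes, n_reads in multihit_groups:
        weight_sum = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            if weight_sum > 0:
                share = unique_counts.get(g, 0) / weight_sum
            else:
                # Assumption: split equally when no gene has unique evidence.
                share = 1 / len(genes)
            totals[g] = totals.get(g, 0.0) + n_reads * share
    return totals
```

For example, 8 reads that map equally well to two genes with 30 and 10 unique reads would be split 6 to 2 under this scheme.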

Genes with FPKM ≥ 1 in at least one of the conditions were considered "expressed" and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
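A minimal sketch of the expression filter and of upper quartile scaling toward a reference sample is given below. The helper names are hypothetical, and taking the quartile over non-zero values within each sample is an assumption of this sketch rather than a detail stated above.

```python
import statistics


def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Genes whose FPKM reaches the threshold in at least one condition."""
    return [g for g, values in fpkm_by_gene.items() if max(values) >= threshold]


def upper_quartile(values):
    """75th percentile of the non-zero values (assumed convention)."""
    nonzero = [v for v in values if v > 0]
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


def normalize_to_reference(samples, reference):
    """Scale every sample so its upper quartile matches the reference sample.

    samples: dict mapping sample name -> list of FPKM values (same gene order).
    reference: name of the reference sample, e.g. "EF3".
    """
    ref_uq = upper_quartile(samples[reference])
    normalized = {}
    for name, values in samples.items():
        factor = ref_uq / upper_quartile(values)
        normalized[name] = [v * factor for v in values]
    return normalized
```

After scaling, every sample shares the upper quartile of the reference, so expression values become comparable across the 15 transcriptomes.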

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset of β-keratin and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the "gplots" package of R (Warnes et al. 2015). The hierarchical tree was cut to obtain clusters/groups of co-expressed genes using the "hclust" function of R (R Development Core Team, https://www.r-project.org/, version 3.1.1; last accessed on April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with FDR correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster more likely to be related to keratin, we added the condition that the genes in the cluster should have keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
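The enrichment test for a functional term in a cluster can be sketched as a one-sided Fisher's exact test followed by Benjamini-Hochberg adjustment. The code below is an illustration with hypothetical function names, not the scripts used in this study; the one-sided (over-representation) form of the test is an assumption.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test (upper tail of the hypergeometric).

    k: cluster genes annotated with the term
    n: cluster size
    K: background genes annotated with the term
    N: background size
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (FDR), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, idx in enumerate(order):
        rank = m - offset  # 1-based rank in ascending order of P value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A term would then be called enriched in a cluster when both its raw P value and its adjusted value fall below 0.05, matching the thresholds stated above.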

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the 1,000-bp region upstream of the transcription start site (TSS) or, if unavailable, the translation start site, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer the TF–target gene relationship, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed on April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher's exact test was used (FDR < 1e-5).
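The idea of scanning a promoter with a position weight matrix can be sketched as follows. This is a simplified log-odds scan of the forward strand with a fixed score cutoff; it does not reproduce FIMO, which converts scores to P values and scans both strands. The uniform background, the pseudo-probability floor, and all names are assumptions of the sketch.

```python
import math


def log_odds_matrix(pwm, background=0.25):
    """Convert per-position base probabilities to log2-odds scores."""
    return [
        {base: math.log2(max(p, 1e-6) / background) for base, p in column.items()}
        for column in pwm
    ]


def scan_promoter(sequence, pwm, min_score):
    """Return (start, site, score) for forward-strand windows scoring >= min_score.

    pwm: list of dicts, one per motif position, mapping A/C/G/T to probabilities.
    """
    sequence = sequence.upper()
    scores = log_odds_matrix(pwm)
    width = len(scores)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(scores[i][base] for i, base in enumerate(window))
        if score >= min_score:
            hits.append((start, window, score))
    return hits
```

In practice the cutoff would be derived from a P value threshold such as the 1e-4 used above rather than from a fixed score.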

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is true, the gene coding for the TF protein was required to be correlated (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships of β-keratin and some other genes, and putative TF–TFBS interactions.
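The correlation filter of the second step can be sketched in a few lines of Python. The function names and the data layout are hypothetical, and recording the sign of the correlation as a hint toward activation or repression is an interpretive assumption of this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def putative_pairs(candidates, expression, cutoff=0.8):
    """Keep TF-target pairs whose expression profiles satisfy |PCC| > cutoff.

    candidates: iterable of (tf_gene, target_gene) with a motif hit.
    expression: dict mapping gene -> expression values across samples.
    """
    kept = []
    for tf, target in candidates:
        r = pearson(expression[tf], expression[target])
        if abs(r) > cutoff:
            sign = "positive" if r > 0 else "negative"
            kept.append((tf, target, r, sign))
    return kept
```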

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 1e-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property is useful for assessing the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within another species, rather than assessing the conservation among several species. Therefore, we determined the homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as "queries" and turkey β-keratins as "subjects." For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on the lowest E value) turkey β-keratin genes that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and its five homologous turkey β-keratin promoters using MUSCLE version 3.8.31 (Edgar 2004) and determining whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
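The final conservation rule (a matching site in at least one homologous promoter, on the same strand and within 200 bp) can be expressed as a small predicate. The data layout, the use of aligned promoter coordinates, and the inclusive distance cutoff are assumptions of this sketch.

```python
def is_conserved(chicken_hit, homolog_hits, max_distance=200):
    """Check whether a chicken TFBS hit has a counterpart in any homolog.

    chicken_hit: (position, strand) of the site in the chicken promoter.
    homolog_hits: dict mapping homologous promoter name -> list of
        (position, strand) hits for the same motif.
    Positions are assumed to be given in aligned promoter coordinates.
    """
    position, strand = chicken_hit
    return any(
        hit_strand == strand and abs(hit_position - position) <= max_distance
        for hits in homolog_hits.values()
        for hit_position, hit_strand in hits
    )
```

A site at position 350 on the plus strand would, for instance, count as conserved if any of the five turkey promoters carried a plus-strand hit between positions 150 and 550.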

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes having common specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we performed differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.
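The rule that a pTF and its target must be differentially expressed at the same time can be sketched as a set intersection over comparisons. Reading "at the same time" as "in at least one shared comparison between feather regions" is an assumption of this sketch, and the names and contrast labels are hypothetical.

```python
def supported_pairs(pairs, de_contrasts):
    """Keep pTF-target pairs that are differentially expressed together.

    pairs: iterable of (tf_gene, target_gene).
    de_contrasts: dict mapping gene -> set of comparisons (e.g. "EB_vs_LB")
        in which that gene was called differentially expressed.
    """
    kept = []
    for tf, target in pairs:
        shared = de_contrasts.get(tf, set()) & de_contrasts.get(target, set())
        if shared:
            kept.append((tf, target, sorted(shared)))
    return kept
```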

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of recombinant TF proteins and the subsequent EMSA experiments, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and for production of the TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are the wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized with the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed the computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R package gplots, version 2.17.0.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P Ng CS Yan J Lai YC Chen CK Lai YT Wu SM Chen JJ Luo WWidelitz RB et al 2015 Topographical mapping of alpha- and beta-

Regulatory Divergence among Beta-Keratin Genes during Bird Evolution doi101093molbevmsw165 MBE

2779

keratins on developing chicken skin integuments functional inter-action and evolutionary perspectives Proc Natl Acad Sci USA112E6770ndashE6779

Xiao L Wang J Li J Chen X Xu P Sun S He D Cong Y Zhai Y 2015RORalpha inhibits adipocyte-conditioned medium-induced colorec-tal cancer cell proliferation and migration and chick embryo cho-rioallantoic membrane angiopoiesis Am J Physiol Cell Physiol308C385ndashC396

Xu X Wang K Zhang K Ma Q Xing L Sullivan C Hu D Cheng S Wang S2012 A gigantic feathered dinosaur from the lower cretaceous ofChina Nature 48492ndash95

Xu X You H Du K Han F 2011 An Archaeopteryx-like theropod fromChina and the origin of Avialae Nature 475465ndash470

Xu X Zheng X Sullivan C Wang X Xing L Wang Y Zhang X OrsquoConnorJK Zhang F Pan Y 2015 A bizarre Jurassic maniraptoran theropodwith preserved evidence of membranous wings Nature 52170ndash73

Xu X Zhou Z Wang X Kuang X Zhang F Du X 2003 Four-wingeddinosaurs from China Nature 421335ndash340

Yu CP Chen SC Chang YM Liu WY Lin HH Lin JJ Chen HJ Lu YJWu YH Lu MY et al 2015 Transcriptome dynamics of devel-oping maize leaves and genomewide prediction of cis elementsand their cognate transcription factors Proc Natl Acad Sci USA112E2477ndashE2486

Zhang F Zhou Z 2000 A primitive enantiornithine bird and the origin offeathers Science 2901955ndash1959

Zhang G Li C Li Q Li B Larkin DM Lee C Storz JF Antunes AGreenwold MJ Meredith RW et al 2014 Comparative genomicsreveals insights into avian genome evolution and adaptation Science3461311ndash1320

Zhang HM Chen H Liu W Liu H Gong J Wang H Guo AY 2012AnimalTFDB a comprehensive animal transcription factor databaseNucleic Acids Res 40D144ndashD149

Bhattacharjee et al doi101093molbevmsw165 MBE

2780


Feather β-keratin genes on a chromosome tend to be recent tandem duplications or to be undergoing concerted evolution (e.g., gene conversion). Thus, feather β-keratin genes on the same chromosome tend to be more similar in sequence and regulation than feather β-keratin genes on different chromosomes. This pattern of chromosome-wise differential expression of feather β-keratins is likely a major cause of the structural differences among different parts of feathers. The up-regulation of Chr2 and Chr25 β-keratin genes in MF and LF of flight feather and in EB and LB of contour feather, respectively, was noted in a previous study (Ng et al. 2014). We additionally found that Chr27 β-keratin genes were down-regulated in the barb- and barbule-less rachis portion (LF) of flight feather. Most interestingly, we found that the ancestral scale and claw β-keratin genes, together with the basal Chr7 feather β-keratin gene, were significantly up-regulated in EB and EF. As the paleontological record from nonavian dinosaurs strongly suggests that pennaceous feathers arose prior to flight (Xu et al. 2003, 2011; Hu et al. 2009), EB might be an early-evolved feather structure. The other parts of the feather might have evolved by expansion and divergence of β-keratin genes on different chromosomes, with the recruitment of specific regulators. As EF is more similar to EB, we assume that the flight feather is a modified contour feather in which the distal portions, i.e., MF and LF, were modified, whereas EF remains more similar to the EB of the contour feather.

Experimental Validation of Predicted TF–Target Gene Relationships

We used the electrophoretic mobility shift assay (EMSA) to verify the predicted TF–target gene relationships for 7 TFs and 14 cognate TFBSs (from target genes) (supplementary table S12, Supplementary Material online). Nine representative cases are shown in figure 5. In each case, in the absence of the cognate TF protein, only the fast-migrating free biotin-labeled TFBS probe was observed. When the purified TF protein was included in the binding reaction, a strong TF–TFBS complex was observed as a slowly migrating band, indicating interaction between the TF protein and the corresponding promoter sequence from the target gene. When purified TF protein was included together with the biotin-labeled probe and an excess of unlabeled probe, very little or no TF–TFBS complex was observed, indicating that the excess unlabeled probe competed with the labeled probe for binding to the TF protein. When a purified TF was included together with a biotin-labeled mutated probe (mutated at core TFBS positions), no shifted TF–TFBS complex was observed, indicating no interaction between the TF protein and the mutated probe. The seven selected TFs and their cognate TFBSs were representatives of each of the predicted categories with a unique role, including the important regulatory case (Iroquois homeobox 6, IRX6) connecting the ancestral scale and claw β-keratins with the basal feather β-keratin (Chr7, ENSGALG00000006133). Our EMSA tests demonstrated the reliability of our annotated β-keratin gene regulatory network.

FIG. 5. Assays of the binding of a transcription factor to the cis-elements in its target genes by EMSA. (A–I) TF proteins purified from Escherichia coli were incubated with wild-type (TFBS) and mutated (mTFBS) cis-element probes from the target gene for EMSA. EMSA was conducted with the labeled probe alone (lane a) or with combined purified TF and the labeled probe (lane b). Combined purified TF with a labeled probe and an excess (10-fold) of unlabeled probe (lane c) showed quantitative competition of probes for binding. Combined purified TF and a mutated probe (lane d) confirmed the binding specificity of the TF. “Target” denotes the gene with the predicted TFBS in its promoter. “PWM” denotes the sequence logo of the position weight matrix for the binding-site motifs.

Concluding Remarks

Gene duplication provides raw materials for the evolution of new functions or traits (Ohno 1970). However, how the regulatory machinery changes along with the evolution of duplicated genes remains largely unknown. In this study, we have characterized regulatory changes in duplicated β-keratin genes. We observed a β-keratin gene on Chr7 that was originally derived from an ancestral scale or claw β-keratin gene, likely located on Chr25, but has evolved to be a feather β-keratin. This ancestral β-keratin on Chr7 was duplicated to other chromosomes. Subsequently, there were more duplications or translocations of feather β-keratin genes to several other chromosomes (Greenwold et al. 2014; Ng et al. 2014), including Chr25. While the Chr7 β-keratin gene still shares a TF with scale and claw β-keratin genes, the β-keratin genes on other chromosomes have recruited distinct TFs and diverged in regulation. In total, we identified 81 pTFs, which regulate 91 β-keratin genes. Among the 81 pTFs, 56 TFs showed differential expression with their target genes. Finally, 26 TFs were identified as major regulators, each regulating multiple β-keratin genes. Our data showed that the up- and down-regulation of a β-keratin gene in a feather region correlates well with the up- and down-regulation of its putative TF genes. From these observations, we propose that the regulatory divergence among β-keratin genes is largely responsible for the diversification of feathers in different body parts of a bird.

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of body contour feather and three developing conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the 5 developing conditions of feather, with 3 biological replicates per condition (supplementary fig. S1, Supplementary Material online). The two parts of the contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of the flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing in the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, mapping of reads was also allowed in novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation Gallus_gallus.galGal4.77.gtf. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) policy using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For the FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
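The two-stage read assignment described above can be illustrated with a minimal sketch. It is not the Cufflinks implementation: the function names are hypothetical, multi-hit reads are weighted by raw unique read counts (one possible reading of "relative abundance"), and the total number of assigned fragments is used as the library size.

```python
def redistribute_multireads(unique_counts, multireads):
    """Add each multi-hit read fractionally to its candidate genes.

    unique_counts: dict gene -> number of uniquely mapped reads
    multireads: iterable of tuples, each listing the genes one read maps to
    Assumption: weights are proportional to the unique counts of the candidates.
    """
    counts = {gene: float(c) for gene, c in unique_counts.items()}
    for genes in multireads:
        total = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            # Fall back to an even split when no candidate has unique reads.
            weight = unique_counts.get(g, 0) / total if total > 0 else 1.0 / len(genes)
            counts[g] = counts.get(g, 0.0) + weight
    return counts


def fpkm(counts, exon_lengths_bp):
    """Fragments per kilobase of exon per million mapped fragments."""
    total_mapped = sum(counts.values())
    return {
        gene: c * 1e9 / (exon_lengths_bp[gene] * total_mapped)
        for gene, c in counts.items()
    }
```

For instance, with 30 and 10 unique reads on two genes, each shared read contributes 0.75 and 0.25 of a read to them, respectively.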

Genes with FPKM ≥ 1 in at least one of the conditions were considered “expressed” and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
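As a rough illustration of the filtering and normalization step, the sketch below scales every sample so that its upper quartile matches that of a reference sample. The helper names are hypothetical, the quartile is taken over the nonzero values of each sample (a simplification of the procedure of Bullard et al. 2010), and the expression filter is applied per sample rather than per condition.

```python
import statistics


def upper_quartile(values):
    """75th percentile of the nonzero expression values of one sample."""
    nonzero = [v for v in values if v > 0]
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


def normalize_to_reference(samples, reference):
    """Scale each sample so its upper quartile equals that of the reference.

    samples: dict sample name -> dict gene -> FPKM
    reference: name of the reference sample (e.g., "EF3")
    """
    ref_uq = upper_quartile(samples[reference].values())
    normalized = {}
    for name, expr in samples.items():
        factor = ref_uq / upper_quartile(expr.values())
        normalized[name] = {gene: v * factor for gene, v in expr.items()}
    return normalized


def expressed_genes(samples, threshold=1.0):
    """Genes reaching the FPKM threshold in at least one sample."""
    genes = set()
    for expr in samples.values():
        genes.update(g for g, v in expr.items() if v >= threshold)
    return genes
```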

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset of β-keratin and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the “gplots” package of R (Warnes et al. 2015). The hierarchical cluster was cleaved to get clusters/groups of co-expressed genes using the “hclust” function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher’s exact test with false discovery rate correction (Benjamini and Hochberg 1995) to examine the functional categories that were enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster more likely to be related to keratin, we added the condition that the genes in the cluster should have the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) with the lowest P value and FDR in Fisher’s exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
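The FDR threshold on the GO enrichment P values relies on the Benjamini and Hochberg (1995) procedure. A minimal sketch of the adjustment is given below; the function name is hypothetical, and the input is assumed to be one P value per tested GO term.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be retained when both its raw P value and its adjusted value fall below 0.05.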

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer TF–target gene relationships, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).
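The promoter definition can be made concrete with a small strand-aware helper. This is only a sketch: the function name is hypothetical, coordinates are assumed to be 1-based and inclusive, the start site itself is excluded from the window, and clipping at the chromosome end is not handled.

```python
def promoter_interval(start_site, strand, upstream=1000):
    """Return (start, end) of the window upstream of a start site.

    start_site: 1-based position of the TSS (or translation start site)
    strand: "+" or "-"
    """
    if strand == "+":
        # Upstream lies at smaller coordinates; clip at the chromosome start.
        return (max(1, start_site - upstream), start_site - 1)
    if strand == "-":
        # Upstream of a minus-strand gene lies at larger coordinates.
        return (start_site + 1, start_site + upstream)
    raise ValueError("strand must be '+' or '-'")
```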

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly co-expressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each co-expressed module, Fisher’s exact test was used (FDR < 1e-5).
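The motif enrichment test can be sketched as a one-sided Fisher's exact test on a 2 × 2 table of genes with or without a motif hit, inside or outside a module. The sidedness and the choice of background are assumptions, since the text does not specify them, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation.

    N: number of background genes
    K: background genes whose promoter contains the motif
    n: genes in the co-expressed module
    k: module genes whose promoter contains the motif
    Returns P(X >= k) for X drawn from Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

The resulting P values, one per motif and module, would then be corrected for multiple testing before applying the FDR cutoff.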

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is true, the gene encoding the TF protein was required to be correlated in expression (|PCC| > 0.8) with the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships of β-keratin and some other genes and putative TF–TFBS interactions.
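The correlation criterion of this second step might be applied as in the sketch below, which keeps a candidate TF–target pair only when the absolute PCC of the two expression profiles exceeds the cutoff. The function and variable names are hypothetical, and each profile is assumed to be a list of normalized expression values in the same sample order.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def putative_pairs(candidates, expression, cutoff=0.8):
    """Keep (TF, target) pairs whose profiles satisfy |PCC| > cutoff."""
    kept = []
    for tf, target in candidates:
        r = pearson(expression[tf], expression[target])
        if abs(r) > cutoff:
            kept.append((tf, target, r))
    return kept
```

Both strongly positive and strongly negative correlations pass the filter, which matches the use of an absolute value in the criterion.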

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014) but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 1e-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess the conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within another species rather than assessing the conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP with chicken β-keratins as “queries” and turkey β-keratins as “subjects.” For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on lowest E value) turkey β-keratin genes that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the five homologous promoters of turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and finding whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
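The final conservation rule can be expressed as a simple check over motif hits. In this sketch the function name is hypothetical, each hit is a (position, strand) pair that has already passed the FIMO threshold, and positions are assumed to be given in a shared coordinate system (for example, alignment columns), with "within 200 bp" read as a distance of at most 200.

```python
def is_conserved(chicken_hit, turkey_hits_by_promoter, max_distance=200):
    """True if any homologous promoter has a same-strand hit nearby.

    chicken_hit: (position, strand) of the TFBS in the chicken promoter
    turkey_hits_by_promoter: list of hit lists, one per turkey promoter
    """
    position, strand = chicken_hit
    for hits in turkey_hits_by_promoter:
        for t_position, t_strand in hits:
            if t_strand == strand and abs(t_position - position) <= max_distance:
                return True
    return False
```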

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes having common specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs among the pTF–target gene category to which they belonged. To further confirm our predictions, we did differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We considered a pTF–target gene relationship true if both genes showed differential expression at the same time.

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol in a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and production of TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond N+ membranes, and visualized by the BioSpectrum imaging system (UVP).

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.

Bhattacharjee et al. · doi:10.1093/molbev/msw165 · MBE

References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy Team. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. gplots: various R programming tools for plotting data. R package version 2.17.0.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock’s train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci USA. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the Lower Cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O’Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci USA. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.



originally derived from an ancestral scale or claw b-keratingene likely located on Chr25 but has evolved to be a featherb-keratin This ancestral b-keratin on Chr7 was duplicatedto other chromosomes Subsequently there were more du-plications or translocations of feather b-keratin genes to sev-eral other chromosomes (Greenwold et al 2014 Ng et al2014) including Chr25 While the Chr7 b-keratin gene stillshares a TF with scale and claw b-keratin genes the b-keratingenes in other chromosomes have recruited distinct TFs anddiverged in regulation In total we identified 81 pTFs whichregulate 91 b-keratin genes Among the 81 pTFs 56 TFsshowed differential expression with their target genesFinally 26 TFs were identified as major regulators each reg-ulating multiple b-keratin genes Our data showed that theup- and down-regulation of a b-keratin gene in a featherregion correlates well with the up- and down-regulation ofits putative TF genes From these observations we proposethat the regulatory divergence among b-keratin genesis largely responsible for the diversification of feathers in dif-ferent body parts of a bird

Materials and Methods

Transcriptome Analysis and Construction of Gene Co-Expression Modules

We used the transcriptome data obtained previously from two developing conditions of body contour feather and three developing conditions of flight feather (Ng et al. 2014). In total, 15 transcriptomes were available for the five developing conditions, each with three biological replicates (supplementary fig. S1, Supplementary Material online). The two parts of the contour feather were defined as early-body (EB) and late-body (LB), whereas the three parts of the flight feather were defined as early-flight (EF), middle-flight (MF), and late-flight (LF). The transcriptomes from these five feather regions were re-analyzed using the current chicken genome assembly (Gallus gallus galGal4, Ensembl release 77) and gene annotation (Gallus_gallus.galGal4.77.gtf); the previous analysis was based on release 72. For each transcriptome, low-quality reads were trimmed and adapters were removed using Trimmomatic version 0.32 (Bolger et al. 2014). The processed paired-end reads were mapped to the chicken genome using TopHat version 2.0.13 (Trapnell et al. 2009), allowing mismatches = 2 and maximum multi-hits = 5, with Gallus_gallus.galGal4.77.gtf as the reference gene annotation. The β-keratin genes missing in the previous annotation (Ng et al. 2014) were manually updated in Gallus_gallus.galGal4.77.gtf. Moreover, reads were also allowed to map to novel regions beyond the transcriptome index built from the supplied Ensembl gene annotation. The alignment quality and distribution of the reads were estimated using SAMtools version 2.0.0 (Li et al. 2009). From the aligned reads, the transcripts were assembled by the reference-guided (RABT) policy using Cufflinks version 2.1.1 (Trapnell et al. 2013). In addition, de novo transcript assembly was performed to cross-validate the RABT-assembled transcripts. The gene expression level was calculated in terms of fragments per kilobase of exon per million mapped reads (FPKM). For FPKM calculation, uniquely mapped reads were first assigned to the genes. Multiple-hit reads were then redistributed to genes based on their relative abundance of uniquely mapped reads.
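The FPKM calculation described above can be illustrated with a minimal sketch. It is not the Cufflinks implementation: the function names are hypothetical, and we assume that each multiple-hit read is split among its candidate genes in proportion to their uniquely mapped fragment counts.

```python
def redistribute_multihits(unique_counts, multihit_reads):
    """Add multi-hit reads to per-gene counts.

    unique_counts: dict gene -> number of uniquely mapped fragments.
    multihit_reads: list of tuples, each holding the candidate genes of one read.
    Assumption: a read is split in proportion to the unique counts of its
    candidate genes (evenly if none of them has unique support).
    """
    totals = {gene: float(count) for gene, count in unique_counts.items()}
    for genes in multihit_reads:
        weight_sum = sum(unique_counts.get(g, 0) for g in genes)
        for g in genes:
            if weight_sum > 0:
                share = unique_counts.get(g, 0) / weight_sum
            else:
                share = 1.0 / len(genes)
            totals[g] = totals.get(g, 0.0) + share
    return totals


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


counts = redistribute_multihits({"A": 30, "B": 10}, [("A", "B")] * 4)
print(counts)                               # gene A receives 3 of the 4 shared reads
print(fpkm(counts["A"], 1000, 1_000_000))
```

Weighting by unique counts keeps a lowly expressed paralog from absorbing reads that more plausibly come from a highly expressed one, which matters for a multigene family such as the β-keratins.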

Genes with FPKM ≥ 1 in at least one of the conditions were considered “expressed” and selected for further analysis. To compare expression of the selected genes across the conditions, we applied the upper quartile normalization procedure (Bullard et al. 2010). The early-flight replicate 3 (EF3) transcriptome was used as the reference for normalization because its FPKM values were the most evenly distributed among the 15 transcriptomes.
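The filtering and normalization steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the upper quartile is taken over nonzero values, each sample is rescaled so that its upper quartile matches that of the reference sample, and the threshold is applied per sample rather than per condition. All function names are hypothetical.

```python
from statistics import quantiles


def upper_quartile(values):
    """75th percentile of the nonzero values (needs at least two of them)."""
    nonzero = [v for v in values if v > 0]
    return quantiles(nonzero, n=4)[2]


def normalize_to_reference(expr, reference):
    """expr: dict sample -> dict gene -> FPKM; reference: a sample name."""
    ref_uq = upper_quartile(expr[reference].values())
    normalized = {}
    for sample, profile in expr.items():
        factor = ref_uq / upper_quartile(profile.values())
        normalized[sample] = {g: v * factor for g, v in profile.items()}
    return normalized


def expressed_genes(expr, threshold=1.0):
    """Genes reaching the FPKM threshold in at least one sample."""
    genes = set()
    for profile in expr.values():
        genes.update(g for g, v in profile.items() if v >= threshold)
    return genes


expr = {
    "EF3": {"g1": 1.0, "g2": 2.0, "g3": 4.0, "g4": 8.0},
    "EB1": {"g1": 2.0, "g2": 4.0, "g3": 8.0, "g4": 16.0},
}
print(normalize_to_reference(expr, "EF3")["EB1"])
```

In this toy input, EB1 is exactly twice EF3, so scaling to the reference brings the two profiles into agreement.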

The Pearson correlation coefficient (PCC) between the expression profiles of the genes was used as the distance metric for gene expression differences. To obtain β-keratin-associated co-expression clusters, we applied two steps. First, we extracted a subset of β-keratin and associated genes that have PCC ≥ 0.8 with the β-keratin genes. Second, the genes in the subset were clustered based on their expression profiles by hierarchical clustering with complete linkage, using the heatmap.2 function in the “gplots” package of R (Warnes et al. 2015). The hierarchical cluster was cleaved to obtain clusters/groups of co-expressed genes using the “hclust” function of R (R Development Core Team, https://www.r-project.org, version 3.1.1; last accessed April 17, 2016). Because co-expression does not always imply co-regulation, we added the condition that the genes in a cluster should belong to the same functional category (same GO term). The Gene Ontology (GO) annotation of the chicken genome was downloaded from Ensembl (release 77) via BioMart (Smedley et al. 2015). We performed Fisher's exact test with multiple-testing correction (Benjamini and Hochberg 1995) to examine the functional categories enriched in different clusters (P value < 0.05, FDR < 0.05). For a cluster to be considered keratin related, we added the condition that the keratin-related functional terms (keratin, feather keratin, claw keratin, scale keratin, keratinocyte, intermediate filament, structural constituent of cytoskeleton) should have the lowest P value and FDR in Fisher's exact test. A cluster of co-expressed genes with significant functional enrichment was termed a co-expressed module.
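The first step, extracting genes correlated with β-keratin genes, can be sketched as below. The gene names and profiles are made up, and the function names are hypothetical; the subsequent hierarchical clustering, done in R in this study, is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def keratin_associated(profiles, keratin_genes, cutoff=0.8):
    """Keratin genes plus every gene with PCC >= cutoff to at least one of them."""
    selected = set(keratin_genes)
    for gene, profile in profiles.items():
        if gene in selected:
            continue
        if any(pearson(profile, profiles[k]) >= cutoff for k in keratin_genes):
            selected.add(gene)
    return selected


profiles = {
    "K1": [1, 2, 3, 4, 5],   # hypothetical beta-keratin gene
    "A": [2, 4, 6, 8, 10],   # co-expressed with K1
    "B": [5, 4, 3, 2, 1],    # anti-correlated with K1
    "C": [1, 1, 1, 1, 1],    # flat profile
}
print(sorted(keratin_associated(profiles, ["K1"])))
```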

Discovering TF Binding Motifs and Inferring TF–Target Gene Relationships

The putative promoter region was defined as the region from the transcription start site (TSS) or, if unavailable, the translation start site to 1,000 bp upstream, as most of the regulatory motifs are concentrated in this region. To identify whether a TFBS exists in the promoter of a gene and to infer TF–target gene relationships, our method includes three steps (supplementary figs. S4 and S5, Supplementary Material online).
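The promoter definition is strand dependent, which a short sketch makes explicit. We assume 1-based inclusive coordinates and exclude the start site itself from the promoter; both are illustrative choices not specified in the text, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def promoter_interval(start_site, strand, length=1000):
    """1-based inclusive interval of the region upstream of a start site."""
    if strand == "+":
        return max(1, start_site - length), start_site - 1
    if strand == "-":
        return start_site + 1, start_site + length
    raise ValueError("strand must be '+' or '-'")


def promoter_sequence(chrom_seq, start_site, strand, length=1000):
    """Promoter sequence read 5'->3' on the strand of the gene."""
    lo, hi = promoter_interval(start_site, strand, length)
    region = chrom_seq[lo - 1:hi]  # slicing truncates at the chromosome end
    if strand == "-":
        region = region.translate(COMPLEMENT)[::-1]
    return region


print(promoter_interval(5000, "+"))
print(promoter_interval(5000, "-"))
print(promoter_sequence("AAACCCGGGTTT", 7, "+", length=3))
```

For a gene on the minus strand, "upstream" lies at higher genomic coordinates, and the sequence has to be reverse-complemented before motif scanning so that strand calls remain comparable across genes.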

First, for motif discovery, we used the known or predicted TF–TFBS pairs of Gallus gallus from the CIS-BP database (cisbp.ccbr.utoronto.ca, v1.01; last accessed April 14, 2016) (Weirauch et al. 2014). Each TFBS, in the form of a position weight matrix, was used to scan the promoter regions of a set of strongly coexpressed genes (obtained by the preceding method) using FIMO (PMID: 21330290) in the MEME suite with P value < 1e-4. We assumed that genes in a co-expressed module with significant functional enrichment are co-regulated at the transcriptional level and that their promoter regions might contain common binding motifs for the TF. For the motif enrichment analysis in each coexpressed module, Fisher's exact test was used (FDR < 1e-5).
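The enrichment test can be sketched with the hypergeometric form of Fisher's exact test followed by Benjamini-Hochberg adjustment. The one-sided alternative (over-representation in the module) and the choice of background gene set are assumptions on our part; the function names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided P value for motif over-representation in a module.

    k: module genes whose promoter carries the motif
    n: module size
    K: background genes whose promoter carries the motif
    N: background size
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """FDR-adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 3 of 3 module genes carry a motif found in 4 of 10 genes.
print(fisher_enrichment_p(3, 3, 4, 10))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

A motif would then be retained for a module only if its adjusted value falls below the chosen FDR cutoff (1e-5 in this study).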

Second, the TFs corresponding to the enriched motifs were extracted from the TF–TFBS pairs of Gallus gallus in the CIS-BP database. To verify that an over-represented motif found in the promoter of a gene is true, the expression of the gene encoding the TF protein was required to be correlated (|PCC| > 0.8) with that of the gene whose promoter has the TFBS of the TF. If this condition was satisfied, we called the TF/TFBS a putative TF/TFBS (pTF/pTFBS). In summary, the first two steps provided putative TF–target relationships of β-keratin and some other genes, and putative TF–TFBS interactions.
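The second step joins motif hits with expression correlation. A minimal sketch follows, with hypothetical gene names and function names; the absolute value keeps both positively and negatively correlated TF–target pairs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def putative_pairs(motif_hits, profiles, cutoff=0.8):
    """Keep (TF, target, PCC) triples with a motif hit and |PCC| > cutoff.

    motif_hits: dict TF gene -> set of genes whose promoter carries its TFBS.
    profiles:   dict gene -> expression profile across samples.
    """
    pairs = []
    for tf, targets in motif_hits.items():
        if tf not in profiles:
            continue  # TF not expressed or not measured
        for target in sorted(targets):
            if target not in profiles:
                continue
            r = pearson(profiles[tf], profiles[target])
            if abs(r) > cutoff:
                pairs.append((tf, target, round(r, 3)))
    return pairs


profiles = {
    "TF1": [1, 2, 3, 4, 5],
    "K1": [2, 4, 6, 8, 10],  # rises with TF1
    "K2": [5, 4, 3, 2, 1],   # falls as TF1 rises
    "K3": [1, 3, 2, 3, 1],   # uncorrelated with TF1
}
print(putative_pairs({"TF1": {"K1", "K2", "K3"}}, profiles))
```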

Third, the conservation of a putative motif (TFBS) was assessed by finding the homologous sequences of chicken β-keratin genes in turkey (Meleagris gallopavo). For this, we annotated the β-keratin genes in turkey via a search method similar to that of Greenwold et al. (2014), but with some modifications. First, the chicken protein sequences encoded by all of the expressed β-keratin genes were used as queries to search against the turkey genome (release 5.0) by TBLASTN (Altschul et al. 1990; Mount 2007), and the hits with similarity > 70% (E value < 10^-5) were selected. For each genomic region with significant hits, redundant hits that completely coincided with other hits were filtered out. For the remaining hits, the nucleotide sequences of the aligned region, together with 100 bp upstream and downstream of it, were retrieved using Galaxy (Goecks et al. 2010), and GeneWise (Birney et al. 2004) was used to verify the gene structure. Regions that contained an early stop codon (< 80 amino acids) were considered pseudogenes. β-keratin genes are expressed and function as clusters, and therefore several groups have common regulation and over-represented motifs. This property makes it possible to assess conservation of a β-keratin motif by finding a group of orthologous (homologous) β-keratin genes within another species, rather than assessing conservation among several species. Therefore, we identified homologous relationships of chicken β-keratin genes in turkey using BLASTP, with chicken β-keratins as “queries” and turkey β-keratins as “subjects.” For each chicken β-keratin, hits with similarity > 70% (E value < 1e-10) were considered significant. Almost 90% of turkey β-keratins showed significant hits with E value < 1e-25, and we chose the top five (based on lowest E value) turkey β-keratin genes that showed similarity with a chicken β-keratin gene. The conservation of a TFBS was tested by aligning the chicken β-keratin promoter and the promoters of its five homologous turkey β-keratin genes using MUSCLE version 3.8.31 (Edgar 2004) and finding whether the TFBS also appeared (FIMO P value < 1e-4) in at least one of them on the same strand within a distance of 200 bp.
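The final conservation criterion reduces to a simple positional check, sketched below. We assume that motif positions in the chicken and turkey promoters are already expressed in a common coordinate system (for example, alignment columns or distance to the start site); the function name and data layout are hypothetical.

```python
def is_conserved(chicken_hit, turkey_hits_by_promoter, max_distance=200):
    """True if any homologous promoter has the motif on the same strand nearby.

    chicken_hit: (position, strand) of the motif in the chicken promoter.
    turkey_hits_by_promoter: one list of (position, strand) hits per
        homologous turkey promoter (up to five in this study).
    """
    position, strand = chicken_hit
    for hits in turkey_hits_by_promoter:
        for t_position, t_strand in hits:
            if t_strand == strand and abs(t_position - position) <= max_distance:
                return True
    return False


turkey = [
    [(-850, "+")],              # same strand, but too far away
    [(-310, "-")],              # close, but on the opposite strand
    [(-120, "+"), (-40, "-")],  # same strand, within 200 bp
]
print(is_conserved((-250, "+"), turkey))
```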

The pTF–target gene relationships were visualized with the software Cytoscape (Shannon et al. 2003) and categorized into groups, each consisting of a set of genes having common specific regulators distinct from those of other groups.

The phylogenetic analysis of the β-keratin genes was done using the maximum likelihood method in MEGA-7 (Kumar et al. 2016). The best model for phylogenetic analysis was found through the model test option of MEGA-7 using the Akaike and Bayesian information criteria (AIC and BIC). The phylogenetic tree was then compared with the pTF–target gene categories to understand the regulatory transitions associated with the evolution of β-keratin genes. The major putative regulators were found through enrichment analysis of the TFs within the pTF–target gene category to which they belonged. To further confirm our predictions, we did differential expression analysis of the pTF genes and their target genes at different developing conditions of the feather using the R package edgeR (Robinson et al. 2010; McCarthy et al. 2012). We consider a pTF–target gene relationship true if both genes show differential expression at the same time.
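The last criterion can be written as a set intersection. In this sketch we read "at the same time" as "in the same pairwise comparison between feather regions", which is our assumption; the comparison labels, gene names, and function name are hypothetical, and the differential expression calls themselves would come from edgeR.

```python
def supported_pairs(pairs, de_calls):
    """Keep TF-target pairs that are both differentially expressed together.

    pairs: iterable of (tf, target) gene names.
    de_calls: dict gene -> set of comparisons in which the gene was called
        differentially expressed.
    """
    supported = []
    for tf, target in pairs:
        shared = de_calls.get(tf, set()) & de_calls.get(target, set())
        if shared:
            supported.append((tf, target, sorted(shared)))
    return supported


de_calls = {
    "TF1": {"EB_vs_LB", "EF_vs_LF"},
    "K1": {"EF_vs_LF"},
    "K2": {"EF_vs_MF"},
}
print(supported_pairs([("TF1", "K1"), ("TF1", "K2")], de_calls))
```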

EMSA Assay to Validate TF–TFBS Binding

EMSA was conducted to validate the predicted TF–TFBS relationships. For the production of a recombinant TF protein and the subsequent EMSA experiment, we followed the protocol of a previous study (Yu et al. 2015). The primers used for TF amplification/cloning and production of the TFBS probes are given in supplementary table S13, Supplementary Material online (sheets 1 and 2). For the EMSA assay between a TF and the TFBS present in the promoter of the putative target gene, we designed three types of probes. The first two types are wild-type biotin-labeled and unlabeled probes, respectively. The third type is the biotin-labeled mutated probe, designed by modifying conserved nucleotides using the position weight matrix of the TFBS (corresponding to a particular TF) as the reference. The EMSA mixtures of TF and TFBS were separated on a 3.75% native polyacrylamide gel, transferred to Hybond-N+ membranes, and visualized with the BioSpectrum imaging system (UVP).
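One way to pick the nucleotides to mutate is to rank motif positions by information content and replace the most conserved ones. The sketch below is only an illustration of that idea: the text does not state which positions or substitutions were used, so the ranking rule, the choice of the least favored base, and the number of mutated positions are all assumptions, and the function names are hypothetical.

```python
from math import log2


def information_content(column):
    """Information content (bits) of one PWM column of base probabilities."""
    return 2.0 + sum(p * log2(p) for p in column.values() if p > 0)


def mutate_site(site, pwm, n_mut=2):
    """Replace the n_mut most conserved positions with the least favored base."""
    if len(site) != len(pwm):
        raise ValueError("site and PWM must have the same length")
    ranked = sorted(range(len(pwm)),
                    key=lambda i: information_content(pwm[i]),
                    reverse=True)
    mutated = list(site)
    for i in ranked[:n_mut]:
        column = pwm[i]
        # never "mutate" a position to the base it already has
        candidates = [b for b in "ACGT" if b != site[i]]
        mutated[i] = min(candidates, key=lambda b: column.get(b, 0.0))
    return "".join(mutated)


pwm = [
    {"A": 0.97, "C": 0.015, "G": 0.01, "T": 0.005},  # strongly conserved A
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},    # uninformative position
    {"A": 0.05, "C": 0.03, "G": 0.90, "T": 0.02},    # conserved G
]
print(mutate_site("ACG", pwm, n_mut=2))
```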

Supplementary Material

Supplementary tables S1–S13 and figures S1–S6 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 104-2621-B-001-003-MY3). M.J.B. was supported by a postdoctoral fellowship of Academia Sinica. M.J.B. and W.H.L. conceived this study. M.J.B., C.P.Y., J.J.L., and W.H.L. designed the experiments and performed computational analysis. M.J.B. carried out the EMSA experiments. C.S.N., T.Y.W., and H.H.L. offered technical consultation and professional suggestions. M.J.B., W.H.L., and C.S.N. wrote the article. All authors read and revised the article.


References

Alibardi L, Dalla Valle L, Nardi A, Toni M. 2009. Evolution of hard proteins in the sauropsid integument in relation to the cornification of skin derivatives in amniotes. J Anat. 214:560–586.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem. 43:1–69.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215:403–410.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol. 57:289–300.

Birney E, Clamp M, Durbin R. 2004. GeneWise and Genomewise. Genome Res. 14:988–995.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94.

Dalla Valle L, Nardi A, Belvedere P, Toni M, Alibardi L. 2007. Beta-keratins of differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization. Dev Dyn. 236:1939–1953.

Dalla Valle L, Nardi A, Toffolo V, Niero C, Toni M, Alibardi L. 2007. Cloning and characterization of scale beta-keratins in the differentiating epidermis of geckoes show they are glycine-proline-serine-rich proteins with a central motif homologous to avian beta-keratins. Dev Dyn. 236:374–388.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Goecks J, Nekrutenko A, Taylor J, Galaxy T. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.

Greenwold MJ, Bao W, Jarvis ED, Hu H, Li C, Gilbert MT, Zhang G, Sawyer RH. 2014. Dynamic evolution of the alpha (α) and beta (β) keratins has accompanied integument diversification and the adaptation of birds into novel lifestyles. BMC Evol Biol. 14:249.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol. 10:148.

Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian beta-keratins: diversification and expansion of archosaurian beta-keratins and the origin of feather beta-keratins. J Exp Zool B Mol Dev Evol. 320:393–405.

Herrmann H, Strelkov SV, Burkhard P, Aebi U. 2009. Intermediate filaments: primary determinants of cell architecture and plasticity. J Clin Invest. 119:1772–1783.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

Huynh KD, Fischle W, Verdin E, Bardwell VJ. 2000. BCoR, a novel corepressor involved in BCL-6 repression. Genes Dev. 14:1810–1823.

Kreplak L, Doucet J, Dumas P, Briki F. 2004. New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibers. Biophys J. 87:640–647.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33:1870–1874.

Lee CH, Kim MS, Chung BM, Leahy DJ, Coulombe PA. 2012. Structural basis for heteromeric assembly and perinuclear organization of keratin filaments. Nat Struct Mol Biol. 19:707–715.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of Beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol. 5:923–933.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40:4288–4297.

Mount DW. 2007. Using the Basic Local Alignment Search Tool (BLAST). CSH Protoc. 2007:pdb.top17.

Ng CS, Wu P, Fan WL, Yan J, Chen CK, Lai YT, Wu SM, Mao CT, Chen JJ, Lu MY, et al. 2014. Genomic organization, transcriptomic analysis, and functional characterization of avian alpha- and beta-keratins in diverse feather forms. Genome Biol Evol. 6:2258–2273.

Ogura K, Matsumoto K, Kuroiwa A, Isobe T, Otoguro T, Jurecic V, Baldini A, Matsuda Y, Ogura T. 2001. Cloning and chromosome mapping of human and chicken Iroquois (IRX) genes. Cytogenet Cell Genet. 92:320–325.

Ohno S. 1970. Evolution by gene duplication. Berlin (Germany): Springer-Verlag.

Pabisch S, Puchegger S, Kirchner HO, Weiss IM, Peterlik H. 2010. Keratin homogeneity in the tail feathers of Pavo cristatus and Pavo cristatus mut. alba. J Struct Biol. 172:270–275.

Pettingill OS, Breckenridge WJ. 2012. Ornithology in laboratory and field. Orlando: Academic Press, Elsevier Science.

Presland RB, Whitbread LA, Rogers GE. 1989. Avian keratin genes. II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol. 209:561–576.

Prum RO. 1999. Development and evolutionary origin of feathers. J Exp Zool. 285:291–306.

Prum RO, Brush AH. 2002. The evolutionary origin and diversification of feathers. Q Rev Biol. 77:261–295.

Prum RO, Brush AH. 2003. Which came first, the feather or the bird? Sci Am. 288:84–93.

Regal PJ. 1975. The evolutionary origin of feathers. Q Rev Biol. 50:35–66.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, et al. 2015. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43:W589–W598.

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 31:46–53.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol. 29:995–1004.

Velinov M, Ahmad A, Brown-Kipphut B, Shafiq M, Blau J, Cooma R, Roth P, Iqbal MA. 2012. A 0.7 Mb de novo duplication at 7q21.3 including the genes DLX5 and DLX6 in a patient with split-hand/split-foot malformation. Am J Med Genet A. 158A:3201–3206.

Warnes GR, Bolker B, Bonebakker L, Gentleman R, Liaw WHA, Lumley T, Maechle M, Magnusson A, Moelle S, Schwartz M, Venables B. 2015. Various R programming tools for plotting data. R package version gplots 2.17.0.

Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. 2014. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158:1431–1443.

Weiss IM, Schmitt KP, Kirchner HO. 2011. The peacock's train (Pavo cristatus and Pavo cristatus mut. alba) II. The molecular parameters of feather keratin plasticity. J Exp Zool A Ecol Genet Physiol. 315:266–273.

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci USA. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the lower cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci USA. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.

Bhattacharjee et al doi101093molbevmsw165 MBE

2780

Page 10: Regulatory Divergence among Beta-Keratin Genes …mcb.life.nthu.edu.tw/files/writing_journal/38358/1542_0c...Regulatory Divergence among Beta-Keratin Genes during Bird Evolution Maloyjo

of a set of strongly coexpressed genes (obtained by thepreceding method) using FIMO (PMID 21330290) in theMEME suite with P valuelt 1e-4 We assumed that genesin a co-expressed module with significant functional en-richment are co-regulated at the transcriptional level andtheir promoter regions might contain common bindingmotifs for the TF For the motif enrichment analysis ineach coexpressed module Fisherrsquos exact test was used(FDRlt 1e-5)

Second the TFs corresponding to the enriched motifswere extracted from the TFndashTFBS pairs of Gallus gallus inthe CIS-BP database To verify an over-represented motiffound in the promoter of a gene being true the genecoding the TF protein was required to be correlated(PCCgt j08j) with the gene whose promoter has theTFBS of the TF If the above condition was satisfied wecalled the TFTFBS as putative TFTFBS (pTFpTFBS) Insummary the first two steps provided putative TFndashtargetrelationship of b-keratin and some other genes and pu-tative TFndashTFBS interactions

Third the conservation of a putative motif (TFBS) wasassessed by finding the homologous sequences of chickenb-keratin genes in Turkey (Meleagris gallopavo) For this weannotated the b-keratin genes in Turkey via a search methodsimilar to Greenwold et al (2014) but with some modifica-tions First of all the chicken protein sequences encoded by allof the expressed b-keratin genes were used as queries tosearch against the Turkey genome (release 50) byTBLASTN (Altschul et al 1990 Mount 2007) and the hitswith similaritygt 70 (e valuelt 10 5) were selected Foreach genomic region with significant hits redundant hitsthat completely coincided with other hits were filtered outFor the remaining hits the nucleotide sequences of thealigned region together with 100 bp upstream and down-stream to it were retrieved using Galaxy (Goecks et al2010) and GeneWise (Birney et al 2004) was used to verifythe gene structure Regions that contained an early stop co-don (lt 80 amino acids) were considered as pseudogenes b-keratin genes express and functions as clusters and thereforeseveral groups have common regulation and over-represented motifs This property is useful to assess conser-vation of a b-keratin motif by finding a group of orthologous(homologous) b-keratin genes within another species ratherthan assessing the conservation among several speciesTherefore we find homologous relationships of chicken b-keratin genes in Turkey using BLASTP with chicken b-keratinsas ldquoqueriesrdquo and Turkey b-keratins as ldquosubjectsrdquo For eachchicken b-keratin hits with similaritygt 70 (E valuelt1e-10) were considered significant Almost 90 of Turkey b-ker-atins showed significant hits with E valuelt1e-25 and wechoose the top five (based on lowest E value) Turkey b-keratingenes that showed similarity with a chicken b-keratin geneThe conservation of a TFBS was tested by aligning the chickenb-keratin promoter and its five homologous promoters ofTurkey b-keratin genes using MUSCLE version 3831 
(Edgar2004) and finding whether the TFBS also appeared (FIMO Pvaluelt 1e-4) in at least one of them on the same strandwithin a distance of 200 bp

The pTFndashtarget gene relationship was visualized with thesoftware Cytoscape (Shannon et al 2003) and categorized togroups based on a set of genes having common specific reg-ulators distinct from other groups

The phylogenetic analysis of the b-keratin genes was doneusing the maximum likelihood method in MEGA-7 (Kumaret al 2016) The best model for phylogentic analysis wasfound through the model test option of MEGA-7 usingAkaike and Bayesian information criteria (AIC and BIC)The phylogenetic tree was then compared with pTFndashtargetgene categories to understand regulatory transition associ-ated with the evolution of b-keratin genes The major puta-tive regulators were found through enrichment analysis of theTFs among the pTFndashtarget gene category to which it be-longed To further confirm our prediction we did differentialexpression analysis of the pTF gene and their target gene atdifferent developing condition of the feather using the Rpackage edgeR (Robinson et al 2010 McCarthy et al 2012)We consider the pTFndashtarget gene relationship true if bothshow differential expression at the same time

EMSA Assay to Validate TFndashTFBS BindingEMSA was conducted to validate the predicted TFndashTFBSrelationship For the production of a recombinant TF pro-tein and the subsequent EMSA experiment we followed theprotocol in a previous study (Yu et al 2015) The primersused for TF amplificationcloning and production of TFBSprobe is given in supplementary table S13 SupplementaryMaterial online (sheets 1 and 2) For the EMSA assay be-tween a TF and the TFBS present in the promoter of theputative target gene we designed three types of probes Thefirst two types are wild type biotin-labeled and un-labeledprobes respectively The third type is the biotin-labeledmutated probe designed by modifying conserved nucleo-tides using the position weight matrix of the TFBS (corre-sponding to a particular TF) as the reference The EMSAmixtures of TF and TFBS were separated by 375 nativepolyacrylamide gel transferred to Hybond Nthornmembranesand visualized by the BioSpectrum imaging system (UVP)

Supplementary MaterialSupplementary tables S1ndashS13 and figures S1ndashS6 are availableat Molecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)

AcknowledgmentsThis study was supported by the Ministry of Science andTechnology of Taiwan (MOST 104-2621-B-001-003-MY3)MJB was supported by postdoctoral fellowship ofAcademia Sinica MJB and WHL conceived this studyMJB CPY JJL and WHL designed the experiments andperformed computational analysis MJB carried out theEMSA experiments CSN TYW and HHL offered technicalconsultation and professional suggestions MJB WHL andCSN wrote the article All authors read and revised thearticle

Bhattacharjee et al doi101093molbevmsw165 MBE

2778

ReferencesAlibardi L Dalla Valle L Nardi A Toni M 2009 Evolution of hard proteins

in the sauropsid integument in relation to the cornification of skinderivatives in amniotes J Anat 214560ndash586

Alibardi L Toni M 2008 Cytochemical and molecular characteristics ofthe process of cornification during feather morphogenesis ProgHistochem Cytochem 431ndash69

Altschul SF Gish W Miller W Myers EW Lipman DJ 1990 Basic localalignment search tool J Mol Biol 215403ndash410

Benjamini Y Hochberg Y 1995 Controlling the false discovery rate ndash apractical and powerful approach to multiple testing J R Stat Soc BMethodol 57289ndash300

Birney E Clamp M Durbin R 2004 GeneWise and GenomewiseGenome Res 14988ndash995

Bolger AM Lohse M Usadel B 2014 Trimmomatic a flexible trimmer forIllumina sequence data Bioinformatics 302114ndash2120

Bullard JH Purdom E Hansen KD Dudoit S 2010 Evaluation of statisticalmethods for normalization and differential expression in mRNA-Seqexperiments BMC Bioinformatics 1194

Dalla Valle L Nardi A Belvedere P Toni M Alibardi L 2007 Beta-keratinsof differentiating epidermis of snake comprise glycine-proline-serine-rich proteins with an avian-like gene organization Dev Dyn2361939ndash1953

Dalla Valle L Nardi A Toffolo V Niero C Toni M Alibardi L 2007Cloning and characterization of scale beta-keratins in the differenti-ating epidermis of geckoes show they are glycine-proline-serine-richproteins with a central motif homologous to avian beta-keratinsDev Dyn 236374ndash388

Edgar RC 2004 MUSCLE multiple sequence alignment with high accu-racy and high throughput Nucleic Acids Res 321792ndash1797

Goecks J Nekrutenko A Taylor J Galaxy T 2010 Galaxy a compre-hensive approach for supporting accessible reproducible andtransparent computational research in the life sciencesGenome Biol 11R86

Greenwold MJ Bao W Jarvis ED Hu H Li C Gilbert MT Zhang G SawyerRH 2014 Dynamic evolution of the alpha (alpha) and beta (beta)keratins has accompanied integument diversification and the adap-tation of birds into novel lifestyles BMC Evol Biol 14249

Greenwold MJ Sawyer RH 2010 Genomic organization and molecularphylogenies of the beta (beta) keratin multigene family in thechicken (Gallus gallus) and zebra finch (Taeniopygia guttata) impli-cations for feather evolution BMC Evol Biol 10148

Greenwold MJ Sawyer RH 2013 Molecular evolution and expression ofarchosaurian beta-keratins diversification and expansion of archo-saurian beta-keratins and the origin of feather beta-keratins J ExpZool B Mol Dev Evol 320393ndash405

Herrmann H Strelkov SV Burkhard P Aebi U 2009 Intermediate fila-ments primary determinants of cell architecture and plasticity J ClinInvest 1191772ndash1783

Hu D Hou L Zhang L Xu X 2009 A pre-Archaeopteryx troodontidtheropod from China with long feathers on the metatarsus Nature461640ndash643

Huynh KD Fischle W Verdin E Bardwell VJ 2000 BCoR a novel core-pressor involved in BCL-6 repression Genes Dev 141810ndash1823

Kreplak L Doucet J Dumas P Briki F 2004 New aspects of the alpha-helix to beta-sheet transition in stretched hard alpha-keratin fibersBiophys J 87640ndash647

Kumar S Stecher G Tamura K 2016 MEGA7 molecular evolutionarygenetics analysis version 70 for bigger datasets Mol Biol Evol331870ndash1874

Lee CH Kim MS Chung BM Leahy DJ Coulombe PA 2012 Structuralbasis for heteromeric assembly and perinuclear organization of ker-atin filaments Nat Struct Mol Biol 19707ndash715

Li H Handsaker B Wysoker A Fennell T Ruan J Homer N Marth GAbecasis G Durbin R 2009 The Sequence AlignmentMap formatand SAMtools Bioinformatics 252078ndash2079

Li YI Kong L Ponting CP Haerty W 2013 Rapid evolution of Beta-keratin genes contribute to phenotypic differences that

distinguish turtles and birds from other reptiles Genome BiolEvol 5923ndash933

McCarthy DJ Chen Y Smyth GK 2012 Differential expression analysis ofmultifactor RNA-Seq experiments with respect to biological varia-tion Nucleic Acids Res 404288ndash4297

Mount DW 2007 Using the Basic Local Alignment Search Tool (BLAST)CSH Protoc 2007pdb top17

Ng CS Wu P Fan WL Yan J Chen CK Lai YT Wu SM Mao CT Chen JJLu MY et al 2014 Genomic organization transcriptomic analysisand functional characterization of avian alpha- and beta-keratins indiverse feather forms Genome Biol Evol 62258ndash2273

Ogura K Matsumoto K Kuroiwa A Isobe T Otoguro T Jurecic V BaldiniA Matsuda Y Ogura T 2001 Cloning and chromosome mapping ofhuman and chicken Iroquois (IRX) genes Cytogenet Cell Genet92320ndash325

Ohno S 1970 Evolution by gene duplication Berlin (Germany)Springer-Verlag

Pabisch S Puchegger S Kirchner HO Weiss IM Peterlik H 2010 Keratinhomogeneity in the tail feathers of Pavo cristatus and Pavo cristatusmut alba J Struct Biol 172270ndash275

Pettingill OS Breckenridge WJ 2012 Ornithology in laboratory and fieldOrlando Academic press Elsevier Science

Presland RB Whitbread LA Rogers GE 1989 Avian keratin genes IIChromosomal arrangement and close linkage of three gene familiesJ Mol Biol 209561ndash576

Prum RO 1999 Development and evolutionary origin of feathers J ExpZool 285291ndash306

Prum RO Brush AH 2002 The evolutionary origin and diversification offeathers Q Rev Biol 77261ndash295

Prum RO Brush AH 2003 Which came first the feather or the bird SciAm 28884ndash93

Regal PJ 1975 The evolutionary origin of feathers Q Rev Biol 5035ndash66Robinson MD McCarthy DJ Smyth GK 2010 edgeR a Bioconductor

package for differential expression analysis of digital gene expressiondata Bioinformatics 26139ndash140

Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin NSchwikowski B Ideker T 2003 Cytoscape a software environmentfor integrated models of biomolecular interaction networks GenomeRes 132498ndash2504

Smedley D Haider S Durinck S Pandini L Provero P Allen J Arnaiz OAwedh MH Baldock R Barbiera G et al 2015 The BioMart com-munity portal an innovative alternative to large centralized datarepositories Nucleic Acids Res 43W589ndashW598

Trapnell C Hendrickson DG Sauvageau M Goff L Rinn JL Pachter L2013 Differential analysis of gene regulation at transcript resolutionwith RNA-seq Nat Biotechnol 3146ndash53

Trapnell C Pachter L Salzberg SL 2009 TopHat discovering splice junc-tions with RNA-Seq Bioinformatics 251105ndash1111

Vandebergh W Bossuyt F 2012 Radiation and functional diversificationof alpha keratins during early vertebrate evolution Mol Biol Evol29995ndash1004

Velinov M Ahmad A Brown-Kipphut B Shafiq M Blau J Cooma R RothP Iqbal MA 2012 A 07 Mb de novo duplication at 7q213 includingthe genes DLX5 and DLX6 in a patient with split-handsplit-footmalformation Am J Med Genet A 158A3201ndash3206

Warnes GR Bolker B Bonebakker L Gentleman R Liaw WHA Lumley TMaechle M Magnusson A Moelle S Schwartz M Venables B 2015Various R programming tools for plotting data R Package Versiongplots21707

Weirauch MT Yang A Albu M Cote AG Montenegro-Montero ADrewe P Najafabadi HS Lambert SA Mann I Cook K et al 2014Determination and inference of eukaryotic transcription factor se-quence specificity Cell 1581431ndash1443

Weiss IM Schmitt KP Kirchner HO 2011 The peacockrsquos train (Pavocristatus and Pavo cristatus mut alba) II The molecular parametersof feather keratin plasticity J Exp Zool A Ecol Genet Physiol315266ndash273

Wu P, Ng CS, Yan J, Lai YC, Chen CK, Lai YT, Wu SM, Chen JJ, Luo W, Widelitz RB, et al. 2015. Topographical mapping of alpha- and beta-keratins on developing chicken skin integuments: functional interaction and evolutionary perspectives. Proc Natl Acad Sci U S A. 112:E6770–E6779.

Xiao L, Wang J, Li J, Chen X, Xu P, Sun S, He D, Cong Y, Zhai Y. 2015. RORalpha inhibits adipocyte-conditioned medium-induced colorectal cancer cell proliferation and migration and chick embryo chorioallantoic membrane angiopoiesis. Am J Physiol Cell Physiol. 308:C385–C396.

Xu X, Wang K, Zhang K, Ma Q, Xing L, Sullivan C, Hu D, Cheng S, Wang S. 2012. A gigantic feathered dinosaur from the Lower Cretaceous of China. Nature 484:92–95.

Xu X, You H, Du K, Han F. 2011. An Archaeopteryx-like theropod from China and the origin of Avialae. Nature 475:465–470.

Xu X, Zheng X, Sullivan C, Wang X, Xing L, Wang Y, Zhang X, O'Connor JK, Zhang F, Pan Y. 2015. A bizarre Jurassic maniraptoran theropod with preserved evidence of membranous wings. Nature 521:70–73.

Xu X, Zhou Z, Wang X, Kuang X, Zhang F, Du X. 2003. Four-winged dinosaurs from China. Nature 421:335–340.

Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, et al. 2015. Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad Sci U S A. 112:E2477–E2486.

Zhang F, Zhou Z. 2000. A primitive enantiornithine bird and the origin of feathers. Science 290:1955–1959.

Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346:1311–1320.

Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. 2012. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40:D144–D149.

