
Molecular evolution and expression of archosaurian β-keratins: Diversification and expansion of archosaurian β-keratins and the origin of feather β-keratins


MATTHEW J. GREENWOLD* AND ROGER H. SAWYER
Department of Biological Sciences, University of South Carolina, Columbia, South Carolina

ABSTRACT
The Archosauria consist of two living groups, crocodilians and birds. Here we compare the structure, expression, and phylogeny of the beta (β)-keratins in two crocodilian genomes and two avian genomes to gain a better understanding of the evolutionary origin of the feather β-keratins. Unlike squamates such as the green anole, with 40 β-keratins in its genome, the chicken and zebra finch genomes each have over 100 β-keratin genes, while the American alligator has 20 β-keratin genes and the saltwater crocodile has 21 β-keratin genes. The crocodilian β-keratins are similar to those of birds, and these structural proteins have a central filament domain and N- and C-termini, which contribute to the matrix material between the twisted β-sheets that form the 2–3 nm filament. Overall, the expression of alligator β-keratin genes in the integument increases during development. Phylogenetic analysis demonstrates that a crocodilian β-keratin clade forms a monophyletic group with the avian scale and feather β-keratins, suggesting that avian scale and feather β-keratins, along with a subset of crocodilian β-keratins, evolved from a common ancestral gene(s). Overall, our analyses support the view that the epidermal appendages of basal archosaurs used a diverse array of β-keratins, which evolved into crocodilian- and avian-specific clades. In birds, the scale and feather subfamilies appear to have evolved independently in the avian lineage from a subset of archosaurian claw β-keratins. The expansion of the avian-specific feather β-keratin genes accompanied the diversification of birds and the evolution of feathers. J. Exp. Zool. (Mol. Dev. Evol.) 9999B: 1–13, 2013. © 2013 Wiley Periodicals, Inc.

How to cite this article: Greenwold MJ, Sawyer RH. 2013. Molecular evolution and expression of archosaurian β-keratins: Diversification and expansion of archosaurian β-keratins and the origin of feather β-keratins. J. Exp. Zool. (Mol. Dev. Evol.) 9999:1–13.

Conflicts of interest: None.
*Correspondence to: Matthew J. Greenwold, Department of Biological Sciences, University of South Carolina, Columbia, SC 29205. E-mail: [email protected]
Received 21 December 2012; Revised 25 April 2013; Accepted 4 May 2013
DOI: 10.1002/jez.b.22514
Published online XX Month Year in Wiley Online Library (wileyonlinelibrary.com).

RESEARCH ARTICLE

Recently the sequencing of three crocodilian genomes (Alligator mississippiensis, Crocodylus porosus, and Gavialis gangeticus) was announced (St. John et al., 2012). These crocodilian genomes will enable researchers to better understand the basal archosaurian state and how modern birds and crocodilians have evolved since the split of these phenotypically divergent lineages (St. John et al., 2012). Genomic comparisons can be used to understand the overall molecular evolution of organisms and they can be used, more specifically, to understand the molecular evolution of specific protein families when applied across multiple genomes (Zhang and Edwards, 2012). These comparisons may also provide information about the relationships between multigene families and the evolution of lineage-specific phenotypes.

The β-keratin multigene family is found solely in reptiles and birds (Gregg and Rogers, '86; Sawyer et al., 2000; Alibardi and Sawyer, 2002; Sawyer et al., 2005; Fraser and Parry, 2011a), and the sequencing of any reptilian or avian genome allows greater insight into the evolution of this family of structural proteins. Many of the cornified epidermal appendages of reptiles and birds (claws, scales, beaks, and feathers) are constructed of these proteins, which function in the cornification process (forming filaments) as do the epidermal α-keratins, a subgroup of intermediate filaments (Moll et al., '82; Fraser and Parry, 2012). Recent studies suggest that β-keratins should be renamed Keratin Associated Beta Proteins (KAβPs) (Toni et al., 2007; Alibardi and Toni, 2008; Dalla Valle et al., 2010).

β-Keratins differ from α-keratins in their molecular sequence, gene and protein structure, and their presence among vertebrates (O'Guin et al., '82; Alibardi and Sawyer, 2002). Likewise, the β-keratins differ from the mammalian Keratin Associated Proteins (KAPs), and KAPs are not found in the genomes of reptiles or birds (Wu et al., 2008; Vandebergh and Bossuyt, 2012). At present there are limited chemical, biophysical, or ultrafine structural data on how the β-keratins associate with the α-keratins in reptiles or birds, in contrast to what has been demonstrated for the α-keratins and KAPs of mammals (Marshall et al., '91; Powell et al., '91; Powell and Rogers, '97). Here we use the term β-keratin as presently used by other research groups (Fraser and MacRae, '76; Presland et al., '89a,b; International Chicken Genome Sequencing Consortium, 2004; Fraser and Parry, 2008, 2011a,b; Prum et al., 2009; D'Alba et al., 2011; Li et al., 2013).

α-Keratins form α-helical filaments of 8–10 nm, while β-keratins, with a filament diameter of 2–3 nm, have a filament–matrix structure in which the central filamentous domain is ~34 amino acids in length and forms 5 β-strands with intervening turns (Fraser and MacRae, '76; Fraser and Parry, 2011b). The N- and C-terminal domains of β-keratins form the matrix, and their sequences and lengths vary from species to species and among appendages (Fraser and Parry, 2008, 2011a). Examining β-keratin sequences from bird feathers, scales, and claws; Nile crocodile scale; and turtle, lizard, and snake scales, Fraser and Parry (2011a) found that the N-terminal domain of these β-keratins is highly conserved in length and amino acid composition among archosaurs (turtles, crocodilians, and birds), but varies greatly among squamates. While specialized mammalian epidermal appendages require 8–10 nm α-keratin filaments and independent matrix molecules (KAPs) to form the filament–matrix structure of corneous material (Marshall et al., '91; Powell et al., '91; Powell and Rogers, '97), the corneous material of avian scales and feathers forms beta fibrils (beta packets) composed of 2–3 nm filaments surrounded by a matrix composed of the N- and C-termini of the β-keratin molecule (Fraser and Parry, 2008, 2011a). Cornification using only α-keratins also occurs in some epidermal appendages of reptiles and birds, but the filament–matrix structure is unclear (Baden and Maderson, '70; O'Guin and Sawyer, '82).

The number of β-keratins found in reptilian and avian genomes varies from 37 in the green sea turtle and 40 in the green anole to well over a hundred in both the chicken and zebra finch genomes (International Chicken Genome Sequencing Consortium, 2004; Dalla Valle et al., 2010; Greenwold and Sawyer, 2010; Warren et al., 2010; Alföldi et al., 2011; Li et al., 2013). The β-keratin diversity in crocodilians appears to be limited in both structure and number (Dalla Valle et al., 2009a; Ye et al., 2010). In the chicken and zebra finch, each species has four major β-keratin subfamilies (claw, scale, feather, and feather-like), which are monophyletic and form a cluster on microchromosome 25 (Greenwold and Sawyer, 2010). Feather β-keratins are located at several chromosomal loci and can be subdivided into multiple phylogenetic clades, which are associated with the different genomic loci (Greenwold and Sawyer, 2010). Recently, Li et al. (2013) found that the β-keratin copy number varied from 37 to 89 in three turtle genomes and that lineage-specific expansions have occurred in turtles and birds. Because relatively few reptile genomes are available, it is not known whether a large number of β-keratins exist in other squamates and crocodilians, but it is clear that this family of proteins has been subjected to multiple duplication events in the green anole lizard and birds.

Expression of avian β-keratins has been characterized for the four major subfamilies (for review, see Greenwold and Sawyer, 2010). For example, during embryogenesis, claw β-keratins are mainly expressed in the developing claws and beaks (Whitbread et al., '91; Wu et al., 2004), scale β-keratins are expressed mainly in the scutate scales, and feather β-keratins are expressed in both embryonic and adult feathers (Presland et al., '89a,b; Barnes, '93). The feather-like subfamily is also expressed in embryonic and adult feathers (Presland et al., '89a,b).

The expression of other unique β-keratin genes has also been characterized, albeit under unique in vitro conditions. The β-keratin isolated from jun-transformed quail cells is similar in sequence to the feather-like β-keratins of the chicken and zebra finch, and is localized to a unique β-keratin locus on chicken chromosome 6 (Hartl and Bister, '95; Greenwold and Sawyer, 2010). Also, a β-keratin isolated from cultured chick keratinocytes (keratinocyte β-keratin) has been described that is located 3′ to the four β-keratin subfamilies on chicken microchromosome 25 (Vanhoutteghem et al., 2004; Greenwold and Sawyer, 2010). It is unclear whether these latter two β-keratins are expressed in the avian epidermis, but it has been shown that the keratinocyte β-keratin is highly similar to multiple expressed sequence tags (ESTs) from adult ovary and multi-tissue (embryo to adult) EST libraries (Greenwold and Sawyer, 2010).


Crocodilian β-keratins are expressed in claws and scales (Dalla Valle et al., 2009a; Ye et al., 2010). Sawyer et al. (2000) identified a 20 amino acid domain of a β-keratin isolated from the claw of the alligator, and Dalla Valle et al. (2009a) identified two β-keratin sequences from mRNA extracted from the dorsal and ventral skin of juvenile Nile crocodiles. Although it is known that phylogenetically the crocodilian β-keratins are closely related to the claw and scale β-keratins of birds, and that multiple β-keratins with different molecular weights are expressed in the epidermis of crocodilians (Sawyer et al., 2000; Alibardi and Thompson, 2002), little progress has been made in determining the complete repertoire of crocodilian β-keratins and their specific expression profiles in different epidermal appendages.

In this study we characterize the β-keratin sequences in the genomes of the American alligator and saltwater crocodile, and analyze the expression of the β-keratins in the alligator. The crocodilian β-keratins are compared to the chicken and zebra finch genomic sequences in order to investigate the evolution of β-keratins in these archosaurians and, more specifically, to analyze the evolutionary origin of the avian feather β-keratin genes.

MATERIALS AND METHODS

Genome Searches
A local BLAST search was conducted to identify and locate all probable and pseudo β-keratin genes in early and late versions of the American alligator (amiss_v0.1b27 and amiss_v0.2, respectively) and saltwater crocodile (cPorosus_v0.0.1 and croc_sub1.assembly, respectively) genome assemblies (Altschul et al., 1997; St. John et al., 2012). For the initial alligator and crocodile BLAST searches, three Nile crocodile sequences were used as queries (NCBI GI numbers: 219969046, 215541574, and 215541572; Dalla Valle et al., 2009a) with an E-value cut-off of 1E−10. Using the 12 alligator and 11 saltwater crocodile sequences found in the first build as queries, we then searched the second genomic builds of the crocodilians with the same E-value cut-off of 1E−10.

Upon analyzing the crocodilian β-keratins, it became apparent that several avian β-keratins had previously been excluded because of stringent E-value cutoffs against previously characterized avian β-keratins. Therefore, we reanalyzed the data from Greenwold and Sawyer (2010), accepting an E-value cutoff of 1E−10 and including all genes that had both start and stop codons.

Our searches of the crocodilian cDNA libraries were performed using the β-keratin sequences for each species as BLAST queries. In an effort to identify only specific β-keratin sequences, we used an E-value cut-off of 1E−40 for the cDNA library BLAST searches.

Throughout this paper a probable coding gene is identified as a β-keratin that has a strong similarity to a known coding β-keratin, has a reasonably positioned start codon, and lacks premature stop codons and frameshift mutations throughout the coding region. A pseudogene fails to meet one of the above criteria for a probable coding β-keratin, but still has a strong sequence similarity (see below) to a known coding β-keratin.

Gene names are listed as a three-letter representation of the species name (for example, Crocodylus porosus is CPO), followed by the chromosome number (when available); the specific gene name, which is BK (crocodilian β-keratin), FK (avian feather β-keratin), Cl (avian claw β-keratin), Sc (avian scale β-keratin), Ktn (avian β-keratin from cultured keratinocytes), or FL (avian feather-like β-keratin); and a number or, in the case of crocodilian β-keratins, a letter indicating the 5′ to 3′ orientation on each chromosome (if applicable). This nomenclature and gene annotation follows Greenwold and Sawyer (2010).
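The E-value filtering and the coding-gene criteria described above can be sketched in Python. This is a minimal illustration, not the authors' pipeline: it assumes the hits are available in the 12-column tabular format written by BLAST+ (`-outfmt 6`), in which the E-value is the 11th field, and the function names are hypothetical. The sequence-similarity criterion is not modeled; only the start codon, premature stop codon, and reading-frame checks are.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def filter_hits(tabular_lines, max_evalue=1e-10):
    """Keep tabular BLAST hits whose E-value is at or below the cut-off."""
    kept = []
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if float(fields[10]) <= max_evalue:  # 11th column holds the E-value
            kept.append(fields)
    return kept


def classify_orf(cds):
    """Label a candidate coding sequence 'probable coding' or 'pseudogene'.

    Assumption: the candidate has already passed the similarity search, so
    only the structural criteria are tested here.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        return "pseudogene"  # length not a multiple of three: frameshift
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if not codons or codons[0] != "ATG":
        return "pseudogene"  # no start codon at the expected position
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "pseudogene"  # stop codon before the final codon
    return "probable coding"
```

Passing `max_evalue=1e-40` to `filter_hits` would correspond to the stricter cut-off used for the cDNA library searches.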

Phylogenetic Analysis
The data set consisted of 361 β-keratin sequences from the chicken, zebra finch (Greenwold and Sawyer, 2010), American alligator, and saltwater crocodile (St. John et al., 2012), along with the additional novel avian sequences described in Results. The β-keratins of the green anole were used as the outgroup (Dalla Valle et al., 2010; Warren et al., 2010). Molecular alignments were performed using ClustalW (Thompson et al., '96) with default parameters, and the alignment was confirmed by visual inspection. Alignment figures were constructed with BioEdit software (Hall, '99). Bayesian phylogenetic reconstruction was performed with MrBayes 3.2 using the mixed amino acid model, two runs, two swaps per 500 generations, and six chains (five heated chains and one cold chain), with convergence confirmed by a standard deviation of split frequencies <0.01 (Ronquist et al., 2012). The Bayesian analysis ran for 40,398,000 generations before convergence was achieved. Rogue taxa were identified using the RogueNaRok program and removed from the data set (Aberer et al., 2013). After removal of rogue taxa, the Bayesian analysis was repeated.
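The convergence criterion above can be made concrete with a small sketch. It computes the average standard deviation of split frequencies across independent runs: for each split (bipartition), take the standard deviation of its frequency across runs, then average over splits. The sketch assumes split frequencies have already been tabulated per run. The function name, the dictionary layout, the use of the sample (n − 1) standard deviation, the 0.10 minimum-frequency filter, and the example frequencies are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import stdev


def average_sd_split_frequencies(runs, min_freq=0.10):
    """Average, over splits, of the across-run standard deviation of frequency.

    runs: list of dicts mapping a split label to its frequency in that run.
    A split absent from a run counts as frequency 0.0 there. Splits whose
    frequency is below min_freq in every run are ignored (assumed filter).
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs")
    splits = set().union(*(run.keys() for run in runs))
    deviations = []
    for split in splits:
        freqs = [run.get(split, 0.0) for run in runs]
        if max(freqs) < min_freq:
            continue
        deviations.append(stdev(freqs))  # sample standard deviation (n - 1)
    return sum(deviations) / len(deviations) if deviations else 0.0


# Hypothetical split frequencies from two runs.
run1 = {"AB|CDE": 0.98, "ABC|DE": 0.61}
run2 = {"AB|CDE": 0.99, "ABC|DE": 0.60}
converged = average_sd_split_frequencies([run1, run2]) < 0.01
```

With a threshold of 0.01, two runs are treated as converged only when they agree closely on the frequency of essentially every well-supported split.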

β-Keratin Expression
Macro tissue dissections of the American alligator's caruncle, claw, and dorsal scale from individuals at embryonic stages (ES) 21, 23, 25, and 27 (Ferguson, '85) and of the domestic chicken's claw, ventral leg scale, dorsal feather tract, and egg tooth from 17-day chick embryos were fixed in RNAlater (Qiagen, Germantown, MD). Alligator tissue samples were kindly provided by Louis Gulliette at the Medical University of South Carolina in Charleston, SC, and chicken tissue samples were kindly provided by Richard Goodwin at the University of South Carolina School of Medicine in Columbia, SC. RNA extractions were performed using the Qiagen RNeasy assay kit (Qiagen). RNA quality was checked with both the Agilent Technologies 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA) and the Thermo Scientific NanoDrop 2000c spectrometer (Thermo Scientific, Waltham, MA). RNA quantity was measured using the NanoDrop spectrometer. cDNA was synthesized by reverse transcription with 4.0 mM MgCl2 using a Promega GoScript kit (Promega, Madison, WI).


PCR primer design was conducted on unique regions from the 12 American alligator β-keratins and the GGA25_ktn4, GGA25_ktn6, and GGA25_CL9 β-keratins using Primer3 online software (http://frodo.wi.mit.edu/; Rozen and Skaletsky, 2000). Primer specificity was verified using UCSC genome browser PCR (Rozen and Skaletsky, 2000) against the genomes of the alligator and chicken to ensure that only one product would be amplified. Ribosomal protein L8 was used as a control for RT-qPCR analyses, and the forward and reverse primer sequences were characterized by Katsu et al. (2004). All primer sequences are listed in Table 1. Primer specificity and qualitative expression analysis of day 17 embryonic chicken and ES 25 alligator tissue were performed by standard PCR using the 5Prime MasterTaq Kit (5Prime, Hamburg, Germany), and the products were subsequently run on an agarose gel with a no-template control.

Quantitative expression analysis was done using RT-qPCR on a Bio-Rad CFX96 Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, CA). For every alligator developmental stage and tissue type, three biological replicates and three technical replicates for each of the nine alligator β-keratins were performed. The qPCR run protocol was a 30 sec step at 95.0°C followed by 50 cycles of 95.0°C for 10 sec and an annealing temperature gradient of 54.3–63.3°C for 30 sec, concluding with a melt curve from 65.0 to 95.0°C with a 0.5°C increment every 5 sec, using SsoFast EvaGreen Supermix (Bio-Rad Laboratories).

Statistical analysis of the qPCR data was performed using the general linear model in SPSS 17 (SPSS Inc., Chicago, IL). We performed two main types of general linear model analyses: one to test for the overall effect of tissue type and developmental stage on β-keratin expression and one to analyze how these factors affect each individual β-keratin gene's expression. The relative expression for each gene was calculated using ribosomal protein L8 as the reference and analyzed as repeated measures in an ANOVA with the embryonic stages and tissue types as the between-subject factors. Univariate analysis was performed on each gene's relative expression with ES and tissue as the between-subject factors. The relative expression of each β-keratin was natural log-transformed in order to satisfy the assumptions of ANOVA.
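As an illustration of the normalization and transformation steps, the sketch below computes an efficiency-corrected expression ratio of a target gene relative to the L8 reference and then takes its natural logarithm. The paper does not state the exact formula used, so the efficiency-corrected ratio is an assumption, as are the function name and the quantification cycle (Cq) values. The primer efficiencies are those listed for AMI_BK_A and L8 in Table 1.

```python
import math


def relative_expression(cq_target, cq_ref, eff_target_pct, eff_ref_pct):
    """Efficiency-corrected expression of a target gene relative to a reference.

    Efficiencies are percentages (100 means perfect doubling per cycle), so
    the per-cycle amplification factor is 1 + efficiency / 100.
    """
    amp_target = 1.0 + eff_target_pct / 100.0
    amp_ref = 1.0 + eff_ref_pct / 100.0
    return (amp_ref ** cq_ref) / (amp_target ** cq_target)


# Hypothetical mean Cq values for one tissue at one embryonic stage.
ratio = relative_expression(cq_target=24.0, cq_ref=20.0,
                            eff_target_pct=98.8, eff_ref_pct=100.6)
ln_ratio = math.log(ratio)  # natural log transform applied before the ANOVA
```

Working on the log scale turns multiplicative differences in expression into additive ones, which is the usual motivation for transforming ratios before fitting a linear model.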

RESULTS

Genome Searches
The number of β-keratins found in the American alligator and saltwater crocodile is lower than that of the green anole, turtles, or birds. We found a total of 20 and 21 unique β-keratins in the American alligator and saltwater crocodile, respectively. In the American alligator we found 12 genes in the first build and eight additional genes in the second build, with six β-keratins common to both builds. In the saltwater crocodile, we also found 12 β-keratin genes in the first build and nine additional genes in the second build, with 11 overlapping genes between the two builds. Several loci in both the alligator and crocodile contain linked β-keratins, with the crocodile genome having five linked genes on one scaffold (see Table S1).

Interestingly, the 20 amino acid domain expressed in the alligator claw (Sawyer et al., 2000) has 100% similarity to a 20 amino acid peptide domain in AMI_BK_P and AMI_BK_G. This domain is located in the 34 amino acid domain that makes up the β-keratin filament domain as described by Fraser and Parry (2011b). Furthermore, the highly conserved (L/IGPG) turn residues between β-strands three and four are also located within the 20 amino acid peptide domain. Fraser and Parry (2008) suggest that this highly conserved and atypical site may represent the nucleating site and thereby play a role in determining the structure of the β-sheet. This 20 amino acid domain (Sawyer et al., 2000) is referred to as the "core box" by Alibardi and Toni (2008).

Table 1. Primer sequences and optimal primer concentrations and efficiencies for primers used in RT-qPCR.

Gene name               Forward primer (5′–3′)   Reverse primer (3′–5′)   Optimal primer concentration (nM)   Primer efficiency (%)
AMI_BK_A                GTTCTGGTGGCTATGGAGGA     AGCCAAAGCCTGAACCATAC     300                                 98.8
AMI_BK_C                GAAATGCCAGCCTCAGTACC     GCGGTTTGTATTGCAAGGAT     N/A                                 N/A
AMI_BK_E                GGGGAACTGCCCTGTCTAA      ACAGCTCCCAGAATCAGAGC     250                                 102.2
AMI_BK_H                AAGTGTTGTGGGCTCCTCAG     CCACAGCTTCCATTGCTGTA     300                                 97.5
AMI_BK_I                TGGGGGCTCTTTCACTTATG     TGCACTTGGTTTCTTGGATG     N/A                                 N/A
AMI_BK_K                CGAAATCATGGGGAATTCAG     CTCCTCCTCTGGCTCTTCCT     150                                 98.6
AMI_BK_O                TTTAGGGGCACCTAGCACTG     TATTCAATGGGCGGGAATAA     250                                 101.6
AMI_BK_P                TGGGGGCTCTTTCACTTATG     TGCACTTGGTTTCTTGGATG     200                                 101.7
AMI_BK_Q                ATTTTCTGGACCGCTGGAAT     GTCATGCTTTTGCTGGGTTT     N/A                                 N/A
AMI_BK_R                GGCTATGGGATTGGTGGTT      CCACAGCCTACAGAGCTGAAT    250                                 103.0
AMI_BK_S                ATCAGATGATGCGACACTGG     GGTGTATCCATTGGGTGGAA     300                                 99.0
AMI_BK_T                TGGTGGGATCCTCTGCTTTA     TTAAACAGGCCCACAGTTCC     200                                 99.6
L8 (Katsu et al. 2004)  GGTGTGGCTATGAATCCTGT     ACGACGAGCAGCAATAAGAC     250                                 100.6
GGA25_ktn4              ATCCTCAGCTCCTGCCCTCA     GCACCGTAACCACCGTACCC     N/A                                 N/A
GGA25_ktn6              ACTTGGCTATGGGGAGTCCT     TTCGGTCTCCTGACTGCTCT     N/A                                 N/A
GGA25_CL9               CTCGGAGTCCGCAGAAGT       ACGAATTGCATGGATGTGTC     N/A                                 N/A

Genes with N/A were used as PCR probes, but not in RT-qPCR.

While the crocodilian sequences published for the Nile crocodile (Cr-gptrp-1–3), marsh mugger, and Orinoco crocodile (Dalla Valle et al., 2009a; Ye et al., 2010) are more similar to those of the saltwater crocodile than to those of the American alligator, we were unable to find any exact matches to the crocodilian genomic sequences detailed in this study. All of the Nile crocodile, marsh mugger, and Orinoco crocodile β-keratins were most similar to CPO_BK_B (87–92% similarity) except for the Nile crocodile sequence Cr-gptrp-2, which was 98% similar to CPO_BK_A. All of these crocodilian sequences have multiple GGX repeats in the C-terminus (Dalla Valle et al., 2009a; Ye et al., 2010). The GGYGGL amino acid motif is repeated five times in the CPO_BK_A and Orinoco crocodile β-keratins and six times in Cr-gptrp-1, whereas CPO_BK_B has only several interspersed GGX repeats, similar to Cr-gptrp-2. Although the C-termini of some alligator β-keratins have high glycine content (up to 42.3%) and interspersed GGX motifs, they lack the regularity seen in the crocodile species.
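Motif counts and glycine content of the kind reported above can be computed from a C-terminal sequence with a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, GGX is taken to mean two glycines followed by any residue and is counted without overlaps, and the example sequence is invented for demonstration rather than taken from any of the β-keratins discussed.

```python
import re


def count_motif(seq, motif):
    """Number of non-overlapping occurrences of an exact motif."""
    return len(re.findall(re.escape(motif), seq))


def count_ggx(seq):
    """Number of non-overlapping GGX triplets (G, G, then any residue)."""
    return len(re.findall(r"GG.", seq))


def glycine_percent(seq):
    """Glycine content as a percentage of sequence length."""
    return 100.0 * seq.count("G") / len(seq) if seq else 0.0


# Invented C-terminal fragment, for illustration only.
c_term = "SYGGYGGLGGYGGLGSRCY"
ggyggl_repeats = count_motif(c_term, "GGYGGL")
ggx_repeats = count_ggx(c_term)
gly_pct = glycine_percent(c_term)
```

Comparing the exact-motif count with the looser GGX count is one simple way to distinguish a regular repeat from interspersed glycine pairs.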

Searching the crocodilian genomes has allowed greater insight into the diversity of avian β-keratins through the identification and characterization of novel archosaurian β-keratins. We found that two crocodilian β-keratins (CPO_BK_Q and AMI_BK_C) closely resemble the chicken and zebra finch claw β-keratins. Two other β-keratins (AMI_BK_I and CPO_BK_O) resemble the β-keratins from cultured keratinocytes (Vanhoutteghem et al., 2004). However, these sequences contain a 26 and a 22 amino acid insertion, respectively, in the C-terminal domain (see Fig. 1A,B).

Using the genomic crocodilian β-keratins described in this study as queries, we were able to locate highly similar β-keratin genes containing the aforementioned insertions (Fig. 1) in the chicken and zebra finch. These avian β-keratins were omitted from our original analysis (Greenwold and Sawyer, 2010) due to stringent E-value cutoffs. Therefore, we reanalyzed the data set from Greenwold and Sawyer (2010) and identified an additional 22 β-keratins in the chicken genome, for a total of 133 β-keratin sequences. This number is closer to earlier copy number estimates by the International Chicken Genome Sequencing Consortium (2004) and Alibardi and Toni (2008) of between ~137 and 150 β-keratins in the chicken genome. We added 41 β-keratins to the zebra finch genome for a total of 149 sequences (Table S1).

Among these new sequences are several genes similar to the β-keratin from cultured keratinocytes. Previously, only one keratinocyte β-keratin had been identified in the chicken, and now 11 unique sequences have been identified in the chicken and 10 in the zebra finch, with at least one major indel (Fig. 1). Also, we have identified six additional chicken scale β-keratins to make a total of 10 scale β-keratins in the chicken and five in the zebra finch. In the zebra finch, 31 of the 41 additional sequences are similar to feather β-keratins (Table S1).

Figure 1. Alignment of novel archosaur β-keratins similar to the avian claw β-keratin (A) and the β-keratin from cultured keratinocytes (B). The unique amino acid indel sequences are underlined. The common species name as well as the specific β-keratin annotation is listed for each sequence. Color coding represents amino acid similarity at a 30% threshold. Green are identical, blue are similar, and red are non-similar amino acids.

Phylogenetic Analysis
We performed a Bayesian analysis of β-keratins from the American alligator, saltwater crocodile, chicken, zebra finch, and green anole. The data set consisted of the 20 American alligator and 21 saltwater crocodile β-keratins identified above as well as the expanded chicken and zebra finch β-keratin gene families. Only 38 of the 40 green anole β-keratins identified by Dalla Valle et al. (2010) were used as the outgroup because ACA_Ac38 and ACA_Ac40 were identified as rogue taxa by the RogueNaRok program and removed from the data set (Aberer et al., 2013).

The phylogenetic analysis (Figs. 2 and S1) indicates that crocodilian and avian β-keratins can be divided into a number of clades. First, there is an avian-specific basal clade (A) containing one zebra finch and two chicken β-keratins, which are similar to the β-keratin from cultured keratinocytes (Vanhoutteghem et al., 2004). These are located between the feather-like and scale β-keratin genes on microchromosome 25 of the chicken and zebra finch (Fig. 2, clade A; Greenwold and Sawyer, 2010). The second clade contains the crocodilian β-keratins and the remaining avian β-keratins (Fig. 2, clades B and C).

The largest archosaurian clade can be broken down into two main clades (Fig. 2, clades B and C). Each of these major clades has a basal group consisting of genes similar to avian claw β-keratins along with their crocodilian orthologs. Clade B1 consists of one claw gene from each of the four species. C1 consists of claw β-keratins 1–8 of the chicken and a zebra finch gene on chromosome unknown. The crocodile and alligator have two and one paralogous genes, respectively, in C1. Interestingly, the alligator genes (AMI_BK_P and AMI_BK_G) that have 100% similarity to the 20 amino acid sequence isolated from the alligator claw (Sawyer et al., 2000) are found in C1. The final claw β-keratin clade, B3, contains four novel claw genes which contain the unique 26 amino acid insertion (Fig. 1A) not found in other avian claw β-keratins. Clade B4 consists of the β-keratin from cultured keratinocytes first described by Vanhoutteghem et al. (2004), the novel keratinocyte β-keratins with the 22 amino acid insertion (Fig. 1B), and their crocodilian orthologs, a total of more than 30 β-keratins.

Figure 2. Phylogeny of the β-keratins from the American alligator, saltwater crocodile, chicken, zebra finch, and green anole. Posterior probabilities are shown for the major nodes. Only major clades are highlighted; see Figure S1 for specific gene names. Color coding is based on β-keratin nomenclature and/or species. Black represents green anole β-keratins, red represents β-keratins from cultured keratinocytes, dark blue represents claw β-keratins, light blue (teal) represents crocodilian β-keratins, and green represents feather and feather-like β-keratins.

Two crocodilian-specific intermediate clades are characterized in clade C. Clade C2 contains four β-keratins, two from each crocodilian species. Clade C5 consists of three genes from each species, which form a 1:1 ortholog pair and a 2:2 ortholog pair. While the β-keratins of C2 are linked, only two of the three β-keratins from each species in clade C5 are linked. These data indicate that lineage-specific duplications occurred in crocodilian species (Fig. S1).

The avian scale β-keratins are divided into two bird-specific clades (C3 and C4), which are sister clades to the crocodilian-specific clade C5 and the largest avian-specific clade (C6). Clade C6 consists of the feather, feather-like, and BKJ genes from the chicken and zebra finch (Fig. 2). The feather-like β-keratins on microchromosome 25 of the chicken and zebra finch and the BKJ genes (similar to feather-like genes on chromosome 6 of the chicken) are the basal genes, giving rise to the feather β-keratins on microchromosome 25, chromosome 2, and microchromosome 27 of the chicken and zebra finch (Greenwold and Sawyer, 2010) (Fig. S1).

Amino Acid Composition of β-Keratins
Fraser and Parry (2011a) characterized the amino acid content of the filament region and the C- and N-terminal domains using representative sequences of reptilian and avian β-keratins. In this study, we have utilized the whole-genome β-keratin repertoires of two crocodilian and two avian species. Table 2 lists the percentages of amino acids in the archosaurian β-keratins, grouped according to our phylogenetic clades in Figure 2 and divided into the three β-keratin protein domains.

We found that the N-terminal domain of archosaurian β-keratins is slightly variable in length, with 21–49 amino acids for most clades, but clade C6, with the highest number of sequences (N = 224), has a length of 9–76 amino acids. This is in contrast to Fraser and Parry's (2011a) results, which reported only 22–26 amino acids in the N-terminus (Table 2). Although variation in length exists between the different clades, they all have a high content of

Table2.

Aminoacid

compositio

nsof

theN‐terminal,fi

lamentandC‐term

inalb‐keratin

domains

foreach

phylogeneticcladefrom

Figure

2.

Clade

N

Domain

N‐Term

inal

Filament

C‐Term

inal

nIle

Ala

Cys

Pro

Ser

Gly

Tyr

Glu

Val

nIle

Ala

Cys

Pro

Ser

Gly

Tyr

Glu

Val

nIle

Ala

Cys

Pro

Ser

Gly

Tyr

Glu

Val

A3

24–27

3.85

3.84

614

.111.54

20.51

5.13

3.84

611.54

3.84

634

5.88

3.92

22.9412

19.61

8.82

42.941

02.941

23.53

153–18

30.20

22.22

22.42

41.212

22.63

37.78

8.28

30

0.40

4

B14

31–32

1.59

11.11

14.29

12.7

15.08

4.76

03.17

510.32

347.35

0.73

53.67

6520

.59

8.82

42.941

2.9411.471

21.32

39–62

0.55

311.6

6.077

8.84

14.37

28.18

5.52

41.105

9.39

2

B22

270

5.55

622

.22

18.52

7.407

03.704

3.704

1.85

234

10.3

05.88

2417

.65

19.12

4.114

2.941

016

.18

46–77

5.691

5.691

2.43

94.87

812

.216

.26

1.62

64.87

85.691

B311

21–32

1.24

7.43

16.41

16.1

11.15

4.02

0.31

7.73

94.02

534

11.2

1.33

95.08

8218

.72

5.34

74.27

82.9412.941

20.32

70–17

73.031

9.51

4.614

12.05

12.9

6.78

3.861

1.318

5.17

9

B435

25–49

1.96

6.53

15.3

14.37

13.53

2.33

4.75

84.94

44.851

34,3

77.67

4.83

75.25

4416

.612

.93

5.421

3.67

3.08

616

.140

–317

2.17

22.26

53.32

4.49

919

.86

31.77

8.62

62.04

81.45

8

C115

24–32

2.58

14.99

15.93

15.46

14.52

5.15

0.703

3.513

5.15

234

6.08

5.88

22.9412

17.06

10.98

3.33

30.19

60.39

218

.82

53–77

0.38

44.99

54.32

33.651

7.87

743

.91.24

90

1.921

C24

273.7

11.11

17.59

11.11

13.89

4.63

5.55

63.704

034

10.3

2.941

2.9412

20.59

9.55

92.941

00

16.91

77–115

1.30

25.46

93.90

63.64

611.72

39.84

14.58

0.26

2.34

4

C310

27–36

3.59

6.53

614

.71

14.71

11.44

0.33

3.59

56.53

65.55

634

5.88

8.52

92.9412

20.59

102.941

00

1558

–70

0.32

912

.34

7.73

8.38

812

.83

23.19

4.77

00.65

8

C47

285.61

6.12

212

.76

13.27

11.74

8.67

5.102

3.571

2.551

345.88

3.361

2.9412

19.33

10.92

2.941

00

19.33

85–101

0.617

2.77

83.54

93.08

615

.936

.11

18.06

01.23

5

C56

27–36

3.3

9.89

15.39

10.99

8.791

2.75

6.59

33.29

73.29

734

6.86

2.941

3.9216

20.59

10.29

3.92

20.98

0.49

20.59

63–13

40

2.79

64.441

3.618

16.45

32.57

13.98

0.82

23.12

5

C622

49–

760.24

4.84

825

.718

.15

11.03

4.22

1.28

84.06

80.57

634

7.1

4.29

42.9149

14.33

11.95

2.941

0.19

71.06

417

.24

32–72

5.801

5.617

7.48

64.56

519

.57

22.87

3.48

21.491

4.351

Threeletter

abbreviatio

nswereused

fortheam

inoacid

names.C

apita

lNdenotesthenumberof

taxa

ineach

clade.Lowercase

ndenotestherangeof

thenumberof

aminoacidsforeach

clade.

J. Exp. Zool. (Mol. Dev. Evol.)

MOLECULAR EVOLUTION OF ARCHOSAURIAN ‐KERATINS 7

cysteine, proline, and serine. Interestingly, clade C6 has ultra‐highcysteine content with more than 25% (Table 2). Additionally, weobserved that this domain often ends in the PCmotif although to alesser extent than the C‐terminus where 99% of the sequences endin the PC motif (Fraser and Parry, 2011a).The 34 amino acid central domain is highly conserved

throughout reptiles and birds and makes up the 2–3 nm filament(Fraser and Parry, 2011a,b). We found that all archosauriansequences have high levels of proline and valine. Also, three b‐

keratins from clade B4 have 37 amino acids in their filamentdomain (Fig. S2). These sequences, one each from the chicken,zebra finch, and alligator appear to have had a three amino acidinsertion after the 14th amino acid position of the filament. Theseinsertions occur within one of the hairpin turns of the b‐strand aspresented by Fraser and Parry (2011a). The chicken and zebra finchb‐keratins with the 37 amino acid filament insertion are verysimilar (�92%), but the alligator sequence is much different fromeither bird b‐keratin (�56% similarity).The domain with the greatest variability is the C‐terminus,

where amino acid length varies from 32 to 317 amino acids. Whileall clades are rich in glycine and serine, three clades also contain ahigh level of tyrosine (�14%). These high tyrosine‐glycine‐serineclades are C2; a crocodilian specific clade basal to the avian scaleand feather b‐keratins, C4; avian specific scale b‐keratins and C5;crocodilian specific b‐keratins monophyletic with clade C4(Table 2). High tyrosine‐glycine proteins are seen in both reptileand mammal integuments. For example, Dalla Valle et al. (2009b)found that the b‐keratins of the turtle up to 12.6% tyrosine andFratini et al. ('93) confirmed the presence of the glycine/tyrosine‐rich Keratin Associated Protein 6 (KAP6) gene family's presence insheep, rabbit, and mice hair/wool. As mentioned earlier crocodil-ian and avian scale and claw b‐keratins have glycine rich repeatsin the C‐terminus, but interestingly the claw b‐keratins of cladeB3 have an unusually low amount of glycine (6.78%, Table 2).
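The per-domain percentages and length ranges summarized in Table 2 can be illustrated with a minimal Python sketch. It assumes that each clade's percentage is pooled over all residues of that domain in the clade (the text does not state whether values were pooled or averaged per sequence). The function names and the two toy sequences are hypothetical and are not real β-keratin sequences.

```python
from collections import Counter

# One-letter codes for the nine residues tabulated in Table 2.
RESIDUES = {"I": "Ile", "A": "Ala", "C": "Cys", "P": "Pro", "S": "Ser",
            "G": "Gly", "Y": "Tyr", "E": "Glu", "V": "Val"}


def percent_composition(domain_seqs):
    """Pooled percentage of each tracked residue over a clade's domain sequences.

    Assumption: residues are pooled across sequences before dividing.
    """
    counts = Counter()
    total = 0
    for seq in domain_seqs:
        seq = seq.upper()
        counts.update(seq)
        total += len(seq)
    if total == 0:
        raise ValueError("no residues supplied")
    return {name: 100.0 * counts[code] / total for code, name in RESIDUES.items()}


def length_range(domain_seqs):
    """Shortest and longest domain length, i.e. the lowercase n of Table 2."""
    lengths = [len(seq) for seq in domain_seqs]
    return min(lengths), max(lengths)


# Hypothetical toy fragments for illustration only (NOT real sequences).
toy_n_termini = ["MSCYDLCRPC", "MSCFSPCQPC"]
composition = percent_composition(toy_n_termini)
print(length_range(toy_n_termini))
for name in ("Cys", "Pro", "Ser"):
    print(name, round(composition[name], 1))
```

Applied to real domain sequences grouped by clade, the same two functions would yield one row of a composition table per clade and domain.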

β-Keratin Expression
In order to investigate the specific expression patterns of crocodilian β-keratins, we performed RT-qPCR analyses on the caruncle, claw, and scale of alligators at embryonic stages (ES) 21, 23, 25, and 27. A total of nine (Table 1) American alligator β-keratin genes were used for RT-qPCR profiling and they were compared to ribosomal protein L8 (Katsu et al., 2004) to give relative expression values. Although AMI_BK_I was found to be expressed in the caruncle and scale during ES 25 with PCR analysis, extremely low expression levels prevented primer optimization and further expression analyses. AMI_BK_C and Q showed no expression for the three tissue types at ES 25. The remaining alligator β-keratins were found in the second build and were not investigated in this study.

Overall, relative β-keratin expression varies as a function of time and tissue type (Table 3). Relative expression was lowest at ES 21 and increased during development through ES 27. Also, β-keratin expression was highest in the claw and caruncle compared to the scale during development. Although claw tissue had higher expression than caruncle, as explained by the main effect of tissue, the Bonferroni multiple comparisons of means indicated that their differences are not significant and that both are significantly different from scale tissue. Repeated measures analysis indicated an overall variation in expression of the nine alligator β-keratins (Fig. 3, AMI_BK_A > E > H > O > T > S > P > R > K). The within treatment effects indicate that the embryonic stage, tissue type and the combination of embryonic stage and tissue type all had different effects on different β-keratins (Table 3).

The within treatment effects of individual β-keratin genes were analyzed with univariate analyses. The results (Table S2) show that the expression of the β-keratins varied as a function of embryonic stage, tissue type, and the interaction between embryonic stage and tissue type. Bonferroni post-hoc analyses revealed little continuity between the β-keratins. For tissue type, the marginal means comparison for caruncle and claw lacked significance for most genes (AMI_BK_A, H, O, P, S, and T). While AMI_BK_E and R differ significantly only between the caruncle and scale, AMI_BK_K differs only between the claw and scale. The marginal mean comparisons for embryonic stage varied greatly for the β-keratins, with only AMI_BK_T having significantly different expression for all pairwise comparisons of the four embryonic stages. The remaining eight genes' expression differed between at least two embryonic stages (Fig. 3).

The cDNA libraries for the American alligator, saltwater crocodile, and gharial were constructed by the International Crocodilian Genomes Working Group and cover a large range of tissues (downloadable at www.crocgenomes.org, St. John et al., 2012). Using a stringent E-value cutoff, we found that five alligator β-keratin sequences (AMI_BK_C, D, K, S, and T) were expressed in 16 different cDNA libraries (see Table S3). These included tissues from the skin of the belly, spleen and white matter of the brain. Interestingly, AMI_BK_C was expressed in 15 libraries, so either our primers for AMI_BK_C failed to work or this gene is not expressed in the dorsal scale, claw, and caruncle during ES 25. AMI_BK_D was not investigated using qPCR, but was found to be expressed in the white matter of the brain, spinal cord, and tooth libraries. The remaining β-keratins (AMI_BK_K, S, and T) now have two sources of expression data (cDNA libraries and qPCR analysis) detailing their expression pattern in American alligators.

The expression of the β-keratin from cultured chicken keratinocytes in normal chickens is unknown and, with the detection and annotation of several avian β-keratins similar to the keratinocyte β-keratin, it seemed reasonable to probe their expression in normal chickens (Vanhoutteghem et al., 2004). Therefore, we investigated the expression profiles for the keratinocyte β-keratin (GGA25_Ktn4), the novel claw gene (GGA25_CL9) and the novel keratinocyte β-keratin (GGA25_Ktn6) discovered from the analysis of the crocodilian genomes (Fig. S3). RNA was extracted from the claw, egg tooth, dorsal feather tract, and ventral leg scales of 17 day chick embryos. Specific primers for each gene are listed in Table 1. Figure S3 illustrates that GGA25_Ktn4 and GGA25_CL9 are expressed in all four embryonic chick tissues and that GGA25_Ktn6 is only expressed in embryonic feather. The keratinocyte β-keratins appear to be a novel subfamily, which has differentially expressed members in the chicken.

Figure 3. Natural log-transformed relative expression of each of the β-keratins as a function of time and tissue type.

Table 3. Repeated measures of natural log-transformed relative β-keratin expression.

Source                           df       MS        F        P
(A) Between-subject effects
  Embryonic stage (ES)           3,24     1157.76   59.50    <0.001
  Tissue (T)                     2,24     366.50    18.86    <0.001
  ES × T                         6,24     207.54    10.67    <0.001
(B) Within-subject effects
  β-Keratins                     8,192    399.30    196.98   <0.001
  β-Keratins × embryonic stage   24,192   55.97     27.61    <0.001
  β-Keratins × tissue            16,192   32.51     16.04    <0.001
  β-Keratins × ES × T            48,192   14.70     7.25     <0.001

df, degrees of freedom; MS, mean-square; F, F-test statistic; P, probability value.
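As a reading aid for Table 3, the following Python sketch checks that the reported degrees of freedom follow from the design described in the text (four embryonic stages, three tissues, nine genes) and that the mean squares and F values are mutually consistent. It assumes the usual relation F = MS(effect) / MS(error) for a balanced repeated-measures design; the error mean squares themselves are not reported, so the values computed here are only implied, and the variable and function names are illustrative.

```python
# Table 3 rows: (source, effect df, error df, mean square, F statistic).
TABLE3 = {
    "between": [
        ("Embryonic stage (ES)", 3, 24, 1157.76, 59.50),
        ("Tissue (T)", 2, 24, 366.50, 18.86),
        ("ES x T", 6, 24, 207.54, 10.67),
    ],
    "within": [
        ("beta-Keratins", 8, 192, 399.30, 196.98),
        ("beta-Keratins x ES", 24, 192, 55.97, 27.61),
        ("beta-Keratins x T", 16, 192, 32.51, 16.04),
        ("beta-Keratins x ES x T", 48, 192, 14.70, 7.25),
    ],
}

# Design sizes stated in the text: 4 stages, 3 tissues, 9 genes.
N_STAGES, N_TISSUES, N_GENES = 4, 3, 9


def expected_effect_df():
    """Effect degrees of freedom implied by the design sizes."""
    es, t, g = N_STAGES - 1, N_TISSUES - 1, N_GENES - 1
    return {"between": [es, t, es * t],
            "within": [g, g * es, g * t, g * es * t]}


def implied_error_ms(rows):
    """Error mean square implied by each row, assuming F = MS_effect / MS_error."""
    return [ms / f for _, _, _, ms, f in rows]


expected = expected_effect_df()
for block, rows in TABLE3.items():
    reported_df = [row[1] for row in rows]
    print(block, reported_df == expected[block],
          [round(value, 2) for value in implied_error_ms(rows)])
```

Under that assumption every row within a block implies nearly the same error mean square (about 19.4 to 19.5 between subjects and about 2.03 within subjects), and the within-subject error df of 192 equals (9 − 1) × 24.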

DISCUSSION

Phylogenetic Analysis
Our phylogeny supports the view that the archosaurian claw and keratinocyte subfamilies of β-keratins are basal to the scale, feather-like, and feather subfamilies of β-keratins, and that the feather-like β-keratins are basal to all other feather β-keratins. Furthermore, we find that the feather-like and feather clade of β-keratins forms a sister group with two avian scale clades and a crocodilian specific clade of β-keratins. Previous phylogenetic analyses of β-keratins differ from the present findings (Dalla Valle et al., 2009a,b, 2010; Greenwold and Sawyer, 2010). Dalla Valle et al. (2009a) found that feather β-keratins and all other archosaurian β-keratins (Nile crocodile β-keratins, avian scale, claw, and keratinocyte β-keratins) have a most recent common ancestor. A separate study (Dalla Valle et al., 2009b) using turtle β-keratins found that three Nile crocodile β-keratins are basal to two avian clades consisting of (1) avian claw and feather β-keratins and (2) avian scale and keratinocyte β-keratins. Using the complete genomic repertoire of Anolis β-keratins, Dalla Valle et al. (2010) found that all archosaurian β-keratins share a common ancestor. The authors of these three studies only used 1–4 sequences to represent crocodilian β-keratins and the avian β-keratin subfamilies (scale, claw, feather and keratinocyte β-keratins), which may have failed to capture the multiplicity of avian and crocodilian β-keratins. Also, Greenwold and Sawyer (2010), while using a nearly complete genomic repertoire of chicken and zebra finch β-keratins, only used three crocodilian β-keratins and found that the avian scale genes were basal to all other avian β-keratins and that the avian claw genes were basal to feather-like and feather β-keratins. In the present study we used the entire genomic repertoire of β-keratins from the lizard, two crocodilians, and two birds, which has allowed better resolution. However, as more whole genome sequences become available, our understanding of the evolutionary history of gene families and species will continue to improve.

Recent studies using large scale molecular data sets have verified the position of turtles as sharing a common ancestor with archosaurians (Chiari et al., 2012; Crawford et al., 2012). Li et al. (2013), using β-keratins from three turtle genomes, two bird genomes and the green anole, found that a subgroup of turtle and bird β-keratins forms the ancestral clade. Their ancestral clade consisted of many turtle β-keratins and only four zebra finch β-keratins, but no chicken β-keratins. We annotated these four genes as TGU25_Ktn4, TGU25_Ktn8, TGU_UnR_Ktn1, and TGU25_CL4. The three zebra finch keratinocyte β-keratins are found in clade B4 and the claw β-keratin is located in clade B3 of our phylogeny (Fig. 2). Both of these clades contain orthologous crocodilian β-keratins, suggesting that clades B3 and B4 may be the ancestral β-keratins shared by turtles and archosaurs. Although our phylogeny showed that clade A was the basal archosaurian clade, the lack of crocodilian β-keratins in clade A and the results from Li et al. (2013) suggest that clade A is not the basal clade of turtles and archosaurs.

Genomic Localization of Archosaurian β-Keratins
Linked genes belonging to the same protein family are often tandemly duplicated and phylogenetically similar. However, we find that linked crocodilian β-keratins are often a mix of early duplication events and more recent lineage specific crocodilian duplications. In the American alligator, the β-keratins of two scaffolds (Scaffold-1186 and 14068), which have more than two β-keratins, are distributed between two different phylogenetic clades. AMI_BK_A and B (clade C5) appear to be recent duplicates as well as AMI_BK_D and E (clade C2). The saltwater crocodile Scaffold_09063 has five linked β-keratins (CPO_BK_A–E) belonging to three clades (B4, C2, and C5). The arrangement of these loci is reminiscent of microchromosome 25 of the chicken and zebra finch, being composed of β-keratins from multiple subfamilies, which are tandemly arrayed gene duplicates from the same subfamily (Greenwold and Sawyer, 2010). In birds, only feather β-keratins are localized to chromosomes other than microchromosome 25, suggesting that all of the crocodilian β-keratins may be localized to a single chromosome. Recently Li et al. (2013), while studying the β-keratins in the genomes of three turtles, have demonstrated syntenic conservation of the β-keratin cluster on the bird microchromosome 25.

β-Keratin Expression
The expression profiles for the nine β-keratins studied using qPCR seem to have little or no correlation with their phylogenetic relatedness. However, all phylogenetic clades containing a crocodilian β-keratin are expressed in the epidermal appendages studied, except for clade B2, which has only one member from each of the two crocodilian species. These data suggest that during crocodilian embryonic development epidermal appendages use a broad spectrum of β-keratins and that the combination of different proteins provides structural advantages. In birds, the avian subfamily of feather β-keratins is specifically expressed in feathers (Presland et al., '89a), indicating that, to some degree, β-keratin expression has become more specialized in birds and that the plesiomorphic integument of the ancestral archosaurs likely utilized the full repertoire of β-keratins.

Interestingly, three out of the five alligator β-keratins expressed in the cDNA libraries make up the alligator β-keratins of clade B3 (Fig. 2). Clade B3 contains one of the novel claw β-keratins, with the 26 amino acid insertion (Fig. 1A), that was expressed in all epidermal tissues studied from 17-day chick embryos. These results indicate that members of clade B3 are highly expressed in both crocodilians and birds.

Expression library searches in this and other studies (Dalla Valle et al., 2010; Greenwold and Sawyer, 2010) have identified several reptilian and avian β-keratins that are expressed in tissues not related to the integument. β-Keratin expression in tissues such as ovaries, testes, brain, and kidney indicates that β-keratins are more widely expressed than previously thought and may have novel functional roles in reptiles and birds (Dalla Valle et al., 2010; Greenwold and Sawyer, 2010). Interestingly, squamates, crocodilians, and birds appear to express β-keratins in the testes, indicating a conserved genic function in reptiles (Dalla Valle et al., 2010; Greenwold and Sawyer, 2010).

Expansion of β-Keratins in Archosaurians
The phylogeny of archosaurian β-keratins in this article clearly indicates gene diversification and multiple gene duplication events in the crocodilian and avian lineages, albeit to a much lesser extent in the alligator and crocodile. Therefore, the β-keratin repertoire of basal archosaurs was likely smaller than either of today's archosaurian lineages. Li et al. (2013) suggested that the most recent common ancestor of turtles and birds had <30 β-keratins. Based on our copy number data for crocodilians and our phylogenetic analysis, it seems likely that the ancestor of archosaurs and turtles had far fewer than 30 β-keratins.

The molecular evolution of β-keratins, based on our phylogeny (Fig. 2), indicates that avian β-keratins have evolved and diversified at an accelerated rate compared to crocodilian β-keratins. The expansion of β-keratins may not only be related to the evolution of feather morphology, but may also be related to the physical demands of flight (Greenwold and Sawyer, 2011). In fact, it has been shown that flight and the metabolic requirements associated with flight have resulted in major genome wide changes of flying amniotes (Zhang and Edwards, 2012). Furthermore, Li et al. (2013) suggest that the expansion of β-keratin genes contributed to the phenotypic differences between turtles and birds.

The feather (Clade C6, Fig. 2) and scale β-keratins (Clades C3 and C4, Fig. 2) along with 3 β-keratins from each crocodilian species (Clade C5, Fig. 2) form a monophyletic clade. Basal to these genes are another crocodilian specific clade (Clade C2, Fig. 2) and the avian and crocodilian claw clade containing the claw type β-keratins described by Presland et al. ('89b). Gregg et al. ('84) and Gregg and Rogers ('86) have previously suggested that the feather and feather-like β-keratins evolved, through the loss of the glycine and tyrosine rich repeats in the N-terminal domain, from scale β-keratins. The inclusion of the crocodilian β-keratins in this study's phylogenetic analysis demonstrates that avian feather and scale β-keratins likely evolved from archosaurian claw β-keratins and that the avian specific feather β-keratins are a novel subfamily found exclusively in birds, which first appeared ~124 mya (Greenwold and Sawyer, 2011). In general our observations support the conclusions of Greenwold and Sawyer (2010, 2011) that the genome of early archosaurs contained a cluster of linked β-keratins closely related to the keratinocyte, claw and scale genes seen in today's crocodilians and birds.

The molecular evolution of archosaurian β-keratins has led to the avian specific feather β-keratins that have a generally smaller C-terminus and a high N-terminus content of cysteine. β-Keratins form repeating units of perpendicular dyads arranged on a helix. Cysteine has been shown to oxidize and form cross-links between pairs of β-keratins (Fraser and Parry, 2011b). The combination of a high feather β-keratin copy number and high cysteine concentration may enable the production of multiple feather dyads. A large number of feather β-keratin dyad combinations would seemingly convey numerous structural characteristics that would help explain the diversity of feathers seen in today's birds.

A long held view has been that avian scales (and feathers) developed from reptilian scales. However, it has been suggested that the scales of birds evolved separately and distinctly in the avian lineage and did not evolve from reptilian scales (Sawyer et al., 2005; Dhouailly, 2009). The development of avian scales, like that of feathers, involves the formation of a placode (Sawyer, '72, '83; Maderson and Sawyer, '79; Dhouailly, '84), which is absent in reptilian scale development. Also, studies on chicken mutants have shown that avian scutate scale development on the legs of chickens requires the repression of feather development, leading to the hypothesis that avian scales are secondarily derived structures (Brotman, '77; Sawyer and Knapp, 2003). This view is supported by the discovery of the “four-winged” theropod, Anchiornis huxleyi, and the “hind wings” on 11 basal bird specimens (Hu et al., 2009; Zheng et al., 2013). The present study has demonstrated that avian scale and feather β-keratins evolved in the avian lineage. Therefore, we suggest that the molecular evolution of β-keratins led to the formation of the novel avian scale and feather β-keratins and that the expansion of feather β-keratins accompanied the evolution of feathers.

ACKNOWLEDGMENTS
We thank all of the members of the International Crocodilian Genomes Working Group for their work on the publication of the crocodilian genomes, Louis Guillette and his lab at the Medical University of South Carolina for kindly providing alligator tissue samples, Rich Goodwin at the University of South Carolina School of Medicine for kindly providing chicken embryos and Jeff Dudycha and his lab at the University of South Carolina for their expertise and assistance with our expression analyses. We also thank our anonymous reviewers for their helpful suggestions and comments.

LITERATURE CITED

Aberer AJ, Krompass D, Stamatakis A. 2013. Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Syst Biol 62:162–166.

Alföldi J, Di Palma F, Grabherr M, et al. 2011. The genome of the green anole lizard and a comparative analysis with birds and mammals. Nature 477:587–591.

Alibardi L, Sawyer RH. 2002. Immunocytochemical analysis of beta (β) keratins in the epidermis of chelonians, lepidosaurians, and archosaurs. J Exp Zool 293:27–38.

Alibardi L, Thompson MB. 2002. Keratinization and ultrastructure of the epidermis of late embryonic stages in the alligator (Alligator mississippiensis). J Anat 201:71–84.

Alibardi L, Toni M. 2008. Cytochemical and molecular characteristics of the process of cornification during feather morphogenesis. Prog Histochem Cytochem 43:1–69.

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

Baden HP, Maderson PFA. 1970. Morphological and biophysical identification of fibrous proteins in the amniote epidermis. J Exp Zool 174:225–232.

Barnes GL. 1993. Differentiation of the embryonic skin of the chicken [PhD Thesis]. Columbia, SC: University of South Carolina.

Brotman H. 1977. Epidermal–dermal tissue interactions between mutant and normal embryonic back skin: site of mutant gene activity determining abnormal feathering is in the epidermis. J Exp Zool 200:243–258.

Chiari Y, Cahais V, Galtier N, Delsuc F. 2012. Phylogenomic analyses support the position of turtles as the sister group of birds and crocodiles (Archosauria). BMC Biol 10:65.

Crawford NG, Faircloth BC, McCormack JE, et al. 2012. More than 1000 ultraconserved elements provide evidence that turtles are the sister group of archosaurs. Biol Lett 8:783–786.

D'Alba L, Saranathan V, Clarke JA, et al. 2011. Colour-producing β-keratin nanofibres in blue penguin (Eudyptula minor) feathers. Biol Lett 7:543–546.

Dalla Valle L, Nardi A, Gelmi C, et al. 2009a. β-Keratins of the Crocodilian epidermis: composition, structure, and phylogenetic relationships. J Exp Zool (Mol Dev Evol) 312B:42–57.

Dalla Valle L, Nardi A, Toni M, Emera D, Alibardi L. 2009b. Beta-keratins of turtle shell are glycine-proline-tyrosine rich proteins similar to those of crocodilians and birds. J Anat 214:284–300.

Dalla Valle L, Nardi A, Bonazza G, et al. 2010. Forty keratin associated β-proteins (β-keratins) form the hard layers of scales, claws, and adhesive pads in the green anole lizard, Anolis carolinensis. J Exp Zool (Mol Dev Evol) 314B:11–32.

Dhouailly D. 1984. Specification of feather and scale patterns. In: Malacinski GM, Bryant SW, editors. Pattern formation. New York: Macmillan Publ. p 581–601.

Dhouailly D. 2009. A new scenario for the evolutionary origin of hair, feather, and avian scales. J Anat 214:587–606.

Ferguson MJ. 1985. The reproductive biology and embryology of crocodilians. In: Gans C, Billet F, Maderson PFA, editors. Biology of the reptilia. New York: John Wiley and Sons. p 329–491.

Fraser RDB, MacRae TP. 1976. The molecular structure of feather keratin. In: Proc 16th Int. Ornith. Congress, Canberra, p 443–451.

Fraser RDB, Parry DAD. 2008. Molecular packing in the feather keratin filament. J Struct Biol 162:1–13.

Fraser RDB, Parry DAD. 2011a. The structural basis of the filament–matrix texture in the avian/reptilian group of hard β-keratins. J Struct Biol 173:391–405.

Fraser RDB, Parry DAD. 2011b. The structural basis of the two-dimensional net pattern observed in the X-ray diffraction pattern of avian keratin. J Struct Biol 176:340–349.

Fraser RDB, Parry DAD. 2012. The role of disulfide bond formation in the structural transition observed in the intermediate filaments of developing hair. J Struct Biol 180:117–124.

Fratini A, Powell BC, Rogers GE. 1993. Sequence, expression, and evolutionary conservation of a gene encoding a glycine/tyrosine-rich keratin-associated protein of hair. J Biol Chem 268:4511–4518.

Gregg K, Rogers GE. 1986. Feather keratin: composition, structure and biogenesis. New York: Springer-Verlag.

Greenwold MJ, Sawyer RH. 2010. Genomic organization and molecular phylogenies of the beta (β) keratin multigene family in the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata): implications for feather evolution. BMC Evol Biol 10:148.

Greenwold MJ, Sawyer RH. 2011. Linking the molecular evolution of avian beta (β) keratins to the evolution of feathers. J Exp Zool 316B:609–616.

Gregg K, Wilton SD, Parry DAD, Rogers GE. 1984. A comparison of genomic coding sequences for feather and scale keratins: structural and evolutionary implications. EMBO J 3:175–178.

Hall TA. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41:95–98.

Hartl M, Bister K. 1995. Specific activation in jun-transformed avian fibroblasts of a gene (bkj) related to the avian β-keratin gene family. Proc Natl Acad Sci U S A 92:11731–11735.

Hu D, Hou L, Zhang L, Xu X. 2009. A pre-Archaeopteryx troodontid theropod from China with long feathers on the metatarsus. Nature 461:640–643.

International Chicken Genome Sequencing Consortium. 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432:695–716.

Katsu Y, Bermudez DS, Braun EL, et al. 2004. Molecular cloning of the estrogen and progesterone receptors of the American alligator. Gen Comp Endocrinol 136:122–133.

Li YI, Kong L, Ponting CP, Haerty W. 2013. Rapid evolution of beta-keratin genes contribute to phenotypic differences that distinguish turtles and birds from other reptiles. Genome Biol Evol 5:923–933.

Maderson PFA, Sawyer RH. 1979. Scale embryogenesis in birds and reptiles. Anat Rec 193:609.

Marshall RC, Orwin DFG, Gillespie JM. 1991. Structure and biochemistry of mammalian hard keratin. Electron Microsc Rev 4:47–483.

Moll R, Franke WW, Schiller DL. 1982. The catalog of human cytokeratins: patterns of expression in normal epithelia, tumors and cultured cells. Cell 31:11–24.

O'Guin WM, Sawyer RH. 1982. Avian scale development. VII. Relationships between morphogenetic and biosynthetic differentiation. Dev Biol 89:485–492.

O'Guin WM, Knapp LW, Sawyer RH. 1982. Biochemical and immunohistochemical localization of alpha and beta keratin in avian scutate scales. J Exp Zool 220:371–376.

Powell BC, Rogers GE. 1997. The role of keratin proteins and their genes in the growth, structure and properties of hair. In: Jolles P, Zahn H, Hocker H, editors. Formation and structure of human hair. Basel, Switzerland: Birkhauser Verlag. p 59–148.

Powell BC, Nesci A, Rogers GE. 1991. Regulation of keratin gene expression in hair follicle differentiation. Ann N Y Acad Sci 642:1–20.

Presland RB, Gregg K, Molloy PL, et al. 1989a. Avian keratin genes, I. A molecular analysis of the structure and expression of a group of feather keratin genes. J Mol Biol 209:549–560.

Presland RB, Whitbread LA, Rogers GE. 1989b. Avian keratin genes, II. Chromosomal arrangement and close linkage of three gene families. J Mol Biol 209:561–576.

Prum RO, Dufresne ER, Quinn T, Waters K. 2009. Development of colour-producing β-keratin nanostructures in avian feather barbs. J R Soc Interface 6:S253–S265.

Ronquist F, Teslenko M, Van Der Mark P, et al. 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61:1–4.

Rozen S, Skaletsky HJ. 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S, editors. Bioinformatics methods and protocols: methods in molecular biology. Totowa, NJ: Humana Press. p 365–386.

Sawyer RH. 1972. Avian scale development I. Histogenesis and morphogenesis of the epidermis and dermis during formation of the scale ridge. J Exp Zool 181:365–384.

Sawyer RH. 1983. The role of epithelial–mesenchymal interactions in regulating gene expression during avian scale morphogenesis. In: Sawyer RH, Fallon JF, editors. Epithelial–mesenchymal interactions in development. New York: Praeger. p 115–146.

Sawyer RH, Knapp LW. 2003. Avian skin development and the evolutionary origin of feathers. J Exp Zool 298B:57–72.

Sawyer RH, Glenn T, French JO, et al. 2000. The expression of beta (β) keratins in the epidermal appendages of reptiles and birds. Am Zool 40:530–539.

Sawyer RH, Rogers L, Washington L, Glenn TC, Knapp LW. 2005. Evolutionary origin of the feather epidermis. Dev Dyn 232:256–267.

St. John JA, Braun EL, Isberg SR, et al. 2012. Sequencing three crocodilian genomes to illuminate the evolution of archosaurs and amniotes. Genome Biol 13:415.

Thompson JD, Higgins DG, Gibson TJ. 1996. CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680.

Toni M, Dalla Valle L, Alibardi L. 2007. Hard (beta-)keratins in the epidermis of reptiles: composition, sequence, and molecular organization. J Proteome Res 6:3377–3392.

Vandebergh W, Bossuyt F. 2012. Radiation and functional diversification of alpha keratins during early vertebrate evolution. Mol Biol Evol 29:995–1004.

Vanhoutteghem A, Londero T, Ghinea N, Djian P. 2004. Serial cultivation of chicken keratinocytes, a composite cell type that accumulates lipids and synthesizes a novel β-keratin. Differentiation 72:123–137.

Warren WC, Clayton DF, Ellegren H, et al. 2010. The genome of a songbird. Nature 464:757–762.

Whitbread LA, Gregg K, Rogers GE. 1991. The structure and expression of a gene encoding chick claw keratin. Gene 101:223–229.

Wu P, Jiang T, Suksaweang S, Widelitz RB, Chuong C. 2004. Molecular shaping of the beak. Science 305:1465–1466.

Wu D, Irwin DM, Zhang Y. 2008. Molecular evolution of the keratin associated protein gene family in mammals, role in the evolution of mammalian hair. BMC Evol Biol 8:241.

Ye C, Wu X, Yan P, Amato G. 2010. β-Keratins in crocodiles reveal amino acid homology with avian keratins. Mol Biol Rep 37:1169–1174.

Zhang Q, Edwards SV. 2012. The evolution of intron size in amniotes: a role for powered flight? Genome Biol Evol 4:1033–1043.

Zheng X, Zhou Z, Wang X, et al. 2013. Hind wings in basal birds and the evolution of leg feathers. Science 339:1309–1312.

SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article at the publisher's web-site.
