J. Mol. Biol. (1989) 209, 549-559

Avian Keratin Genes I. A Molecular Analysis of the Structure and Expression of a Group of Feather Keratin Genes

Richard B. Presland†, Keith Gregg‡, Peter L. Molloy§, C. Phillip Morris‖, Lesley A. Crocker and George E. Rogers¶

Commonwealth Centre for Gene Technology, Department of Biochemistry, University of Adelaide, Adelaide, South Australia 5001, Australia

(Received 15 July 1988, and in revised form 12 June 1989)

The nucleotide sequence of the four complete chicken feather keratin genes A to D contained in the previously isolated recombinant λCFK1 has been determined. All four genes have a very similar structure; each gene encodes a polypeptide of 97 amino acid residues and contains an intron in the 5’ non-coding region, 37 base-pairs from the cap site. Comparison of the previously determined feather keratin gene C sequence to genes A, B and D indicates that a high level of gene correction has occurred in the protein coding and 5’ non-coding regions, which show more than 90% homology, whereas the intron and 3’ non-coding regions are by contrast poorly conserved with one or two exceptions. The dramatic conservation of the 5’ non-coding region between the feather keratin sequences and an unrelated but co-expressed gene encoding a histidine-rich protein suggests that this segment may play an important role in transcriptional regulation. In addition, both gene types contain an identically positioned intron in the 5’ non-coding region.

Northern blots performed using gene-specific probes show that the four characterized genes A to D plus gene E, which is partially contained in the recombinant λCFK1, are all expressed in feather tissue from 14-day old chick embryos. In addition, we report that a scale keratin gene (originally isolated from a scale complementary DNA library) is expressed at a low level in the embryonic feather.

† Present address: Department of Periodontics and Oral Biology, University of Washington, Seattle, WA 98195, U.S.A.

‡ Present address: Department of Biochemistry, Microbiology and Nutrition, University of New England, Armidale, NSW 2351, Australia.

§ Present address: CSIRO Division of Biotechnology, Laboratory of Molecular Biology, PO Box 184, North Ryde, NSW 2113, Australia.

‖ Present address: Department of Chemical Pathology, The Adelaide Children’s Hospital, North Adelaide, SA 5006, Australia.

¶ Author to whom all reprint requests should be addressed.

© 1989 Academic Press Limited

1. Introduction

The major gene product synthesized in the feather and other terminally differentiating epidermal tissues of birds, such as scale and claw, is an intracellular disulphide-linked protein, keratin. The avian β-keratins expressed in each of these tissues comprise a multigene family whose members are closely related but are distinct from mammalian α-keratins (Rogers, 1984; Gregg & Rogers, 1986). In the chicken, the feather keratins are the most complex of the β-keratin families, comprising about 20 proteins, which are co-ordinately synthesized during growth and differentiation in the embryonic feather (Kemp & Rogers, 1972; Walker & Rogers, 1976a,b). In addition, the feather and scale keratins show strong homology between their protein-coding regions and both gene types contain an intron in their 5’ non-coding regions (Wilton, 1983; Gregg et al., 1984), strengthening the earlier proposal by Spearman (1966) that scale genes may have given rise to feather genes during vertebrate evolution.

We have reported the isolation of the genomic clone λCFK1 from a chick λ Charon 4A library, using embryonic feather cDNA as a probe (Molloy et al., 1982). This clone contains four complete feather keratin genes and the 3’ end of a fifth, which are evenly spaced about 3 kb† centre-to-centre and have the same transcriptional orientation (see Fig. 1). The tandem arrangement of these genes suggests that they evolved from a single ancestral gene by duplication and subsequent sequence divergence. In many multigene families, gene correction mechanisms such as gene conversion (Baltimore, 1981; Jackson & Fink, 1981; Klein & Petes, 1981) have operated to preserve sequences in the coding and non-coding regions that are important for the structure of the encoded proteins and the regulation of gene expression, respectively. In order to identify sequences in the feather keratin genes that may be important for these processes, we have sequenced the four complete feather keratin sequences (genes A to D) represented in the recombinant λCFK1. We report here that the salient structural features described previously for feather keratin gene C (Molloy et al., 1982) are common to all four genes and that there is a remarkable degree of sequence conservation in both the protein coding and some parts of the non-coding regions. All four feather keratin genes appear to encode a functional gene product and, indeed, the four sequenced genes plus gene E, which is partially contained in λCFK1, are all expressed in embryonic feather tissue. In addition, we have found that a scale keratin gene identical with one previously isolated from a scale cDNA library (Wilton et al., 1985) is also expressed in the chick feather.

2. Materials and Methods

(a) Chemicals and enzymes

Restriction enzymes were purchased from New England Biolabs or Boehringer-Mannheim and digestions were performed under the conditions specified by the manufacturers. Calf intestinal phosphatase was obtained from Boehringer-Mannheim. DNase I, DNA polymerase I, Klenow enzyme, polynucleotide kinase, phage T4 DNA ligase, [α-32P]dNTPs and [γ-32P]ATP were purchased from Biotechnology Research Enterprises of South Australia (Bresatec). Agarose was obtained from BRL and deionized glyoxal was a gift from Dr Jim McInnes. Nitrocellulose was purchased from Schleicher and Schüll (BA85, 0.45 μm pore size).

(b) Keratin gene isolation and sequencing

The isolation and characterization of the chick genomic clone λCFK1 and the generation of chimeric plasmids containing each of the HindIII fragments that comprise the λCFK1 insert have been described by Molloy et al. (1982). Each of the pBR322 subclones containing genes A, B and D was mapped using both 4- and 6-base restriction enzymes. The DNA fragments were subcloned into appropriate M13 vectors (Messing & Vieira, 1982; Norrander et al., 1983) and then sequenced by the dideoxy chain termination method (Sanger et al., 1977, 1980) using the kit provided by Bresatec.

DNA sequence data was compiled and analysed on a VAX 11-785 computer using the ANALYSEQ programs of Staden (1984a,b) and the IDEAS programs of Kanehisa (1982, 1984).

† Abbreviations used: kb, 10^3 base-pairs; bp, base-pairs; HRP, histidine-rich protein; cDNA, complementary DNA.

(c) RNA preparation

Total cellular RNA was isolated from the feathers of 14-day old embryos of White Leghorn chickens (Gallus domesticus) by a modification of the method of Brooker et al. (1980). Feather tissue plucked from about 20 embryos was washed twice in isotonic saline to remove any blood and cell debris and then homogenized in 50 ml of ice cold 6 M-guanidine·HCl, 0.2 M-sodium acetate (pH 5.2), 10 mM-β-mercaptoethanol using a motor-driven Potter-Elvehjem Teflon homogenizer. Two vol. pre-chilled ethanol were added to the homogenate and, after standing for at least 2 h at -20°C, the mixture was centrifuged at 12,000 revs/min for 20 min at 4°C to recover the precipitate. The pellet was resuspended in about 30 ml of 6 M-guanidine·HCl, 0.2 M-sodium acetate (pH 5.2), 10 mM-EDTA and again precipitated with ethanol as described above. The pellet was resuspended in 30 ml of freshly made 7 M-urea, 0.1 M-Tris·HCl (pH 8.5), 0.1 mM-EDTA, 0.1% (w/v) SDS and extracted twice with an equal volume of phenol/chloroform (1:1), once with chloroform and then ether and finally precipitated with ethanol.

To remove high molecular weight DNA from the preparation, a LiCl precipitation was performed according to the procedure of Diaz-Ruiz & Kaper (1978). RNA concentrations were estimated by measuring the absorbance at 260 nm and samples assessed for their integrity by agarose gel electrophoresis.

(d) Preparation of oligonucleotide probes

Oligonucleotides (20-mers) were synthesized by Bresatec using an Applied Biosystems Model 380B DNA Synthesizer. The oligonucleotides were designed from highly divergent portions of the 3’ non-coding region of each gene so as to be gene-specific. The sequences complementary to those underlined in Fig. 2 were chosen so as to hybridize to mRNA transcripts specific for that gene. The gene E oligonucleotide had the following sequence: 5’ CTGGCTAGCAACTCCATCCT 3’.
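As a quick check on the gene E probe quoted above, the following sketch derives the mRNA-sense sequence it should anneal to (its reverse complement), its G+C content and a rough melting temperature. The Wallace rule used for the melting temperature (2°C per A or T, 4°C per G or C) is an assumption introduced here for illustration and is not stated in the paper; the function names are hypothetical.

```python
# Illustrative sketch only. Helper names are hypothetical, and the
# Wallace-rule Tm estimate is an assumption, not the authors' method.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Gene E probe, written 5' to 3', exactly as quoted in the text.
GENE_E_PROBE = "CTGGCTAGCAACTCCATCCT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough Tm in degrees C for a short oligonucleotide (Wallace rule)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print("length:", len(GENE_E_PROBE))
    print("target (mRNA sense, as DNA):", reverse_complement(GENE_E_PROBE))
    print("G+C fraction:", gc_fraction(GENE_E_PROBE))
    print("Wallace Tm (deg C):", wallace_tm(GENE_E_PROBE))
```

The same helpers could be applied to the other four probes once their sequences are read from the underlined regions of Fig. 2.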

The oligonucleotides (0.1 to 0.5 μg) were labelled with [γ-32P]ATP using polynucleotide kinase (Maniatis et al., 1982). The products were fractionated on a 20% polyacrylamide gel and the kinased DNA excised and eluted overnight at 37°C in 0.3 M-sodium acetate, 0.1% SDS.

To ensure that the oligonucleotides were gene-specific, they were compared against all known chick keratin sequences prior to synthesis using the SEQH program (Kanehisa, 1982, 1984) and the labelled oligonucleotides blotted to HindIII-digested λCFK1 DNA and cosmid clones containing a large proportion of the genes in the feather keratin gene family (Presland et al., 1989). All 5 oligonucleotide probes were found to be specific for the HindIII restriction fragment containing that gene (data not shown).

(e) Northern blot analysis

Glyoxylated RNA samples (5 to 10 μg) were fractionated on 1.2% horizontal agarose gels and transferred to nitrocellulose exactly as described by Thomas (1983), except that DMSO was omitted from the denaturation step. Prior to hybridization, the filters were washed in boiling 20 mM-Tris·HCl (pH 8.0) for 5 min to remove residual glyoxal (Thomas, 1983).

The Northern filters were hybridized with the gene-specific oligonucleotide probes at 42°C as follows. The filters were prehybridized for at least 2 h in 6 × NET (1 × NET = 0.15 M-NaCl, 15 mM-Tris·HCl (pH 7.5), 1 mM-EDTA), 5 × Denhardt’s (Maniatis et al., 1982), 0.5% NP-40 (Shell Chemicals) and 200 μg salmon sperm DNA/ml. Hybridizations were carried out for 16 to 24 h in 6 × NET, 1 × Denhardt’s, 0.5% NP-40 and 100 μg salmon sperm DNA/ml. The oligonucleotides were denatured for 10 min at 65°C before setting up the hybridizations. The filters were washed at high stringency (2 × SSC, 0.1% SDS, 65°C) and autoradiographed at -80°C (1 × SSC is 0.15 M-NaCl, 15 mM-sodium citrate (pH 7.4)).

As an internal control experiment, the blots were washed to remove the oligonucleotide probes and the filters re-hybridized with a chick β-actin cDNA clone (Cleveland et al., 1980). The filters were prehybridized and hybridized as described for Southern blots (Wahl et al., 1979; Meinkoth & Wahl, 1984), washed at high stringency (0.1 × SSC, 0.1% SDS, 65°C) and autoradiographed at -80°C.
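The fold-concentrations quoted for NET and SSC translate directly into molarities. The short sketch below scales the 1 × recipes given in the text to the working strengths used for hybridization and washing; the dictionary and function names are hypothetical and serve only as an illustration.

```python
# Illustrative sketch only; names are hypothetical.
# 1 x recipes as defined in the text, in mol/l.
NET_1X = {"NaCl": 0.15, "Tris-HCl": 0.015, "EDTA": 0.001}
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def scale_buffer(recipe: dict, fold: float) -> dict:
    """Return component molarities for a buffer used at `fold` x strength."""
    # Rounding to 6 decimal places hides binary floating-point noise.
    return {name: round(molar * fold, 6) for name, molar in recipe.items()}


if __name__ == "__main__":
    print("6 x NET  :", scale_buffer(NET_1X, 6))
    print("2 x SSC  :", scale_buffer(SSC_1X, 2))
    print("0.1 x SSC:", scale_buffer(SSC_1X, 0.1))
```

The 0.1 × SSC wash used for the β-actin probe has twenty-fold less salt than the 2 × SSC wash used for the oligonucleotide probes, which is the usual sense in which a lower-salt wash at the same temperature is the more stringent one.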

3. Results

(a) Sequence analysis of the feather keratin genes

The genomic clone λCFK1 contains four complete feather keratin genes (genes A to D) as well as the 3’ non-coding region and 50 bp of the protein-coding region of a fifth gene (gene E). Figure 1 shows an EcoRI and HindIII restriction map of this clone (revised from Molloy et al., 1982), indicating the position of the five genes. Restriction mapping data and sequence analysis of all five sequences demonstrated that they are evenly spaced at about 3 kb centre-to-centre and have the same transcriptional orientation.

Figure 2 shows the nucleotide sequence of feather keratin gene C described by Molloy et al. (1982) aligned against genes D, B and A. All of the completely sequenced feather keratin genes from the clone λCFK1 have the same general structure. The possible significance of the observed DNA sequence conservation in the different parts of the genes is discussed below.

Figure 1. Arrangement of the feather keratin genes in the genomic clone λCFK1 (revised from Molloy et al., 1982). The exact position of genes A to E was derived from the DNA sequence data with the position of genes A, D and E being changed in relation to the restriction sites shown. The position of the EcoRI (E) and HindIII (H) restriction enzyme sites are indicated. The EcoRI sites at each end of the clone (E*) were added to the chick genomic DNA during the construction of the library (Dodgson et al., 1979).

(i) 5’ Flanking and non-coding regions

Each of the genes contains a TATA box at about -30 relative to the cap site, although gene C has the slightly altered sequence CATA, as do some α- and β-globin genes (Efstratiadis et al., 1980). The TATA sequence has been implicated in determining the rate and accuracy of transcriptional initiation (Buratowski et al., 1988) and the substitution in gene C of the first T residue with a C is of uncertain significance. Only two of the feather keratin genes in the λ recombinant (genes A and B) have been sequenced upstream from the TATA box and these data have been presented and discussed elsewhere (Gregg & Rogers, 1986). However, it should be noted here that, while there is considerable DNA sequence homology between the two genes (e.g. 69% between -300 and the cap site), it is difficult to determine without the appropriate in vitro mutagenesis studies whether the observed homology is a result of tandem gene duplication or due to conservation of sequences that are important for tissue-specific expression of the keratin genes.

The feather keratin genes have a 5’ non-coding region of either 58 bp (genes B to D) or 59 bp (gene A). In all four genes, this sequence is interrupted by an intron that lies between bases 37 and 38 of the mature transcript (Fig. 2). The sequence of the 37 bp 5’ non-coding exon is rigidly conserved, with only one base change throughout the four known sequences (Fig. 2). The remaining 21 bp (genes B to D) or 22 bp (gene A) between the 3’ splice site and the ATG initiation codon are less strongly conserved, with an average of only 80% similarity among all four sequences.

The sequences of the feather keratin genes immediately upstream from the initiation codon (5’ A/GCCATG 3’) fit the consensus sequence for efficient translational initiation described by Kozak (1986). Of the bases immediately preceding the initiation codon, a purine three nucleotides upstream is thought to be the most crucial.
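The purine requirement at position -3 can be expressed as a one-line test on any sequence that ends with the initiation codon. The sketch below is a minimal illustration under that reading of the Kozak context; the function name is hypothetical, and the only sequences used are the A/GCCATG context given in the text, the gene C context read from Fig. 2, and one made-up counter-example.

```python
# Illustrative sketch only; the function name is hypothetical.
PURINES = {"A", "G"}


def has_purine_at_minus3(context: str) -> bool:
    """Check position -3 of a sequence that ends with the ATG codon.

    The A of ATG is +1, so -1, -2 and -3 are the three bases immediately
    upstream; position -3 is the sixth base counted from the 3' end.
    """
    s = context.upper()
    if len(s) < 6 or not s.endswith("ATG"):
        raise ValueError("context must end with ATG and include 3 upstream bases")
    return s[-6] in PURINES


if __name__ == "__main__":
    # "CAGCCATG" is the gene C context as printed in Fig. 2.
    # "TCCATG" is a made-up counter-example with a pyrimidine at -3.
    for context in ("GCCATG", "ACCATG", "CAGCCATG", "TCCATG"):
        print(context, has_purine_at_minus3(context))
```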

(ii) Intron

The intron in the feather keratin genes varies in size between 324 bp (gene C) and 341 bp (gene A) (Fig. 2). There are two splice junctions in each gene of the sequence 5’ AGGT 3’ (Breathnach & Chambon, 1981), and the strict conservation of the DNA sequence around splice sites extends inside the intron by 9 bp at the 5’ end and 14 bp at the 3’ end (Fig. 2). Beyond these points there is a drop in sequence similarities although there are several small blocks of sequence, particularly in the middle portion of the intron, which show 90 to 100% conservation between the four genes (e.g. bases 188 to 192, 206 to 225, 283 to 295; Fig. 2). The functional significance of these sequences, if any, is not known. An additional sequence element known as the 3’ splice signal, which is involved in pre-mRNA splicing (‘lariat’ formation), has been identified in a wide variety of vertebrate genes (Keller & Noon, 1984). This element (CTPuAPy, where Pu is a purine and Py is a pyrimidine) is present in all four feather keratin genes 25 to 30 bp upstream from the 3’ splice site (underlined in Fig. 2), albeit in a slightly altered form in some cases.

Figure 2. Aligned nucleotide sequences of the 4 complete feather keratin genes from the clone λCFK1. The complete nucleotide sequence of gene C (Molloy et al., 1982) is shown at the top of each line, while the sequence of the other 3 genes is presented only where they differ from gene C. The feather keratin protein encoded by gene C is shown above its nucleotide sequence including the Met initiation codon. Gaps were introduced to maximize sequence homology among the genes. Dots are included at 10-base intervals. The following features are underlined in the DNA sequence(s): TATA box (2 to 5), cap site (34), 5’ splice site (69 to 72), 3’ splice signal (390 to 400), 3’ splice site (424 to 427), gene-specific oligonucleotide probes (805 to 830), poly(A) signal (1184 to 1189) and poly(A) addition site (1211).
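The coordinates listed in the legend to Fig. 2 can be checked against the printed sequence. The sketch below takes the last row of gene C from Fig. 2 (alignment positions 1171 to 1230, which contains no gap characters), locates the AATAAA poly(A) signal and the CAPyTG motif, and measures the distance from the signal to the poly(A) addition site at 1211. The variable and function names are hypothetical, and the use of regular expressions is a choice made here for illustration.

```python
# Illustrative sketch only; names are hypothetical.
import re

# Gene C, alignment positions 1171 to 1230, as printed in Fig. 2.
GENE_C_3PRIME = "CTTCTTTGAGCTCAATAAAATTTATGCTGCATTGTAATCTCAGTCTCCTCATGTTTCTTG"
FIRST_POSITION = 1171
POLY_A_SITE = 1211  # poly(A) addition site given in the Fig. 2 legend


def find_motif(pattern: str, seq: str = GENE_C_3PRIME):
    """Return the alignment position of the first match, or None."""
    match = re.search(pattern, seq)
    return None if match is None else FIRST_POSITION + match.start()


def bases_at(start: int, end: int, seq: str = GENE_C_3PRIME) -> str:
    """Return the bases between two alignment positions, inclusive."""
    return seq[start - FIRST_POSITION : end - FIRST_POSITION + 1]


if __name__ == "__main__":
    signal = find_motif("AATAAA")
    print("AATAAA starts at:", signal)
    print("CAPyTG starts at:", find_motif("CA[CT]TG"))
    print("signal to poly(A) site (bp):", POLY_A_SITE - signal)
    print("terminal dinucleotide:", bases_at(POLY_A_SITE - 1, POLY_A_SITE))
    print("next base in the gene:", bases_at(POLY_A_SITE + 1, POLY_A_SITE + 1))
```

Because this row of the alignment has no gaps, differences between alignment positions equal distances in bases; that would not hold for rows containing gap characters.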

(iii) Protein coding region

The protein coding regions of the four characterized genes are all precisely 297 bp long, including the ‘start’ and ‘stop’ (TAA) codons, producing polypeptides of 97 amino acid residues that are remarkably alike (Figs 2 and 3). Third base preference is strongly biased towards C (50.3%), with G and T at approximately random frequencies (25.5% and 20.1%, respectively) and with A residues severely limited (3.6%). Table 1 shows the number of silent and replacement base changes that have occurred between the genes. The average coding sequence similarity can be calculated as 93.6%, which corresponds with amino acid conservation (average) of 95%.
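The percentage column of Table 1 and the residue count quoted above follow from simple arithmetic on the 297 bp coding length. The sketch below recomputes both from the silent and replacement counts as read from Table 1. Two points are assumptions made here rather than statements from the paper: that the percentage is the total number of changes divided by 297 bp, and that the 97 residues exclude both the stop codon and the initiator Met. The names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical.
from statistics import mean

CODING_LENGTH_BP = 297  # includes the start and stop codons (from the text)

# (silent, replacement) base changes for each gene pair, read from Table 1.
TABLE_1 = {
    "A/B": (17, 8),
    "A/C": (8, 9),
    "A/D": (10, 12),
    "B/C": (10, 6),
    "B/D": (13, 8),
    "C/D": (7, 5),
}


def percent_change(silent: int, replacement: int,
                   length: int = CODING_LENGTH_BP) -> float:
    """Total base changes as a percentage of the coding length (assumed)."""
    return round(100 * (silent + replacement) / length, 1)


def residues_encoded(length_bp: int = CODING_LENGTH_BP) -> int:
    """Codons minus the stop codon and, by assumption, the initiator Met."""
    return length_bp // 3 - 2


if __name__ == "__main__":
    for pair, (silent, replacement) in TABLE_1.items():
        print(pair, silent + replacement, percent_change(silent, replacement))
    average = round(mean(percent_change(s, r) for s, r in TABLE_1.values()), 2)
    print("mean % change:", average)
    print("residues encoded:", residues_encoded())
```

Under these assumptions the mean change comes out at 6.33%, i.e. an average similarity close to the 93.6% quoted in the text.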

(iv) <?’ Non-coding region The 3’ non-coding regions range in length from

431 bp (gene 1)) t’o 450 bp (gene A) and over most of

that length show a comparatively high degree of sequence divergence, apparentSly even more than is observed in the intron (Fig. 2). However, there are two portions of the 3’ non-coding region that are very well conserved between these genes: a region of about 50 bp around the polyadenylation signal and an 18 bp sequence in the middle of the 3’ non-coding sequence. The five genomic 3’ non-coding sequences isolated in the clone lCFK1 and t’he feather keratin cDNA clone pCFK23 (Morris, 1984) all contain the AATAAA and CAPyTG motifs (underlined in Figs 2 and 4), which are required for the correct S/-end processing and polyadenylation of eukaryotir mRNA. These two sequence elements are comple- mentary to regions of small nuclear RNA U4 and it has been proposed that this molecule is involved in correct 3’ terminus formation (Berget, 1984; Gil & Proudfoot, 1984). In all feather keratin genes sequenced to date, the 5’ A residue of the polyade- nylation signal is thought to be 27 bp from the point of polyadenylation, as judged by the excellent homology between the five genomic sequences and the cUNA clone pCFK23 (Fig. 4). However, that precise point may be slightly equivocal because

554 R. B. Presland et al.

97 Gene C SBR SBR GLY GLY PHB GLY ILE SER GLY LEU GLY SER ARG PHE SER GLY ARG ARG CYS LEU PRO CYS

D B CYS A SBB TYR

Adult --- TYB PRO

Figure 3. Predicted amino acid sequences of feather kerat,in proteins encoded by genes A to 1) of ICFKI compared to an adult chicken feather keratin protein (Arai et ~1.. 1983). Amino acid positions in the other 4 sequemes that vary from the llrotein sequence derived from gene ( are shown in bold. The dotted line represents an apparent deletion or insertion difference between adult and embrvonic nroteins. Arrows represent a segment that could form a regular /?-sheet csonfommation (revised from Rogers, i984). ’

there is an A residue in the gene sequencrs immedi- ately adjacent, to what is assumed to he the terminal TC’ dinucleotide. A third pyrimidine-rich sequence (YGTGTTYY) reported to be important for effi- cient formation of 3’ termini in herpes simplex virus genes and subsequently observed in a large number of eukaryotic genes (McLauchlan et al., 1985) is present in the feather keratin genes (underlined in Fig. 4). This, and other, pyrimidine-rich sequences are likely to be involved in the cleavage and processing of nuclear RNAs that extend beyond the poly(A) site (see McLauchlan et al., 1985).

The other conserved portion is an 18 bp sequence t’hat lies near the middle of the 3’ non-coding region (bases 1010 to 1027; Fig. 2). No function has been established for this sequence although its strict conservation strongly suggests that it does have some function.

region of each of the five genes (underlined in Fig. 2. see also Materials and Methods). A series of embryonic chick feather RNA Northern blots were hybridized with these probes (Fig. 5(a)). All five genes are expressed in the embryonic chick feather, albeit at somewhat varying levels. No hybridization was seen to an equal amount of chick liver RNA by any of the probes (data not shown). As a positive control to check that each filter had an equivalent amount of RNA, the oligonucleotide probes were washed off and the filters rehybridized with a chicken /?-a&in cDNA clone (Cleveland et al., 1980) t’hat is expressed at a constant level (Fig. 5(b)). Densitometric scanning of the Northern auto- radiograms indicates that transcript levels of the individual feather keratin mRNAs vary by ‘to-fold.

(b) Expression of the ICIFKl keratin genes in the chick embryo

(c) E.xpression of a scalp keratin gene in the embryonic fcvzther

Grne-specific! oligonucleotides were prepared from poorly conserved portions of the 3’ non-coding

Sequence analyses of a number of cDNA clones isolated from a feather cDEA library constructed by Saint (1979) indicated that’, besides containing a

7 25 Gene C SER CYS PHE ASP LEU CYS ARG PRO CYS GLY PRO THR PRO LEU ALA ASN SBR CYS ASN GLU PRO CYS VAL ARG GLN

D ALA B TYR A TYR S8R ALA

Adult

50 Gene C CYS GLN ASP SER ARG VAL VAL ILE GLN PRO SER PRO VAL VAL VAL THR LEU PRO GLY PRO ILE LBU SER SER PHE

D B A

Adult

v 75 Gene C PRO GLN ASN THR ALA ALA GLY SER SER THR SER ALA ALA VAL GLY SER ILE LEU SER GLU GLU GLY VAL PRO ILE

tl LEO VAL B VAL GLU A VAL

Adult VAL

Table 1 Xucleotide sequence variability among the coding portions of feather keratin genes

(kw?s Silent Replacement Total 9; change

A/l3 17 8 2.5 84 A/C’ 8 9 17 .57

A/D 10 12 22’ 74

H/C’ 10 6 16 54

13/T) 1 3 8 21 7.1

c’/n 7 5 I” 1fb

Mcao 11 8 19 6.33 * .

Th c protein coding regions of feather keratin genes A to D (Fig. 2) were compared in pairwise combinations and the number of base changes, both silent (leading to no amino acid change) and replacement (leading to an amino acid change). arc tabulated.

A&an Keratin Genes 555

Poly( A) Signal v Gene C AAT-TTTATGCTGCATTGTAATCTC AGTCTCCTCATGTTTCTTG

Gene D T T CA T TG c c

Gene B G A TG T

Gene A T T CA T TG c c Gene E G GC T TGTG TC T

****** *********************

pCFK23 AAT-TTTATGCTGCATTGTAATCTCTC AAARAAAAAARAAAAAAAA

Figure 4. Comparison of the 3’ ends of the sequenced feather keratin genes with the cDNA clone pCFK23. The nucleotide sequence at the 3’ end of gene C is shown, while the sequence of the other genes is displayed only where they differ from gene C. Sequence identity between 3 or more genes with pCFK23 (Morris, 1984) is indicated by stars (*). The arrow shows the presumed point of polyade- nylation of each of these mRNAs. The sequences under- lined are those consensus sequences thought to be asso- ciated with 3’ terminus formation (see the text’).

considerable number of feather keratin and HRP (fast protein) cDNA clones (see Discussion), the library also contained at least one scale cDNA clone. This scale clone, pCFK15, was identical with the scale cDNA clone, PCSKIZ, isolated from a scale cDNA library by Wilton et al. (1985), except that they had different 5’ termination points due to incomplete cDNA copying of the scale mRNA. To demonstrate directly that scale keratin sequences were present in embryonic feather mRNA, a Northern blot of total embryonic feather RNA was probed with a coding fragment from pCFK15, i.e. the 39 bp repeat sequence that is specific for scale keratins ((Gregg et al., 1984; Wilton et al., 1985).

B C D E

Figure 5. Northern blot analysis of embryonic feather RNA using probes specific for each of the feather keratin genes from the genomic clone ICFKl. (a) Total RPU’A (5 pg) isolated from 14.day old embryonic feather tissue was denatured with glyoxal, fractionated on a 1.2% (w/v) agarose gel, transferred to nitroeellulose filters and hybridized with oligonucleotide probes specific for each of genes A to E (see Materials and Methods). (b) The blots were stripped with 0.1 x &SC, @l y0 SDS at 100°C to remove the hybridized oligonucleotides and re-probed wit’h a chick /I-artin cDNA clone (see Materials and Methods) to ensure that an equal amount of RNA was used in each experiment. The feather keratin and B-actin DNA probes hybridized to mRNAs of the sizes expected; that is. ti.50 bp and 1.9 kb. respectively.


Figure 6. Detection of sequences complementary to a scale-specific probe in embryonic feather mRNA. The Northern blot was performed exactly as described in the legend to Fig. 5 and the duplicate filters hybridized with: S, the 39 bp coding fragment from pCFK15 (resected with Sau96; Wilton et al., 1985) that contains the Gly-Gly-X region specific for scale keratins; and F, a restriction fragment from the 3’ non-coding region of gene B that specifically detects feather keratin mRNA of 850 bp.

Figure 6 shows that the scale-specific probe easily detected an mRNA that was marginally longer than the embryonic feather keratin mRNA of about 850 bp, a result consistent with the expected size of scale keratin mRNA (Wilton, 1983).

4. Discussion

(a) Analysis of the keratin genes of λCFK1

We present here the sequence characterization of four chick feather keratin genes contained in the recombinant λCFK1. The tandem arrangement of the genes in λCFK1 and their almost equidistant spacing along the DNA (Fig. 1) indicates that the feather keratin gene family arose by a recombination mechanism such as unequal crossing-over (Smith, 1976) estimated to have occurred in the last 120 million years (Gregg et al., 1984). Additional evidence has been presented, in the accompanying paper, that supports our contention that unequal crossing-over has been an important mechanism in the generation of this gene family (Presland et al., 1989). Subsequently, the genes have been allowed to diverge to some extent, particularly in the intron and 3’ non-coding regions, which presumably have less functional significance than the other parts. Thus, gene correction mechanisms such as gene conversion (Baltimore, 1981; Jackson & Fink, 1981; Klein & Petes, 1981) must have acted on the protein coding and 5’ non-coding regions to maintain sequence homogeneity, while allowing the other parts to diverge to some extent.

         Cap site                              Intron                   Initiation codon
Gene C   ATCCACTTCTCTTGCCTTCTCCTCCTTGGTGAACAAG GTCTACTCCCATCCTA-CAGCCATG
Gene D   - C CT C-CA -
Gene B   G G CT
Gene A   C CT A A CA
FP       ATTCGCTTCTCTCGAGTTCCTCTCCTCGGTGAACCGG GTTTCCCTCCAACA-ACCAGCAATG

Figure 7. Comparison of the 5’ non-coding regions of feather keratin and fast protein (HRP) genes. The nucleotide sequence of the 5’ end of gene C is shown, while the sequence of the other genes is displayed only where they differ from gene C. Sequence identity between 2 or more of these genes with the sequenced fast protein (FP) or HRP gene (Morris, 1984) is indicated by stars (*). The arrow marks the position of the single intron in the 5’ non-coding regions of these 3 gene families. The cap site and ATG initiation codons are underlined.

The most striking example of gene correction amongst the feather keratin genes is the 37 bp 5’ non-coding exon, which is more conserved than even the protein-coding region (Figs 2 and 3). The remaining 21 to 22 bp that lie 3’ to the splice site are less conserved (80% versus 99% for the 5’ exon; Fig. 7), implying that this sequence may have a less important role than the preceding part. The 5’ non-coding region may play a role in transcriptional and/or translational control of feather keratin gene expression. We believe that the production of such large amounts of feather keratin and histidine-rich protein (HRP†) mRNAs de novo (in the case of the feather keratins, more than 3 × 10^5 molecules per cell at day 15 of embryonic development (Powell et al., 1976)) would “swamp” the pre-existing mRNAs encoding housekeeping and other functions, and therefore there would not be a requirement for selective translation of these two families of mRNAs. This prediction is borne out by the study of Powell et al. (1976), which indicates that the overall levels of keratin protein synthesis in the feather are paralleled by protein levels. Therefore, we favour the alternative hypothesis that this 5’ sequence plays a role in transcriptional regulation. In relation to this, analysis of one of the HRP genes (which are co-ordinately expressed with the feather keratins) has shown that it has a 5’ non-coding region that is remarkably similar in both length and sequence to that of the feather keratins (Fig. 7). This 5’ non-coding sequence may be the site of interaction of a transcriptional signal derived from the developing (unkeratinized) epidermis or the underlying dermis.

† The HRPs or fast proteins comprise a family of 3 to 5 proteins, 119 amino acid residues in length, that are co-ordinately expressed with the embryonic feather keratins and comprise between 10 and 40% of total embryonic feather protein (Walker & Rogers, 1976a; Morris, 1984; Rogers, 1985). The HRP sequence deduced from DNA sequence analysis shows no amino acid homology to the feather keratins but does show some homology to the C-terminal Gly-Gly-X portion of scale keratin (15/43, data not shown), supporting the proposal by Morris (1984) that HRPs and keratins may have a common ancestry that pre-dated the scale to feather transition and presumably occurred just before the appearance of the first-known feathered creatures some 170 million years ago (Ostrom, 1976). The function of the HRPs is unclear but they are thought to act as an interfilamentous matrix or “glue” to bind the β-keratin microfibrils together, much like the intermediate filament-associated proteins do for the keratin intermediate filament proteins of hair and skin (Powell & Rogers, 1986; Dale et al., 1989).
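As an illustration of the kind of comparison summarized in Fig. 7, the following minimal sketch counts identical positions between the 37 bp 5’ non-coding exon of gene C and the corresponding segment of the fast protein (FP) gene. The two sequences are transcribed from Fig. 7 and the accuracy of that transcription is an assumption; the function name and the ungapped, position-by-position scoring are illustrative choices, not the method used by the authors.

```python
# 37 bp 5' non-coding exons (cap site to splice site), transcribed from Fig. 7.
# The transcription is assumed to be accurate; treat the result as illustrative.
GENE_C_EXON = "ATCCACTTCTCTTGCCTTCTCCTCCTTGGTGAACAAG"
FP_EXON = "ATTCGCTTCTCTCGAGTTCCTCTCCTCGGTGAACCGG"


def ungapped_identity(seq_a, seq_b):
    """Return (matches, length, percent identity) for two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (no gaps are handled)")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, len(seq_a), 100.0 * matches / len(seq_a)


matches, length, percent = ungapped_identity(GENE_C_EXON, FP_EXON)
print(f"{matches}/{length} identical positions ({percent:.1f}%)")
```

The segment 3’ to the splice site contains an alignment gap in Fig. 7, so it would need a gap-aware comparison and is left out of this sketch.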

The protein-coding regions of the four complete feather keratin genes show greater than 90% homology at both the DNA and protein levels and have been demonstrated to encode feather keratin proteins on the basis of the strong homology of the protein encoded by gene C (and the other 3 genes) with embryonic feather keratin carboxy- and N-terminal peptide sequences (Walker & Rogers, 1976b; Molloy et al., 1982) and the published adult fowl feather protein sequence (Fig. 3). The gene-derived protein sequences are very similar to that of the sequenced adult fowl protein (Arai et al., 1983), with the major difference being the deletion of a single arginine residue from the adult protein near its C terminus (Fig. 3).

The conservation of the protein-coding region is presumably related to protein function such as the requirement for the central portion of the molecule (residues 24 to 66) to form a β-pleated sheet (Gregg et al., 1984). Indeed, this portion is generally conserved in other chick keratin families, such as the scale and feather-like genes, as well as in feather proteins of other bird species such as the flightless emu (for comparisons, see Gregg & Rogers, 1986; Presland et al., 1989). However, it is clear that mutations have occurred in these multiple genes and that some have been acceptable in the amino acid sequence that forms the anti-parallel β-sheet structure of the feather microfibril (Fig. 3; Fraser et al., 1972). This is very apparent when the coding regions of the four sequenced genes are compared, which demonstrates that there is not a very strong bias toward silent base changes over replacement changes (Table 1). Because the proteins are very similar, it could be argued that the loss of one protein out of a family of about 20 (Presland et al., 1989) should not reduce the ability of the animal to produce a functional feather. However, the significant alteration of a single protein that might nevertheless be incorporated into the epidermal structure could disadvantage the organism significantly. Thus, alterations within the coding region may be selected against more rigorously than the complete deletion of a functional gene product. It is of interest that for each amino acid codon in the four coding regions, the third base preference is identical in most cases (Fig. 2). Similar observations have been made in studies on a number of multigene families (Slightom et al., 1980; Shen et al., 1981; Krawinkel et al., 1983; Powell et al., 1983; Iatrou et al., 1984), and this is probably a result of gene conversion between homologous members within a family. It is worth noting that, while gene conversion can homogenize dispersed genes as well as those arranged in tandem, this mechanism is predicted to be less efficient for dispersed gene families (Baltimore, 1981; Jackson & Fink, 1981; Shen et al., 1981).
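The distinction drawn above between silent and replacement changes can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and two aligned, in-frame coding sequences of equal length; it classifies whole codons that differ rather than individual base changes, and it is not the procedure used to compile Table 1. The function name and the two short test sequences are hypothetical and are not taken from the feather keratin genes.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Count differing codons that are silent versus amino acid replacing."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and of equal length")
    silent = replacement = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons contribute nothing
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            silent += 1  # same amino acid, different codon
        else:
            replacement += 1  # amino acid changed
    return silent, replacement


# Hypothetical toy sequences: GCT -> GCC is silent (Ala), TCC -> TGC replaces Ser with Cys.
silent, replacement = count_codon_changes("ATGGCTTCC", "ATGGCCTGC")
print(silent, replacement)
```

Applied pairwise to the coding regions of genes A to D, counts of this kind would give a rough view of whether silent changes predominate.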

The intron sequences have drifted to some degree since the duplication of these genes, except for several segments around the splice junction and near the middle of the intron which show 90 to 100% conservation (Fig. 2). While the sequences around the splice sites are presumably involved in RNA splicing, the function of the other conserved sequences is unknown. Indeed the function of the intron itself in these genes is an enigma, although there are at least two possibilities. Firstly, as suggested by Molloy et al. (1982), the intron allows differential splicing whereby the same gene is expressed in a different tissue (e.g. scale or claw) or developmental stage (adult feather) using a different promoter and 5’ non-coding region in a manner analogous to that of mouse α-amylase and human dystrophin genes (Young et al., 1981; Nudel et al., 1989) or, secondly, the intron serves an important regulatory function.

Accumulating evidence strongly suggests that the first possibility is incorrect. DNA sequencing of the 5’ flanking regions of some of the feather keratin genes in the clone λCFK1 indicates that there are no alternative 5’ ends within 400 bp of the normal cap site (Gregg & Rogers, 1986). Moreover, primer extension studies indicate that, at least in the case of the chick feather keratin genes in λCFK1, the same mRNA start site is probably used in both embryonic and adult feathers, which are the only two epidermal tissues where these genes are expressed (Presland et al., 1989, and our unpublished results). This conclusion is supported by a comparison of the DNA sequence obtained from embryonic and adult feather RNA using a primer from a conserved portion at the 5’ end of the coding region (Gregg, unpublished results).

With regard to the second possibility, that the intron has a regulatory function, Koltunow et al. (1986) have observed that removal of the intron from a feather keratin gene (gene B) significantly increased its transcription in Xenopus oocytes. The effect observed was specific for the feather keratin intron fragment, since no inhibitory effect was observed when the intron was replaced with a similar-sized fragment from pBR322, suggesting that specific DNA sequences within the intron rather than the spatial separation of the different promoter elements were responsible for this inhibition of transcription. These “putative” regulatory sequences ought to be conserved between the feather keratin genes and, indeed, there are intronic sequences additional to those thought to be involved in RNA splicing that are highly homologous among the four sequenced λCFK1 genes (Fig. 2). From these data, it was proposed that the intron sequences function to keep these genes transcriptionally silent until the correct elements combine to stimulate transcription (Koltunow et al., 1986). Such “correct elements” would include protein factors that bind to the 5’ non-coding or flanking regions. All keratin gene families examined to date appear to contain an intron in the 5’ non-coding region 20 to 23 bp from the initiation codon; this list currently comprises feather (this work), feather-like (Presland et al., 1989), scale (Gregg et al., 1983; Wilton, 1983) and claw keratin genes (Whitbread et al., unpublished results). In addition, the sequenced HRP gene has an intron in a position identical with that of the feather keratin genes (Rogers, 1985). Therefore, this hypothesis, if correct, would apply to the different keratin gene families expressed in each of the epidermal appendages of the chicken as well as the co-expressed HRP genes. The specificity of gene expression would reside in the specific trans-acting factors present in each tissue.

(b) Expression of the λCFK1 genes in the chick feather

By Northern blot analysis, we have demonstrated here that genes A to E of the recombinant λCFK1 are all expressed in the keratinizing chick feather but at somewhat varying levels (Fig. 5(a)). Gene E was already known to be expressed, as it was shown by DNA sequencing to be identical with the partial length cDNA clone pCFK17 (Kuczek, 1980).

Primer extension analyses of RNA isolated from adult chick feathers indicate that all five of the feather keratin sequences present in λCFK1 are also expressed in adult feathers (Presland et al., 1989, and data not shown). These results correlate with an earlier protein chemical study by Rothnagel (1979), who showed that the keratin proteins present in embryonic and adult feather tissues appear to be similar, except for the presence of two additional keratin bands in adult feather.


(c) Expression of a scale keratin gene in embryonic feather tissue

Unexpectedly, the DNA sequencing of the cDNA clone pCFK15, isolated from a feather cDNA library, identified it as a typical scale keratin (Morris, 1984). The expression of this gene in embryonic feather tissue was confirmed by Northern blot analysis of feather mRNA using a scale-specific probe derived from pCFK15 (Fig. 6). This scale-like gene was not expressed exclusively in feathers, since an identical mRNA was found in a scale cDNA library (Wilton et al., 1985), which is consistent with the observation (Dhouailly et al., 1978) that proteins of the same size as scale keratins (i.e. 14,000 daltons) are present in protein extracts from embryonic feathers. Quantification of the amounts of scale mRNA in the embryonic feather by densitometry suggests that they are present at about 1% of the level of feather keratin proteins, which is in reasonable agreement with the number of scale cDNA clones detected in the feather cDNA library (95%; Morris, 1984). Additionally, at least one claw keratin gene is expressed at a very low level in the embryonic feather, estimated to be <0.1% of its level in claw tissue (Whitbread et al., unpublished results).

Why then are some scale and claw proteins expressed at all in the feather? There are two possible explanations. Firstly, the plucked feathers from which the feather RNA was prepared may have been contaminated with scale tissue. However, this is considered to be unlikely for the following reasons. (1) Great care was taken, when the feathers were removed from the embryos, not to pick feathers from the leg regions, which are likely to be synthesizing scale keratins by day 14 of development (Wilton, 1978). (2) Two different RNA preparations (one from which the feather cDNA library was constructed (Saint, 1979) and the other used for the Northern blot (Fig. 6)) and at least one feather protein extract (Dhouailly et al., 1978) all contain scale-like mRNA or protein, which, moreover, are present in similar amounts relative to the feather keratins (see above).

A more plausible explanation is that scale and feather keratin genes may be switched on by similar, but different, signals and the observed expression of scale genes in feather is due to “promoter leakage”. Comparison of the available scale and feather gene sequences indicates some similarities in the 5’ non-coding and flanking regions that may explain this “inappropriate” expression (Wilton, 1984). By contrast with the scale and claw keratin genes, at least some of the feather keratin genes are not expressed at detectable levels in other keratinizing chick epidermises (Presland et al., 1989, and data not shown), corroborating earlier electrophoretic studies on the protein composition of scale, beak and claw tissues that demonstrated the complete absence of feather keratin proteins (Dhouailly et al., 1978; Gibbs et al., unpublished results). The finding that scale and claw genes are expressed in feather and apparently not vice versa might be explained in terms of the evolution of these gene families and their trans-acting factors. Since feathers probably evolved from scales (Spearman, 1966; Gregg et al., 1984), the expression of some scale and claw genes in feather could be a remnant of vertebrate evolution. Thus, some scale genes are recognized to some degree by the feather gene-transcriptional factors but feather keratin genes are not recognized by keratin trans-acting factors present in scale tissue. Additionally, scale and claw genes are presumably still evolving and some mutations in their promoter regions might allow a gene to be responsive to factors in both feather and other epidermises. Experimental proof of this hypothesis requires the isolation and characterization of the protein(s) that regulate the in vivo transcription of feather and scale keratin genes.

We thank Dr Steve Dalton for providing the chick β-actin cDNA clone, Dr Barry Powell for a critical reading of this manuscript, Hania Presland and Lel Whitbread for preparation of diagrams, Tmbi Semenov for technical assistance and Ros Murrell for typing this manuscript. We are grateful to the Australian Research Grants Scheme for financial support of this project (to G.E.R.). R.B.P. was a recipient of a postgraduate award under the Commonwealth Scholarship and Fellowship Plan.

References

Arai, K. M., Takahashi, R., Yokote, Y. & Akahane, K. (1983). Eur. J. Biochem. 132, 501-507.
Baltimore, D. (1981). Cell, 24, 592-594.
Berget, S. M. (1984). Nature (London), 309, 179-182.
Breathnach, R. & Chambon, P. (1981). Annu. Rev. Biochem. 50, 349-383.
Brooker, J. D., May, B. K. & Elliott, W. H. (1980). Eur. J. Biochem. 106, 17-24.
Buratowski, S., Hahn, S., Sharp, P. A. & Guarente, L. (1988). Nature (London), 334, 37-42.
Cleveland, D. W., Lopata, M. A., MacDonald, R. J., Cowan, N. J., Rutter, W. J. & Kirschner, M. W. (1980). Cell, 20, 95-105.
Dale, B. A., Resing, K. A., Haydock, P. V., Fleckman, P., Fisher, C. & Holbrook, K. (1989). In The Biology of Wool and Hair (Rogers, G. E., Reis, P. J., Ward, K. A. & Marshall, R. C., eds), pp. 97-115, Chapman and Hall, London and New York.
Dhouailly, D., Rogers, G. E. & Sengel, P. (1978). Develop. Biol. 65, 58-68.
Diaz-Ruiz, J. R. & Kaper, J. M. (1978). Prep. Biochem. 8, 1-17.
Dodgson, J. B., Strommer, J. & Engel, J. D. (1979). Cell, 17, 879-887.
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O’Connell, C., Spritz, R. A., De Riel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. & Proudfoot, N. J. (1980). Cell, 21, 653-668.
Fraser, R. D. B., MacRae, T. P. & Rogers, G. E. (1972). Keratins, Their Composition, Structure and Biosynthesis, C. C. Thomas, Springfield, IL.
Gil, A. & Proudfoot, N. J. (1984). Nature (London), 312, 473-474.
Gregg, K. & Rogers, G. E. (1986). In The Biology of the Integument (Bereiter-Hahn, J., Matoltsy, A. G. & Richards, K. S., eds), vol. 2, pp. 666-694, Springer-Verlag, Berlin, Heidelberg and New York.
Gregg, K., Wilton, S. D., Rogers, G. E. & Molloy, P. L. (1983). In Manipulation and Expression of Genes in Eukaryotes (Nagley, P., Linnane, A. W., Peacock, W. J. & Pateman, J. A., eds), pp. 65-72, Academic Press, Sydney.
Gregg, K., Wilton, S. D., Parry, D. A. D. & Rogers, G. E. (1984). EMBO J. 3, 175-178.
Iatrou, K., Tsitilou, S. G. & Kafatos, F. C. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 4452-4456.
Jackson, J. A. & Fink, G. R. (1981). Nature (London), 292, 306-311.
Kanehisa, M. (1982). Nucl. Acids Res. 10, 183-196.
Kanehisa, M. (1984). Nucl. Acids Res. 12, 203-213.
Keller, E. B. & Noon, W. A. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 7417-7420.
Kemp, D. J. & Rogers, G. E. (1972). Biochemistry, 11, 960-975.
Klein, H. L. & Petes, T. D. (1981). Nature (London), 289, 144-148.
Koltunow, A. M., Gregg, K. & Rogers, G. E. (1986). Nucl. Acids Res. 14, 6375-6393.
Kozak, M. (1986). Cell, 44, 283-292.
Krawinkel, U., Zoebelein, G., Brüggemann, M., Radbruch, A. & Rajewsky, K. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 4997-5001.
Kuczek, E. S. (1980). B.Sc. Honours thesis, University of Adelaide, South Australia.
Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982). Editors of Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
McLauchlan, J., Gaffney, D., Whitton, J. L. & Clements, J. B. (1985). Nucl. Acids Res. 13, 1347-1368.
Meinkoth, J. & Wahl, G. (1984). Anal. Biochem. 138, 267-284.
Messing, J. & Vieira, J. (1982). Gene, 19, 269-276.
Molloy, P. L., Powell, B. C., Gregg, K., Barone, E. D. & Rogers, G. E. (1982). Nucl. Acids Res. 10, 6007-6021.
Morris, C. P. (1984). Ph.D. thesis, University of Adelaide, South Australia.
Norrander, J., Kempe, T. & Messing, J. (1983). Gene, 26, 101-106.
Nudel, U., Zuk, D., Einat, P., Zeelon, E., Levy, Z., Neuman, S. & Yaffe, D. (1989). Nature (London), 337, 76-78.
Ostrom, J. H. (1976). Linnean Soc. London Biol. J. 8, 91-182.
Powell, B. C. & Rogers, G. E. (1986). In Biology of the Integument (Bereiter-Hahn, J., Matoltsy, A. G. & Richards, K. S., eds), vol. 2, pp. 695-721, Springer-Verlag, Berlin, Heidelberg and New York.
Powell, B. C., Kemp, D. J., Partington, G. A., Gibbs, P. E. M. & Rogers, G. E. (1976). Biochem. Biophys. Res. Commun. 68, 1263-1271.
Powell, B. C., Sleigh, M. J., Ward, K. A. & Rogers, G. E. (1983). Nucl. Acids Res. 11, 5327-5346.
Presland, R. B., Whitbread, L. A. & Rogers, G. E. (1989). J. Mol. Biol. 209, 561-576.
Rogers, G. E. (1984). Biochem. Soc. Symp. 49, 85-108.
Rogers, G. E. (1985). Ann. N.Y. Acad. Sci. 455, 403-425.
Rothnagel, J. A. (1979). B.Sc. Honours thesis, University of Adelaide, South Australia.
Saint, R. B. (1979). Ph.D. thesis, University of Adelaide, South Australia.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H. & Roe, B. A. (1980). J. Mol. Biol. 143, 161-178.
Shen, S.-H., Slightom, J. L. & Smithies, O. (1981). Cell, 26, 191-203.
Slightom, J. L., Blechl, A. E. & Smithies, O. (1980). Cell, 21, 627-638.
Smith, G. P. (1976). Science, 191, 528-535.
Spearman, R. I. C. (1966). Biol. Rev. 41, 59-96.
Staden, R. (1984a). Nucl. Acids Res. 12, 521-538.
Staden, R. (1984b). Nucl. Acids Res. 12, 551-567.
Thomas, P. S. (1983). Methods Enzymol. 100, 255-266.
Wahl, G. M., Stern, M. & Stark, G. R. (1979). Proc. Nat. Acad. Sci., U.S.A. 76, 3683-3687.
Walker, I. D. & Rogers, G. E. (1976a). Eur. J. Biochem. 69, 329-339.
Walker, I. D. & Rogers, G. E. (1976b). Eur. J. Biochem. 69, 341-360.
Wilton, S. D. (1978). B.Sc. Honours thesis, University of Adelaide, South Australia.
Wilton, S. D. (1983). Ph.D. thesis, University of Adelaide, South Australia.
Wilton, S. D., Crocker, L. A. & Rogers, G. E. (1985). Biochim. Biophys. Acta, 824, 201-208.
Young, R. A., Hagenbüchle, O. & Schibler, U. (1981). Cell, 23, 451-458.

Edited by B. Mach