The Plant Cell, Vol. 8, 1421-1435, August 1996 © 1996 American Society of Plant Physiologists

Structure and Expression of a Plant U1 snRNP 70K Gene: Alternative Splicing of U1 snRNP 70K Pre-mRNAs Produces Two Different Transcripts

Maxim Golovkin and Anireddy S. N. Reddy¹

Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, Colorado 80523

The product of the U1 small nuclear ribonucleoprotein particle (U1 snRNP) 70K (U1-70K) gene, a U1 snRNP-specific protein, has been implicated in basic as well as alternative splicing of pre-mRNAs in animals. Here, we report the isolation of full-length cDNAs and the corresponding genomic clone encoding a U1-70K protein from a plant system. The Arabidopsis U1-70K protein is encoded by a single gene, which is located on chromosome 3. Several lines of evidence indicate that two distinct transcripts (short and long) are produced from the same gene by alternative splicing of the U1-70K pre-mRNA. The alternative splicing involves inclusion or exclusion of a region (910 bp) that we named “included intron.” Two transcripts were clearly detectable in all tissues tested, and the level of the transcripts varied in different organs. The deduced amino acid (427 residues) sequence from the short transcript has strong homology to the animal U1-70K protein and contains an RNA recognition motif, a glycine hinge, and an arginine-rich region characteristic of the animal U1-70K protein. The long transcript has an in-frame translational termination codon within the 910-bp included intron, resulting in a truncated protein containing only 204 amino acids. The protein encoded by the short transcript is recognized by U1 RNP-specific monoclonal antibodies and binds specifically to the Arabidopsis U1 snRNA, whereas the protein from the long transcript does not. In addition, multiple polyadenylation sites were observed in the 3' untranslated region. These results suggest a complex post-transcriptional regulation of Arabidopsis U1-70K gene expression.

INTRODUCTION

Most eukaryotic pre-mRNAs contain one or more introns that are removed in the nucleus to produce functional mRNAs. In addition, some pre-mRNAs with multiple introns display complicated patterns of alternative splicing (Smith et al., 1989). Recent studies indicate that the basic and alternative splicing of pre-mRNAs plays an important role in regulating gene expression during development and differentiation, and in producing structurally and functionally different proteins from a single gene (Smith et al., 1989; Sharp, 1994). Splicing of a nuclear pre-mRNA involves two sequential trans-esterification reactions, which take place in a large complex called the spliceosome (Sharp, 1994). The four main components of the 60S spliceosome are U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) and many non-snRNP splicing factors (Guthrie, 1991; Sharp, 1994). The snRNPs recognize splice sites and branch point sequences and aid in splicing (Guthrie, 1991; Sharp, 1994).

The snRNPs have been isolated and extensively characterized in animal systems. Approximately 40 proteins associated with these snRNPs have been identified. Among them, eight proteins (B, B', D1, D2, D3, E, F, and G), called common or core proteins, are shared by all snRNPs, whereas the rest are unique to a particular snRNP (Lührmann, 1988; Andersen and Zieve, 1991). The mammalian U1 snRNP contains a 164-nucleotide U1 snRNA molecule and at least 11 proteins, including three U1-specific proteins (U1-A, U1-C, and U1-70K).

¹ To whom correspondence should be addressed.

Genetic and biochemical analyses have shown that the U1 snRNP binds to the 5' splice site of pre-mRNA early in the formation of the spliceosome (Mount et al., 1983; Rosbash and Séraphin, 1991). Recognition of the 5' splice site by the U1 snRNP involves base pairing between complementary sequences of U1 snRNA and the 5' splice site. U1 snRNP-specific proteins are required for efficient formation of a complex between U1 snRNA and the 5' splice site junction (Mount et al., 1983; Heinrichs et al., 1990). There is also evidence that the U1 snRNP interacts with the U2 snRNP during splicing (Black et al., 1985). More recent studies also implicate the U1 snRNP in alternative splicing (Krainer et al., 1990; Ge and Manley, 1991; Kuo et al., 1991).

Molecular cloning and sequence analysis of the cDNAs encoding the U1-70K protein from humans, frogs, Drosophila, mice, and yeast have provided important information about the primary structure of the protein and various functional domains (Theissen et al., 1986; Spritz et al., 1987; Etzerodt et al., 1988; Hornig et al., 1989; Query et al., 1989; Mancebo et al., 1990). The U1-70K protein contains a conserved 80-amino acid RNA binding motif in the N-terminal region known as the RNA recognition motif (RRM) and a glycine-rich region next to the RRM (Bandziulis et al., 1989; Query et al., 1989; Kenan et al., 1991). Within this RRM, two highly conserved regions (RNP2 and RNP1) are present. The C-terminal half of U1-70K contains highly charged arginine/serine (RS)-rich regions that contribute to the abnormal migration of this 52K protein on SDS-polyacrylamide gels (Query et al., 1989). Among animals, the U1-70K protein is highly conserved, with 88% amino acid identity between the Xenopus and human proteins. However, the yeast homolog of U1-70K has only 30% overall amino acid identity (Smith and Barrell, 1991).

It has been demonstrated that U1-70K protein binds to stem-loop I at the 5' end of the U1 snRNA (Query et al., 1989). More recent studies have shown the interaction of U1-70K with splicing factors (SC-35 and ASF/SF2) through a specific association between the RS-rich domain of U1-70K and a similar region in splicing factors (Wu and Maniatis, 1993; Kohtz et al., 1994). Furthermore, overexpression of the RS-rich region of U1-70K has been shown to inhibit splicing as well as nucleocytoplasmic transport of mRNA (Romac and Keene, 1995). These studies indicate an important role for U1-70K in splicing. The U1-70K protein is the only protein in the U1 snRNP that is heavily phosphorylated in vivo, and the phosphorylation sites are confined to a highly charged C-terminal RS-rich region (Woppmann et al., 1993).

Little is known about the mechanisms by which plant cells excise introns from pre-mRNAs. Although some structural features of plant introns are similar to those of animal introns, there are several features unique to plants (Goodall et al., 1991). Many, although not all, plant introns can be processed faithfully in animal systems, whereas vertebrate introns are not processed in plant cells (Brown et al., 1986; Hartmuth and Barta, 1986; van Santen and Spritz, 1987a; Wiebauer et al., 1987). The cis-acting signals that are necessary for intron removal in plants differ from those in vertebrates and yeast. In plants, especially in dicots, high A and U nucleotide content in the intron is necessary for intron processing (Goodall and Filipowicz, 1989, 1991; Lou et al., 1993; McCullough et al., 1993; Luehrsen and Walbot, 1994a, 1994b; Luehrsen et al., 1994). In addition, plant introns lack a polypyrimidine tract at the 3' end and a conserved branch point sequence (Luehrsen et al., 1994).

U1 snRNAs have been isolated from bean (van Santen and Spritz, 1987a), pea (Hanley and Schuler, 1991a), tomato (Abel et al., 1989), wheat (Musci et al., 1992), and potato (Vaux et al., 1992). As in animals, U1 snRNAs are coded by multiple genes and are clustered in plants (Hanley and Schuler, 1991a; Musci et al., 1992; Vaux et al., 1992). The U1 snRNA is the most abundant of the snRNAs present in eukaryotic cells and is essential for pre-mRNA splicing (Smith et al., 1989; Guthrie, 1991). There is some evidence that U1 snRNA genes are differentially expressed during development (Egeland et al., 1989; Hanley and Schuler, 1991b). Although U1 snRNAs have been well characterized in plants, little is known about the protein components of the U1 snRNP and their functions in splicing. Recently, a cDNA encoding a U1 snRNP-specific protein A from plants was cloned and characterized (Simpson et al., 1995).

Because the U1-70K protein is known to have multiple roles both in constitutive and alternative splicing, we have been interested in characterizing the plant homolog and its role in plant pre-mRNA processing. Here, we describe the isolation and structural characterization of an Arabidopsis U1-70K gene and show that two distinct transcripts are produced from a single gene by alternative splicing of a large included intron. The two transcripts are differentially expressed in different parts of the plant. The protein encoded by the short cDNA is recognized by U1 RNP-specific monoclonal antibodies and binds to Arabidopsis U1 snRNA, suggesting that it is a true homolog of U1-70K protein.

RESULTS

Isolation of a Full-Length cDNA Encoding the U1-70K Protein

We used a partial cDNA from Arabidopsis (Reddy et al., 1992) to screen a flower bud cDNA library prepared in a λZAPII vector. Thirty-two positive clones were isolated after three rounds of screening. All of the isolated clones contained a common XhoI fragment of ~600 bp but varied greatly in length and in other restriction sites. Detailed restriction analysis of the isolated clones revealed the presence of two distinct cDNA variants. Two clones, 23 and 24, representative of each of the two groups were chosen for further analysis. Of the 32 isolated clones, approximately one-third belonged to the clone 24 group, whereas the rest were of the clone 23 type. To determine whether both of the cDNAs are derived from the same gene or two different genes, we probed duplicate genomic DNA gel blots with either clone 23 or clone 24. As shown in Figure 1, similar hybridization signals were obtained with both probes, indicating that the two different cDNAs are derived from the same gene. To determine the differences between the two clones, we completely sequenced clone 23 (1.5 kb) and clone 24 (2.5 kb). Comparison of the sequences of clones 23 and 24 showed that the cDNAs are identical except that clone 24 contains an additional 61-bp sequence in the 5' untranslated region and a 910-bp sequence in the middle of the coding region, suggesting that alternative splicing may be involved in the generation of two different transcripts (Figures 2 and 3).
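The size comparison above can be checked with simple arithmetic. The sketch below adds the two extra segments reported for clone 24 to the approximate length of clone 23; the constant and function names are illustrative only, and the kilobase figures are rounded values from the text, so only rough agreement should be expected.

```python
# Lengths reported in the text, in base pairs. The cDNA sizes are rounded
# ("1.5 kb" and "2.5 kb"), so this is only an approximate consistency check.
SHORT_CDNA_BP = 1500        # clone 23
LONG_CDNA_BP = 2500         # clone 24
EXTRA_5UTR_BP = 61          # additional 5' untranslated sequence in clone 24
INCLUDED_INTRON_BP = 910    # additional sequence within the coding region


def expected_long_length(short_bp, extra_segments):
    """Length of the long cDNA if it equals the short cDNA plus extra segments."""
    return short_bp + sum(extra_segments)


predicted = expected_long_length(SHORT_CDNA_BP, [EXTRA_5UTR_BP, INCLUDED_INTRON_BP])
difference = LONG_CDNA_BP - predicted

print(f"predicted long cDNA: {predicted} bp; reported: ~{LONG_CDNA_BP} bp")
print(f"difference: {difference} bp")
```

The small residual is within the rounding of the reported sizes; differences in poly(A) site usage between clones could also contribute.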

Multiple Polyadenylation Sites Generate Diversity in Transcripts

Sequence analysis of the 3' untranslated region of 14 cDNA clones revealed extensive variation in the 3' end of U1-70K cDNAs due to multiple polyadenylation sites. As shown in Figure 2, at least nine poly(A) addition sites were detected within a 67-nucleotide region. Site 3 appears to be the most common polyadenylation site because five of 14 cDNAs analyzed had a poly(A) tail at site 3, whereas the other nine cDNAs were polyadenylated at different sites. Multiple polyadenylation sites are not uncommon in plant mRNAs, although the function of such variation is not clear. It is possible that varying lengths of the 3' untranslated region may have a regulatory role in mRNA transport, stability, and/or translational efficiency of the transcripts. In most vertebrate mRNAs, there is a single poly(A) site. A hexanucleotide (AATAAA) sequence that acts as a polyadenylation signal is almost always present 10 to 30 nucleotides upstream of the cleavage site of animal mRNAs (Proudfoot and Brownlee, 1976). This canonical poly(A) signal is absent in all of the cDNAs that have been sequenced. This is not unusual because 60% of known plant genes do not contain a perfectly conserved poly(A) signal (Hunt, 1994; Wu et al., 1995).

Figure 1. Arabidopsis Genomic DNA Gel Blot Analysis.

Seven micrograms of DNA was digested with SalI, ClaI, and BamHI, and duplicate blots were prepared. One blot was probed with the short cDNA (clone 23), and the other blot was probed with the long cDNA (clone 24). DNA probes were labeled with 32P-CTP by random prime labeling. Numbers at left indicate the length of DNA markers in kilobases.

Structural Organization of the Plant U1-70K Gene

Genomic clones were isolated to determine whether the two transcripts are generated from a single gene and to analyze the structure of the U1-70K genes. Restriction analysis of 20 genomic clones indicated that all clones contain the overlapping regions of the same gene. A clone (λ5) that contained the entire gene was subcloned and sequenced. The nucleotide sequence of the genomic clone, two cDNA clones (23 and 24), and the predicted amino acid sequence of the U1-70K protein are shown in Figure 2. Comparison of the nucleotide sequence of the genomic clone with the two cDNAs indicated that the gene contains nine exons and that the two cDNAs are derived from the same gene by alternative splicing of a 910-bp region, which we named "included intron." A diagram depicting the generation of two transcripts from a single gene is shown in Figure 3. The longest open reading frame that codes for 427 amino acids starts with an ATG codon (exon 1) and ends at a TGA stop codon in exon 9. Upstream of the initiation codon, there is an in-frame stop codon, suggesting the absence of a coding region upstream of the identified initiation codon. All exons, except the last exon, are relatively short (65 to 214 bp). The last exon is ~700 bp and represents almost half of the length of the U1-70K mRNA. As in other dicots, introns in the Arabidopsis U1-70K gene are AU rich (59 to 76%) (Table 1). Splice sites (5' and 3') for all introns except for intron 1 are in agreement with consensus sequences (Table 1) (Goodall et al., 1991). None of the highly conserved nucleotides in introns at the 5' (GT) and 3' (AG) ends is present in intron 1. However, we consider it to be a true intron because this sequence (307 bp) is not represented in any of the cDNAs analyzed and is rich in AU content (67%). The 5' or 3' splice sites of the included intron are as good a match as the 5' and 3' splice sites of other introns with the consensus sequence. Hence, the skipping of this intron in large transcripts is not due to poor splice sites or low AU content (Table 1) but possibly to its size.

[Figure 2: annotated nucleotide sequence of the U1-70K gene with the deduced amino acid sequence; the sequence text is not recoverable from the source.]

Figure 2. Nucleotide and Deduced Amino Acid Sequences of the U1-70K Gene.

Exons are shown in uppercase letters, introns are presented in lowercase letters, and the included intron is shown in lowercase italics. The predicted amino acid sequence is shown under the nucleotide sequence. The predicted amino acid sequence from the included intron is shown in italics. Numbers at right correspond to nucleotides and deduced amino acids (boldface). An in-frame stop codon preceding the initiation codon is underscored. Arrowheads 1 and 2 indicate the beginning of the long cDNA (clone 24) and the short cDNA (clone 23), respectively. The short cDNA contains only the exons, whereas the long cDNA contains all the exons plus a 910-bp included intron. Numbers in the 3' untranslated region denote polyadenylation sites in different cDNAs. The sequence data of the U1-70K gene, cDNA clone 23, and cDNA clone 24 were submitted to the EMBL, GenBank, and DDBJ data bases as accession numbers U52909, M93439, and U52910, respectively.

Figure 3. Structural Organization of the Arabidopsis U1-70K Gene.

A schematic diagram showing the structural organization of the U1-70K gene (middle) and the generation of short (cDNA 23) and long (cDNA 24) transcripts by alternative splicing. Shaded boxes represent exons (numbered from 1 to 9), and the open boxes represent the included intron.
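The two intron properties discussed above, AU richness and the conserved GT...AG terminal dinucleotides, are straightforward to compute. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the first sequence is an invented toy intron, and the second is only a stand-in built by joining the two intron 1 boundary heptamers listed in Table 1.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T (or U) residues in a nucleotide sequence."""
    dna = seq.upper().replace("U", "T")
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "AT" for base in dna) / len(dna)


def has_canonical_ends(intron: str) -> bool:
    """True if the intron starts with GT and ends with AG."""
    dna = intron.upper().replace("U", "T")
    return dna.startswith("GT") and dna.endswith("AG")


# Hypothetical toy intron, used only to exercise the functions.
toy_intron = "GTAAATTTAG"
print(at_content(toy_intron), has_canonical_ends(toy_intron))

# Stand-in for intron 1: its 5' and 3' boundary heptamers from Table 1,
# joined together (the real intron is 307 bp long).
intron1_boundaries = "TGGTTCC" + "ATATGCA"
print(has_canonical_ends(intron1_boundaries))
```

Applied to full intron sequences, `at_content` would give the AT percentages of the kind listed in Table 1, and `has_canonical_ends` would flag intron 1 as the only exception.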

The human U1-70K gene contains 11 exons and is representative of animal U1-70K genes because the general structure of the gene is highly conserved during vertebrate evolution. Conserved features include the number and position of introns as well as the nucleotide sequence of an alternative included/excluded exon containing an in-frame translation termination codon. Because of the conservation of location and sequence of the alternatively spliced included/excluded exon among vertebrates, it has been suggested that alternative splicing of the included/excluded exon may regulate the production of the functional U1-70K protein. The positions of four introns (4 to 7), including the alternatively spliced introns, are conserved between Arabidopsis and human U1-70K. However, intron 6 is alternatively spliced in Arabidopsis (Figure 4, solid arrow), and intron 7 is alternatively spliced in human (Figure 4, solid triangle).

U1-70K Gene Is Mapped to Chromosome 3

Two subcloned fragments (3.7 and 6.2 kb) of genomic clone (λ5) harboring the U1-70K gene showed restriction fragment length polymorphism between the Landsberg erecta and Columbia ecotypes with XbaI. These fragments were used to probe the mapping filters of 100 individuals. Analyses of the data indicate that the U1-70K gene is located on chromosome 3 between markers lbAT457 and DEFN712.

Table 1. Length, AT Content, and 5' and 3' Splice Site Junctions of Introns in the Plant U1-70K Gene

                   Length     AT     5' Splice Site (a, b)     3' Splice Site (a, c)
Intron No.         (bp)       (%)    Exon        Intron        Intron
1                  307        67     GCT         TGGTTCC       ATATGCA
2                  70         71     [garbled]   GTATCAA       AATATAG
3                  90         66     CCT         GTATGAT       [garbled]
4                  [garbled]  76     AAT         [garbled]     TATGCAG
5                  94         65     CTT         GTAAGCT       CTTGCAG
Included intron    910        65     CGG         GTAGGTT       [garbled]
6                  98         59     [garbled]   GTGCCCT       CCTACAG
7                  80         65     GCC         GTAAAGA       TCTCCAG

The exon-side nucleotides of the 3' splice sites are garbled in the source and cannot be assigned to rows; the legible entries are GAT, TCG, AAC, TGA, and GTT.

a Splice site junctions were determined by comparing the gene sequence with the cDNA sequences. Nucleotides that are identical to consensus sequences of 5' and 3' splice site junctions in dicots are underlined.
b The consensus sequence at the 5' splice site junction in dicots is AG GTAAGT. Nucleotides that are identical to the consensus sequence are underlined.
c The consensus sequence at the 3' splice site junction in dicots is TGYAG GT. Nucleotides that are identical to the consensus sequence are underlined.
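The comparison with the dicot consensus in footnote b can be made explicit by counting matching positions. The sketch below scores the intron side of each 5' splice site against GTAAGT. It assumes that the eight entries in the flattened 5' intron column of Table 1 follow the row order 1, 2, 3, 4, 5, included intron, 6, 7; the entry for intron 4 is garbled in the source and is left out. The names used are illustrative.

```python
# Intron side of the dicot 5' splice site consensus AG|GTAAGT (Table 1, footnote b).
CONSENSUS_5P_INTRON = "GTAAGT"

# 5' intron-side heptamers as read from Table 1. The assignment to intron
# numbers assumes the column follows the row order of the table; intron 4
# is omitted because its entry is garbled in the source.
FIVE_PRIME_INTRON = {
    "1": "TGGTTCC",
    "2": "GTATCAA",
    "3": "GTATGAT",
    "5": "GTAAGCT",
    "included": "GTAGGTT",
    "6": "GTGCCCT",
    "7": "GTAAAGA",
}


def consensus_matches(site: str, consensus: str) -> int:
    """Number of positions at which the site agrees with the consensus."""
    return sum(a == b for a, b in zip(site.upper(), consensus))


scores = {
    name: consensus_matches(seq, CONSENSUS_5P_INTRON)
    for name, seq in FIVE_PRIME_INTRON.items()
}

for name, score in scores.items():
    print(f"intron {name}: {score}/{len(CONSENSUS_5P_INTRON)} consensus positions")
```

Under these assumptions the included intron scores as high as any constitutive intron, and intron 1 matches at no position, which agrees with the description of both in the text.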


[Figure 4: pairwise alignment of the Arabidopsis (Plant, residues 1 to 427) and human (Human, residues 1 to 437) U1-70K protein sequences; the alignment is not recoverable from the source.]

Figure 4. The Arabidopsis U1-70K Protein

Alignment of Arabidopsis U1-70K protein (Plant) with human U1-70K protein (Human). Open arrows and open triangles indicate the location of introns in plant and human U1-70K genes, respectively. The solid arrow and solid triangle indicate the location of an alternative included intron in Arabidopsis and human U1-70K genes, respectively. Putative NLSs in plant U1-70K protein are shown in boldface letters. The RRM is underlined. RNP2 and RNP1 consensus sequences within the plant RRM are shown in gray boxes. The vertical bars indicate identical amino acids; the colons and dots represent similar amino acids between plant and human U1-70K proteins.

The Deduced Amino Acid Sequence of the Plant U1-70K Protein Contains Key Domains Found in Animal U1-70K Proteins

Sequence analysis of both of the cDNAs revealed that the short cDNA (clone 23) has an open reading frame that codes for a protein of 427 amino acids (Figures 4 and 5). There are two possible initiation codons (ATG) in the Arabidopsis U1-70K gene that are in the same frame. An in-frame stop codon is present upstream of the first initiation codon, suggesting that the coding region cannot be longer than the one we identified. The protein produced from the second initiation codon has an N-terminal sequence similar to that of vertebrate U1-70K, whereas the protein produced from the first initiation codon has an additional 33 amino acids that are not present in the animal U1-70K protein. However, the first initiation codon is most likely the true initiation codon because of the consensus sequences (GCCA/GCCAUGG) surrounding it. In plants and other eukaryotes, purine is present at the -3 position in the majority (80 to 90%) of mRNAs (Cavener and Ray, 1991; Kozak, 1991). The first initiation codon has a purine at position -3, and other nucleotides surrounding the initiation codon are better matched with six of seven consensus nucleotides (GTCGCCAUGG).
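The initiation-codon argument can be reproduced by comparing the sequence context with the consensus position by position. The sketch below rests on two assumptions: that the consensus quoted in the text is the familiar GCC(A/G)CCAUGG form, written here with R for purine, and that the context of the first ATG is GTCGCCAUGG (the closing characters of that string are garbled in the source). The function names are illustrative.

```python
# Assumed consensus GCC(A/G)CCAUGG, with R standing for a purine (A or G).
# Indices 0..5 are positions -6..-1, indices 6..8 are the AUG, index 9 is +4.
CONSENSUS = "GCCRCCAUGG"
FLANK_INDICES = (0, 1, 2, 3, 4, 5, 9)   # the seven positions outside the AUG


def context_matches(site: str) -> int:
    """Count flanking positions of an initiation site that match the consensus."""
    rna = site.upper().replace("T", "U")
    if len(rna) != len(CONSENSUS) or rna[6:9] != "AUG":
        raise ValueError("expected 10 nt with AUG at positions 7 to 9")
    matches = 0
    for i in FLANK_INDICES:
        expected, observed = CONSENSUS[i], rna[i]
        if expected == "R":
            matches += observed in "AG"
        else:
            matches += observed == expected
    return matches


def purine_at_minus3(site: str) -> bool:
    """True if the nucleotide three positions upstream of the AUG is A or G."""
    return site.upper()[3] in "AG"


first_atg_context = "GTCGCCAUGG"   # as read from the text; partly garbled there
print(context_matches(first_atg_context), purine_at_minus3(first_atg_context))
```

With these inputs the count comes to six of the seven flanking positions, with the only mismatch at position -5, which is consistent with the statement in the text.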

The presence of the first 33 amino acid residues in the Arabidopsis U1-70K protein is likely to be unique. The calculated molecular mass of the predicted protein is 50 kD, with a pI of 9.34. The size of the Arabidopsis U1-70K protein is close to the size of the animal U1-70K protein. A comparison of the deduced amino acid sequence from cDNA clone 23 with the published amino acid sequences of the U1-70K protein from humans (Theissen et al., 1986), Drosophila (Mancebo et al., 1990), Xenopus (Etzerodt et al., 1988), and yeast (Smith et al., 1989) indicated a significant sequence similarity between Arabidopsis and animal U1-70K proteins. Because the highest sequence similarity (64%) was found with the human U1-70K protein, we aligned the predicted amino acid sequence of the Arabidopsis protein with the human U1-70K protein (Figure 4).

The U1-70K protein and other RNA binding proteins contain an RRM domain of ~ 8 0 amino acids in length (Bandziulis et al., 1989). Within this RRM, two highly conserved amino acid segments (RNP1 and RNP2) are present. Deletion analysis with the human U1-70K protein demonstrated that the RRM is essential for binding of the U1-70K protein to the U1 snRNA (Query and Keene, 1987). In addition to the RRM, the C ter- mini of all metazoan U1-70K proteins have a glycine-rich region downstream of the RRM and two RS-rich regions character- ized byseveral RS dipeptides (Theissen et al., 1986; Etzerodt et al., 1988; Mancebo et al., 1990). The deduced amino acid sequence of the Arabidopsis U1-70K protein contains an RRM comprising RNP2 (amino acid residues 141 to 146) and RNPl (amino acid residues 179 to 186) consensus sequences and a glycinerich region. Although the C terminus of the Arabidopsis

Figure 5. Graphic Summary of the Plant U1-70K Protein.

The RNA recognition motif and an arginine-rich domain are shown as shaded and hatched areas, respectively. The legend also marks the RNP consensus regions, the position of the 910-bp included intron, and the nuclear localization motifs.


U1-70K protein has an arginine-rich region, it does not share strong sequence similarity with the metazoan U1-70K protein (Figure 5).

A search of the Prosite data base at EMBL with the Arabidopsis U1-70K sequence revealed the presence of potential nuclear signals, including a bipartite motif (RRDRDRTRDRGDRDRRD) in the arginine-rich region. In addition, there are other nuclear localization signals (NLSs) in the deduced amino acid sequence (Figure 4). Deletion analysis of the human U1-70K protein revealed two distinct NLSs, one in the RRM (DGKKIDGRR) and the second in an arginine-rich region, that are capable of targeting the protein to the nucleus (Romac et al., 1994). Seven of eight amino acids in the NLS located in the RRM are conserved between Arabidopsis and human U1-70K proteins, suggesting that this region is a potential NLS in the Arabidopsis U1-70K protein (Figure 4). The presence of several NLSs in Arabidopsis U1-70K strongly suggests that it is a nuclear protein.

cDNA clone 24 contains a long alternative included intron between exons 6 and 7 that interrupts the RRM between RNP2 and RNP1 (Figure 5). This 910-bp region has an in-frame termination codon resulting in a truncated protein that contains only 204 amino acids, of which the last 28 amino acids are encoded by the included intron (Figure 2). Data base searches with this 28-amino acid peptide did not reveal any significant homology with other proteins. Therefore, the long transcript is likely to produce a truncated protein that retains only the part of the RRM containing RNP2. The functional significance, if any, of the larger transcript and the truncated protein encoded by this cDNA is not known. It is of interest that a smaller (60 or 72 nucleotides) alternatively spliced included/excluded exon is found in the human U1-70K gene at a different location within the RRM; this exon produces a truncated protein due to the presence of an in-frame translation termination codon. However, unlike the Arabidopsis U1-70K gene, the truncated protein from the human U1-70K gene has both RNP2 and RNP1 consensus sequences (Figure 4).
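How an included intron truncates the reading frame can be illustrated with a minimal sketch that scans a coding sequence codon by codon for the first in-frame termination codon. The two toy sequences are hypothetical and are not taken from clones 23 or 24; only the residue counts at the end come from the text.

```python
# Sketch: locate the first in-frame stop codon in a coding sequence.
# The toy sequences are hypothetical; they are not the U1-70K cDNAs.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_in_frame_stop(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    cds = cds.upper().replace("U", "T")
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

toy_spliced = "ATGGCTGGTTAA"      # ATG GCT GGT TAA: stop after 3 codons
toy_retained = "ATGGCTTGAGGTTAA"  # ATG GCT TGA ...: premature stop after 2 codons

print(first_in_frame_stop(toy_spliced))   # 3
print(first_in_frame_stop(toy_retained))  # 2

# Arithmetic from the text: 204 residues in the truncated protein,
# the last 28 of which are encoded by the included intron.
exon_encoded_residues = 204 - 28
print(exon_encoded_residues)              # 176
```

The codon index returned equals the number of residues translated before termination, so the same scan applied to the long cDNA would be expected to stop at codon 204.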

The U1-70K Gene Produces Two Major Transcripts That Are Differentially Expressed in Different Organs

The presence and abundance of two transcripts in different tissues were examined by RNA gel blot analysis. As shown in Figure 6A, two distinct transcripts (1.7 and 2.8 kb) hybridized with the full-length cDNA under high-stringency conditions. The cDNAs isolated in this study are almost the same size as the two hybridizing bands. The two transcripts are differentially expressed in different tissues, resulting in a variable ratio of these transcripts in different plant organs (Figure 6). The large transcript is more abundant in flowers, suspension cultures, and leaves, whereas a relatively high level of the short transcript is detected in roots. Probing of the same blot with ubiquitin indicates that an equal amount of RNA was loaded in each lane (Figure 6B).

Figure 6. Expression of the U1-70K Gene in Different Tissues.

Eighty micrograms of total RNA was loaded in each lane.
(A) Blot probed with cDNA clone 23. Numbers at left (2.8 and 1.7) indicate the length of RNA markers in kilobases. Suspension, Arabidopsis suspension culture.
(B) Same blot probed with a ubiquitin cDNA.

To further confirm the presence of two transcripts, reverse transcriptase-polymerase chain reaction (RT-PCR) was performed with DNase-treated RNA by using primers (5' and 3') that specifically amplify the included intron and primers (43 and 14) within exons 5 and 8 that amplify both of the transcripts (Figure 7A). Amplification of first-strand cDNA with 5' and 3' primers yielded a 910-bp product that hybridized with the included intron probe from clone 24, indicating the presence of the transcripts containing the included intron in all of the tissues tested (Figure 7B). Furthermore, fragments of the expected sizes, 0.23 and 1.14 kb (for short and long transcripts, respectively), were produced with callus RNA by using primers 43 and 14 (Figure 7C, lane 1). The 1.14-kb product hybridized with both short cDNA and the included intron-specific probe from clone 24 (Figure 7C, lanes 2 and 3), whereas the 0.23-kb product hybridized only with the short cDNA (clone 23) (Figure 7C, lane 3). However, the amount of the amplified products does not reflect the abundance of each transcript in the mRNA. The use of DNase-treated RNA samples without first-strand cDNA synthesis for RT-PCR failed to produce any amplified products. These results, together with the above-mentioned RNA gel blot data, confirm the presence of two distinct U1-70K transcripts in Arabidopsis.
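The expected product sizes follow from simple arithmetic: primers 43 and 14 flank the included intron, so the long product should equal the short product plus the 910-bp intron. A minimal sketch of that check is shown below; the function name is an illustrative assumption.

```python
# Sketch: expected RT-PCR product sizes for primers that flank the
# alternatively spliced region. Sizes are those stated in the text.
INCLUDED_INTRON_BP = 910
SHORT_PRODUCT_BP = 230   # intron excluded

def expected_products(short_bp, intron_bp):
    """Return the product sizes with the intron excluded and included."""
    return {"short": short_bp, "long": short_bp + intron_bp}

sizes = expected_products(SHORT_PRODUCT_BP, INCLUDED_INTRON_BP)
print(sizes)  # {'short': 230, 'long': 1140}
```

The computed 1140 bp agrees with the 1.14-kb band reported for the long transcript.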

Escherichia coli-Expressed Arabidopsis U1-70K Protein Is Recognized by the U1 RNP-Specific Monoclonal Antibody

To determine the size of the protein produced by the short and long transcripts and to perform in vitro binding studies with Arabidopsis U1 snRNA, the cDNAs were expressed in E. coli. A 1.4-kb HindIII-ApaI fragment of cDNA clone 23 and a 2.5-kb


HindIII-SmaI fragment of cDNA clone 24 were expressed as His-T7.Tag fusion proteins in pET28. The gene fusion from cDNA clone 23 is predicted to produce a polypeptide of 50 kD (45 kD from the cDNA plus 5 kD from the His-T7.Tag), whereas the fusion from cDNA clone 24 is expected to produce a polypeptide of ~24 kD (19 kD from the cDNA plus 5 kD from the His-T7.Tag) due to the presence of a translation termination codon within the included intron. Equal amounts of total protein extract from induced and uninduced cultures were separated on an SDS-polyacrylamide gel (Figure 8A) and probed with either T7.Tag- or U1 RNP-specific monoclonal antibodies (Figures 8B and 8C). Both antibodies detected two polypeptides of ~65 and 70 kD in the protein extract from induced cells containing cDNA clone 23 (Figures 8B and 8C, lanes 2). These polypeptides were never detected in uninduced cultures. The size of the protein detected is larger than the size expected from the predicted amino acid sequence. However, it is known that the presence of an RS-rich region in the animal U1-70K protein results in abnormal migration of the protein (Spritz et al., 1987; Query et al., 1989). The higher apparent molecular mass of the Arabidopsis U1-70K protein is probably also due to the arginine-rich domain in the C-terminal region. The 65-kD

Figure 8. Detection of the U1-70K Protein Expressed in E. coli by Using T7.Tag Monoclonal Antibodies or U1 RNP-Specific Antibodies.

Short (clone 23) and long (clone 24) cDNAs were expressed using the pET28b vector system, and total protein extract was used for immunodetection. Lanes 1 contain protein from uninduced cells containing clone 23; lanes 2, protein from induced cells containing clone 23; lanes 3, protein from induced cells containing clone 24. Total protein extract from induced and uninduced cultures was fractionated on SDS-containing polyacrylamide gels and either stained or blotted onto a nitrocellulose membrane.
(A) Coomassie Brilliant Blue R 250-stained gel.
(B) Membranes probed with T7.Tag monoclonal antibodies.
(C) Membranes probed with U1 RNP-specific monoclonal antibodies.
Arrowheads indicate the specific proteins detected by antibodies. Molecular mass markers (125.6, 79.6, 47.7, 28.1, and 19.4) are shown at left in kilodaltons.

protein is always present and is likely to be a proteolysis product. The protein of the expected size from cDNA clone 24 was detected only with T7.Tag monoclonal antibodies and not with U1 RNP-specific monoclonal antibodies (Figures 8B and 8C, lanes 3).

Figure 7. Analysis of Alternative Splicing by RT-PCR.

(A) Diagram showing the location of primers used in RT-PCR. Filled boxes (5 to 8) represent exons, and the open box represents the included intron. The arrows indicate the location of primers used in RT-PCR.
(B) RT-PCR with 5' and 3' primers that amplify only the included intron. The amplified product was separated on an agarose gel, blotted, and hybridized with the labeled included intron-specific probe from clone 24. The 910 bp at left indicates a DNA length marker. Suspension, Arabidopsis suspension culture.
(C) RT-PCR with primers 43 and 14. Callus RNA was used as a template. The amplified products (1140 and 230 bp) were either stained (lane 1) or hybridized with the 910-bp included intron-specific probe from clone 24 (lane 2) or clone 23 (lane 3). Numbers at left indicate the length of DNA markers in kilobases.

Isolation of U1 snRNAs from Arabidopsis

Two degenerate primers that were designed based on consensus sequences of plant and animal U1 snRNAs (Solymosy and Pollák, 1993) were used to amplify U1 snRNAs from Arabidopsis genomic DNA. The amplified products were cloned into pBluescript KS+ and sequenced to confirm the identity of the clones. Figure 9 shows the nucleotide sequence and structure of one of the isolated clones that was used for RNA-protein binding studies. This U1 snRNA contains 158 nucleotides and shows a high degree of conservation with other plant and animal U1 snRNAs. The Arabidopsis U1 snRNA is ~70 to 75% identical with other plant U1 snRNAs. The Arabidopsis U1 snRNA, like other plant and animal U1 snRNAs, folds into a secondary structure that consists of four stem-loops (I to IV) (Figure 9). The size of plant U1 snRNAs varies between 157 and 165 nucleotides, and several variants are found in a given species (Solymosy and Pollák, 1993).

In animals, the stem-loop I of U1 snRNA is the site of


Figure 9. Nucleotide Sequence and Secondary Structure of the Arabidopsis U1 snRNA.

The structure is generated by comparative analysis with published U1 snRNA sequences from animals and other plants. Numbers I to IV represent stem-loop structures.

interaction of the U1-70K protein in which four nucleotides (27U, 29A, 30U, and 31C) have been shown to be required for binding of the U1-70K protein (Yuo and Weiner, 1989; Lührmann et al., 1990). These nucleotides are present in the Arabidopsis U1 snRNA. However, in many other plant U1 snRNAs, the nucleotide at position 27 is a C residue. It is also known that many plant U1 snRNAs either have shorter loop sequences or lack one of the nucleotides required for binding (van Santen and Spritz, 1987b; Abel et al., 1989; Hanley and Schuler, 1991a, 1991b; Vaux et al., 1992). Hence, it is possible that variant U1 snRNAs bind to the U1-70K protein with different affinities. A DNA gel blot probed with the U1 snRNA revealed several bands with three different restriction enzymes, suggesting that there are multiple U1 snRNA genes in Arabidopsis (Figure 10). These genes are most likely clustered in the genome because the restriction digests with rare cutting enzymes yielded only one or two high molecular weight bands (data not shown). Probing of total RNA with the isolated U1 snRNA showed that they are abundantly expressed in several different tissues (data not shown).

The E. coli-Expressed Plant U1-70K Protein Binds U1 snRNA

To determine whether the isolated cDNA-encoded protein interacts with U1 snRNA, we performed in vitro RNA-protein binding studies using gel shift assays and protein-RNA gel blot analysis. For the gel shift assay, labeled U1 snRNA was incubated with the purified E. coli-expressed U1-70K protein or crude extracts of induced cultures containing the cDNA clone 23. The RNA-protein complexes were then separated on native gels. As shown in Figure 11A, incubation of different amounts of purified protein (lanes 2 and 3) or crude extract (lane 5) of induced cultures containing the short cDNA retarded the mobility of U1 snRNA. Extracts from uninduced cultures (lane 4) or BSA (lane 6) did not cause any retardation. Furthermore, excess of cold U1 snRNA in the assays abolished the mobility shift, indicating the specificity of U1 snRNA binding to protein (data not shown).

To characterize further the binding of U1 snRNA to protein from clones 23 and 24, we performed protein-RNA gel blot analysis. Protein blots containing the total protein extract from uninduced and induced cultures containing the short or long cDNA were probed with labeled U1 snRNA. The same two polypeptides (65 and 70 kD) that were detected with U1 RNP-specific antibodies in induced cultures (Figure 8) bound specifically to U1 snRNA (Figure 11B, lane 2). Protein-RNA gel blots using the fusion protein from cDNA clone 24 did not show any specific binding to U1 snRNA (Figure 11B, lane 3). This result is likely due to an incomplete RRM without an RNP1 consensus sequence in the truncated protein encoded by cDNA clone 24. BSA that was used as a negative control did not show any binding (Figure 11B, lane 4). These results clearly indicate that only the protein product from clone 23 binds directly and specifically to the Arabidopsis U1 snRNA. These results prove that the cDNA-encoded protein is a true homolog of the U1-70K protein.

DISCUSSION

Comparison of the Arabidopsis U1-70K Protein with the Human U1-70K Protein Reveals Similarities and Differences

The human U1-70K protein has several domains that are structurally and functionally important, as determined by sequence

Figure 10. DNA Gel Blot Characterization of the Arabidopsis U1 snRNA.

Genomic DNA was digested with HindIII, EcoRI, and BamHI. A PCR-amplified DNA fragment of the U1 snRNA was used as a probe. Numbers at left (23.1, 9.41, 6.56, 4.36, 2.02, and 0.56) indicate the length of DNA markers in kilobases.


Figure 11. Binding of the U1-70K Protein Produced from the Short cDNA (Clone 23) to Arabidopsis U1 snRNA.

(A) Mobility shift assay. Radiolabeled U1 snRNA was incubated with protein (lanes 2 to 6). The RNA-protein complex was separated by electrophoresis in a native gel. Lane 1 contains probe alone without protein; lane 2, 1 ng of purified U1-70K protein; lane 3, 3 ng of purified U1-70K protein; lane 4, 30 ng of total protein extract from uninduced E. coli cells; lane 5, 30 ng of total protein extract from induced cells (clone 23); lane 6, 5 ng of BSA. The positions of the complex and the free probe are indicated.
(B) Autoradiography of a protein-RNA gel blot. Thirty micrograms of total protein extract from uninduced (lane 1) and induced cells containing clone 23 (lane 2), clone 24 (lane 3), and 5 µg of BSA (lane 4) was resolved by SDS-PAGE, transferred to a nitrocellulose membrane, and probed with the labeled U1 snRNA. The arrowheads indicate U1 snRNA binding to fusion protein from cDNA (clone 23). Molecular mass markers (79.6, 47.7, 28.1, and 19.4) are shown at left in kilodaltons.

comparison and deletion analysis. These include (1) an RRM containing RNP2 and RNP1 consensus sequences and an NLS, (2) a glycine-rich region next to the RRM, (3) two RS-rich regions (residues 231 to 310 and 350 to 383), (4) a glycine/proline-rich region in front of the second RS-rich region, and (5) a glycine-rich region downstream of the second RS-rich region (Woppmann et al., 1993; Romac et al., 1994). The RS-rich regions contain several dipeptide repeats of RD/RE and RS. Because the Arabidopsis U1-70K protein shows the highest sequence similarity with its human counterpart, we analyzed the predicted sequence of the Arabidopsis U1-70K protein for the presence of various structural and functional domains present in human U1-70K protein. We found that some of the functional domains that have been characterized in vertebrate U1-70K proteins are present in the Arabidopsis U1-70K protein. These include an RRM with two consensus sequences (RNP1 and RNP2), an NLS in the RRM, a glycine-rich region downstream of the RRM, and an arginine-rich region at the C terminus of the protein (Figure 4). Similar charged regions, named RS-rich domains, have been found in a number of splicing factors, including alternative splicing factor/splicing factor 2 (Ge et al., 1991; Krainer et al., 1991), U2AF65 (Zamore et al., 1992), U2AF35 (Zhang et al., 1992), SC35 (Fu and Maniatis, 1992), and Drosophila transformer (tra) (Boggs et al., 1987), transformer-2 (tra-2), and suppressor-of-white-apricot (su[w]a) products (Goralski et al., 1989), which are known regulators of RNA splicing.

The presence of two distinct RS-rich regions in the C terminus is not obvious in the Arabidopsis U1-70K protein. In addition, the arginine-rich region in Arabidopsis contains mostly RD/RE dipeptides (21 repeats). This may reflect some variation in the function of plant U1-70K, because this region in the animal protein contains more RD/RE dipeptides (25 repeats) and RS dipeptides (11 repeats) that interact with similar RS regions in splicing factors. The human U1-70K protein is phosphorylated at multiple sites in the C-terminal RS region, thereby generating 13 different variants (Woppmann et al., 1993). The serine residues in the RS region are the sites of phosphorylation. It has also been shown that phosphorylation of U1-70K plays an important role in pre-mRNA splicing (Tazi et al., 1993). The fact that the arginine-rich region in the Arabidopsis U1-70K protein does not contain many RS dipeptides suggests that either it is not a phosphoprotein or it is not as heavily phosphorylated as the human U1-70K protein. The presence of an arginine-rich region in Arabidopsis also suggests that this region may be involved in interacting with splicing factors implicated in basic and alternative splicing. Neither the glycine/proline-rich region between the two RS domains nor the glycine-rich region C-terminal to the second RS domain is present in Arabidopsis U1-70K. Overall, the Arabidopsis U1-70K protein shares certain features with the animal U1-70K protein but differs in other respects. These differences may reflect variations in pre-mRNA splicing between plants and animals.
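Dipeptide content of an arginine-rich region can be tallied with a few lines of code. The sketch below counts RD, RE, and RS dipeptides with a sliding window; it is applied only to the 17-residue bipartite motif quoted in the Results (RRDRDRTRDRGDRDRRD), not to the full protein, and the counting convention is an assumption that may differ from the one used to obtain the repeat numbers cited above.

```python
# Sketch: count RD/RE/RS dipeptides in a protein segment with a sliding
# window. The counting convention is an illustrative assumption.
def count_dipeptides(protein, dipeptides=("RD", "RE", "RS")):
    """Return how often each dipeptide occurs in the sequence."""
    protein = protein.upper()
    counts = {d: 0 for d in dipeptides}
    for i in range(len(protein) - 1):
        pair = protein[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return counts

bipartite_motif = "RRDRDRTRDRGDRDRRD"   # motif quoted in the Results
print(count_dipeptides(bipartite_motif))  # {'RD': 5, 'RE': 0, 'RS': 0}
```

For this motif the tally contains RD dipeptides only and no RS dipeptides, in line with the observation that the Arabidopsis arginine-rich region is dominated by RD/RE rather than RS repeats.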

The Structural Organization of the Plant U1-70K Gene Is Different from That of Vertebrate U1-70K Genes

The general structure of U1-70K genes has been conserved through vertebrate evolution. Although there are many similarities between the human and Arabidopsis U1-70K genes, there are also striking differences. The Arabidopsis U1-70K


gene is ~3.5 kb, whereas the human U1-70K gene is much longer (44 kb) and contains one additional exon. The alternatively spliced included intron is much longer (910 bp) in Arabidopsis than in humans (60/72 bp). The position of the included intron is also different in Arabidopsis and animal U1-70K genes. Furthermore, the large transcript is produced due to skipping of the 5' and 3' splice sites flanking the 910-bp included intron. The sequences at the 5' and 3' splice sites flanking the included intron are in good agreement with the consensus sequences. This kind of alternative splicing has also been found in the genes encoding fibronectin, the platelet-derived growth factor A chain, and Drosophila P element transposase (Laski et al., 1986; Collins et al., 1987; Schwarzbauer et al., 1987). However, in humans, the use of three alternative 5' splice sites in conjunction with the common 3' splice site results in inclusion/exclusion of a 60/72-bp region in the transcript. The overall exon-intron organization of the U1-70K gene is identical in humans, Xenopus, and mouse. The nucleotide sequence of the alternative included/excluded exon containing an in-frame stop codon, along with the sequences in the downstream intron, is completely conserved in vertebrates.

Two Transcripts Are Produced from a Single U1-70K Gene by Alternative Splicing

Results from genomic DNA gel blots, sequence analysis of the cDNA and genomic clones, and chromosomal mapping of the gene clearly indicate that the U1-70K protein is encoded by a single gene in Arabidopsis. Furthermore, detailed analysis of several cDNA clones and a genomic clone suggests that alternative splicing of the gene is responsible for the production of two distinct transcripts. This is confirmed by RNA gel blot analysis showing two distinct transcripts of the expected length (Figure 6A). The fact that one-third of the isolated cDNAs are of the clone 24 type suggests that this transcript is highly expressed. Two partial cDNAs among the Arabidopsis expressed sequence tag sequences (GenBank accession numbers gb|T45822|19085 and gb|T44956|1829) contain the included intron, indicating the presence of the long transcript in the mRNA population. Data from RT-PCR using primers that specifically amplify either the long transcript or both of the transcripts (Figure 7) also confirmed the presence of both transcripts in various tissues. In humans, two mRNAs of 1.7 and 3.9 kb were also found in cultured cells. However, the 1.7-kb mRNA is 40-fold more abundant than the 3.9-kb transcript (Spritz et al., 1987). The length of the included/excluded exon is less than 100 bases; hence, it is not clear how the longer transcripts are generated in animals. In Drosophila, two transcripts of 1.9 and 3.1 kb hybridized with the U1-70K cDNA. The shorter transcript is abundant, and the ratio of the two transcripts is developmentally regulated (Mancebo et al., 1990).

The functional effects of alternative splicing on the encoded protein products can be at the level of activity (loss, gain, or modification) or cellular localization of the protein. Although the function of the protein encoded by the larger transcript with the included intron remains to be determined, its expression in various tissues and regulation of the relative abundance suggest a role for this transcript. It is possible that the long transcripts with the included intron may be performing one or more functions. First, production of a truncated protein may regulate the amount of functional protein in the cell. Second, the truncated protein may be functional and associate with U1 snRNP in vivo but have a different function from the full-length protein. Recent reports from studies with yeast and humans suggest this possibility (Nelissen et al., 1994; Hilleren et al., 1995). The N-terminal region (amino acids 1 to 97) of human U1-70K that lacks the RRM associates efficiently with core U1 snRNPs (Nelissen et al., 1994), although this protein is not known to interact directly with U1 snRNA under in vitro conditions. In yeast, it has been shown recently that the N-terminal region of ~92 amino acids that does not contain an RNA recognition motif but can associate with U1 snRNP in vivo is necessary and sufficient for U1-70K function (Hilleren et al., 1995). Furthermore, the truncated protein encoded by the large transcript shows significant sequence similarity (40% identity and 60% similarity) with the 97-amino acid N-terminal region of human and yeast U1-70K, which is shown to associate with U1 snRNP.

The fact that plants, including Arabidopsis, have multiple U1 snRNAs that show variation in size and nucleotide sequence in stem-loop I suggests that different U1 snRNAs may have different affinities for U1 snRNP proteins, including U1-70K. Differential interaction of plant U1 snRNAs with U1-70K protein may have important implications in pre-mRNA splicing.

Alternative splicing of pre-mRNAs encoding certain proteins is fairly common in animals. It is one of the important mechanisms in regulating gene expression and controlling development (Green, 1991; Guthrie, 1991). For instance, sex determination in Drosophila is controlled by alternative splicing of genes (Smith et al., 1989; Tian and Maniatis, 1992). However, very few examples of alternative splicing are known in plants. These include ribulose-1,5-bisphosphate carboxylase/oxygenase activase (Werneke et al., 1989), RNA polymerase II (Dietrich et al., 1990), RNA binding protein-1 (Hirose et al., 1993), chorismate synthase (Gorlach et al., 1995), a rice homeobox gene (Tamaoki et al., 1995), and H protein (Kopriva et al., 1995). Alternative splicing of some pre-mRNAs is due to utilization of different 5' splice sites and a common 3' splice site (Werneke et al., 1989; Hirose et al., 1993; Gorlach et al., 1995), whereas in others it is due to utilization of a common 5' splice site and different 3' splice sites (Grotewold et al., 1991). In the case of Arabidopsis U1-70K, generation of a large transcript is due to skipping of both 5' and 3' splice sites of a long (910 bp) intron.

The Arabidopsis U1-70K gene is the only plant U1-70K gene that has been characterized, providing the only evolutionary comparison between plant and non-plant (yeast and animal) U1-70K proteins. The availability of full-length cDNAs should permit investigations of the role of U1-70K in basic and alternative splicing by manipulating the expression of this gene transiently or stably in a tissue-specific manner and analyzing the splicing of pre-mRNAs that are known to undergo basic and alternative splicing. Furthermore, it is also possible to study the interaction of U1-70K with other proteins by using a variety of approaches.

METHODS

Plant Material

Plants (Arabidopsis thaliana ecotype Columbia) were grown at 22°C on a mixture of peat-perlite-vermiculite (1:1:1) under continuous light. Six-week-old plants were used to collect flowers, fruits, leaves, and stems. Roots were grown in liquid culture and harvested, as described by Reddy et al. (1994). Suspension culture cells and callus were maintained as described by Day et al. (1996).

Screening of cDNA and Genomic Libraries

A cDNA library of flower buds constructed in the λZAPII vector was screened with a partial U1 small nuclear ribonucleoprotein particle (snRNP) 70K (U1-70K) cDNA (pASNP1) isolated previously (Reddy et al., 1992). Approximately 5 × 10^5 plaques were screened by using 32P-labeled pASNP1, according to Sambrook et al. (1989). Thirty-two positives detected in the first screening were plaque purified by two additional rounds of screening. The cDNAs from phage recombinants were excised in vivo in a plasmid form (pBluescript SK+), according to the instructions provided by Stratagene.

To isolate genomic clones, an amplified genomic library prepared in the EMBL3 vector was screened as described above by using the partial cDNA as a probe. Screening of 4 × 10^5 plaques resulted in isolation of 46 clones. Phage DNA from several genomic clones was isolated and analyzed with several restriction enzymes. A genomic clone that contained the complete gene was subcloned and used for further analysis.

DNA and RNA Gel Blot Analyses

Genomic DNA was isolated from leaves and stem by using urea-phenol-containing buffer. Seven micrograms of DNA was digested with different restriction endonucleases, electrophoretically separated in a 0.8% agarose gel, and transferred to a nylon membrane (Hybond N; Amersham). The DNA was cross-linked to the filter by exposing it to UV in a Stratalinker (Stratagene). Radioactive probes were synthesized with an oligolabeling kit from Amersham by using nonanucleotide random primers. Prehybridization and hybridization were done using the Quick Hybridization solution (Amersham), according to instructions provided by the supplier. The filters were washed under high-stringency conditions and exposed to x-ray film.

RNA from different organs and tissues was isolated and purified on a CsCl cushion (Sambrook et al., 1989). RNA pellets were dissolved in either deionized formamide (for RNA gel blots) or RNase-free water (for reverse transcriptase-polymerase chain reaction [RT-PCR]). The RNA from different plant organs was separated in a 1% formaldehyde-containing agarose gel, transferred to a Hybond N+ (Amersham) membrane, and cross-linked by using UV light. A radioactive probe was prepared as described above. Hybridization was performed by using Quick Hybridization solution, according to the instructions provided by the manufacturer.

Restriction Fragment Length Polymorphism Mapping

A genomic (λ 5) clone that showed polymorphism between the Landsberg erecta and Columbia ecotypes was used for mapping the position of the U1-70K gene. The mapping filters prepared with XbaI were probed with the λ clone and scored for XbaI polymorphism between Landsberg erecta and Columbia ecotypes on 100 individuals. The data were analyzed with Mapmaker Macintosh version 1.0 software (Chang et al., 1988).

DNA Sequencing

Both strands of several cDNA clones and a genomic clone were sequenced by the dideoxy nucleotide chain termination method, using double-stranded DNA as a template. Subclones and primer walking were used to obtain complete sequences of the clones. Sequence analysis was performed using the Sequencher and MacVector (International Biotechnologies Inc., New Haven, CT) sequence analysis software. Data base searches were performed at the National Center for Biotechnology Information by using the BLAST network service provided by the National Library of Medicine (Bethesda, MD). Alignment of protein sequences was done using the PILEUP program (Genetics Computer Group, Madison, WI).

RT-PCR

Total RNA was treated with DNase before being used as a template for RT-PCR. One microgram of total RNA was routinely used to synthesize first-strand cDNA with an oligo(dT) primer and avian myeloblastosis virus reverse transcriptase in a 20-µL volume (Reddy et al., 1996b). 5' (TACGTAGGTTATTCTGAGCATTCC) and 3' (GTTAACCTGAGAAGTTTCAGAAGA) primers specific to the included intron were used to amplify the 910-bp included intron, whereas 43 (CCCAATAATGATCCAAATGC) and 14 (CCATCAGCTTGCTTATATGC) primers were used to amplify short (230 bp) and long (1140 bp) transcripts. The PCR reactions were performed in a final volume of 50 µL. The reaction mixture was preheated to 94°C for 5 min and cooled to 55°C, and Taq polymerase was added to initiate the amplification reaction. Twenty-five or 35 cycles of amplification were performed in a Perkin-Elmer (Emeryville, CA) thermal cycler. Each amplification cycle consisted of 1 min of denaturation at 94°C, 1 min of annealing at 53°C, and 2 min of extension at 72°C. The amplified products were resolved by electrophoresis in 1.5% agarose gels. Amplification of RNA without first-strand cDNA synthesis did not yield any amplified products. To confirm the identity of the PCR-amplified product, the gel was blotted and probed with 32P-labeled DNA of either the 910-bp included intron fragment of cDNA clone 24 or a 1.6-kb (ApaI-SmaI) fragment of cDNA clone 23. Primers corresponding to a constitutively expressed cyclophilin gene (ROC1) were used as an internal control to demonstrate that we used equal amounts of first-strand cDNA in all reactions (Lippuner et al., 1994; Day et al., 1996).
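As a rough plausibility check on the 53°C annealing step, the four primer sequences given above can be summarized by length, GC content, and an approximate melting temperature. The sketch below uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which is a swapped-in textbook approximation best suited to short oligonucleotides and is not stated to be the method used by the authors; the function names are illustrative.

```python
# Sketch: basic primer statistics with the Wallace rule,
# Tm (deg C) ~ 2*(A+T) + 4*(G+C). This is a rough approximation and an
# assumption of this example, not a procedure described in the paper.
PRIMERS = {
    "5'": "TACGTAGGTTATTCTGAGCATTCC",
    "3'": "GTTAACCTGAGAAGTTTCAGAAGA",
    "43": "CCCAATAATGATCCAAATGC",
    "14": "CCATCAGCTTGCTTATATGC",
}

def gc_count(seq):
    """Number of G and C bases in the primer."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")

def wallace_tm(seq):
    """Approximate melting temperature in degrees Celsius."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"primer {name}: {len(seq)} nt, {gc_percent:.0f}% GC, "
          f"Tm ~ {wallace_tm(seq)} C")
```

Under this approximation primers 43 and 14 come out at about 56 and 58°C, a few degrees above the 53°C annealing temperature; the estimates for the two 24-nucleotide primers are less reliable because the rule was devised for shorter oligonucleotides.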

Protein Expression in Escherichia coli

cDNA clone 23 was digested with ApaI, and the ends were blunt ended with T4 polymerase and digested with HindIII to release the HindIII blunt-ended fragment. The pET28b plasmid with a HindIII site and a blunt end was prepared by digesting the plasmid with XhoI and blunting the ends, followed by a second digestion with HindIII. The cDNA


fragment (HindIII and blunt ended) was then inserted in frame into the pET28b plasmid that contained a HindIII site and a blunt end. A 2.5-kb (HindIII-SmaI) fragment from clone 24 was cloned into the pET28b plasmid (HindIII and blunt ended), prepared as given above. Each of these constructs was introduced into E. coli BL21(DE3) and grown in Luria-Bertani medium containing 30 µg/mL kanamycin. Soluble and insoluble protein from induced and uninduced cultures was prepared as described by Reddy et al. (1996a). Analysis of the soluble and pellet fractions of total protein extracts resulted in the recovery of a fusion protein exclusively in the pellet fractions, suggesting that the fusion protein is present in inclusion bodies. The protein was solubilized with sarkosyl- or urea-containing buffers and renatured by extensive dialysis to remove detergent.

Protein Gel Blot Analysis

Fusion proteins were separated on a 12% SDS-polyacrylamide gel and transblotted onto a nitrocellulose membrane by using a Bio-Rad transfer cell. The filters were blocked for 1 hr at 30°C in TBST (10 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 0.1% Tween-20) containing 3% gelatin. After rinsing with TBST, the filters were incubated with primary antibodies (undiluted U1 RNP-specific monoclonal antibodies or a 1:10,000 dilution of T7•Tag monoclonal antibodies in TBST) for 30 min. The filters were then washed and incubated with a secondary antibody coupled to horseradish peroxidase. Immunoreactive bands were detected colorimetrically by immersing the filters in a substrate solution (0.8 mg/mL diaminobenzidine, 0.4 mg/mL NiCl2, and 0.009% H2O2 in 100 mM Tris-HCl, pH 7.5).
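The dilution and concentrations quoted above translate into bench quantities with simple arithmetic. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, the 10-mL and 50-mL working volumes are arbitrary examples not given in the text, and a 1:10,000 dilution is read as one part stock in 10,000 parts total.

```python
# Illustrative sketch: helper names, the example volumes, and the
# "1 part in N total" reading of a 1:N dilution are assumptions.


def stock_volume_ul(final_volume_ml: float, dilution_factor: float) -> float:
    """Microliters of stock needed for a 1:dilution_factor dilution."""
    return final_volume_ml * 1000.0 / dilution_factor


def solute_mass_mg(conc_mg_per_ml: float, volume_ml: float) -> float:
    """Milligrams of solute for a given mg/mL concentration and volume."""
    return conc_mg_per_ml * volume_ml


if __name__ == "__main__":
    # Antibody stock for 10 mL of a 1:10,000 dilution in TBST
    print(stock_volume_ul(10, 10_000))
    # Diaminobenzidine and nickel chloride for 50 mL of substrate solution
    print(solute_mass_mg(0.8, 50))
    print(solute_mass_mg(0.4, 50))
```

With these example volumes, 1 µL of antibody stock, about 40 mg of diaminobenzidine, and about 20 mg of nickel chloride would be needed.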

Isolation of U1 snRNA Genes

Degenerate sense (U1-S, 5'-CTGGGTACCATACTTACCTGGA[CT]GGGTC[AGT][AT]T[GAT]G-3') and antisense (U1-A, 5'-TTGGAGCTCG[AG][ACT][AG]GGGGC[CT]GCGCGAAC[AG]CAGG-3') primers corresponding to the U1 small nuclear RNA (snRNA) were designed based on published plant and animal U1 snRNAs (Solymosy and Pollák, 1993) and used to amplify corresponding sequences from Arabidopsis. Sense and antisense primers contained, in addition to U1 snRNA sequences, restriction sites for KpnI and SacI, respectively. Genomic DNA (100 ng) was preheated to 70°C for 10 min before adding the reaction components. Thirty-five cycles of amplification were performed (94°C for 1 min, 45°C for 1 min, and 72°C for 1 min), with a final extension step at 72°C for 3 min. PCR products were purified by extracting with phenol-chloroform and precipitated with ethanol. After SacI-KpnI digestion, the DNA was separated by agarose gel (2%) electrophoresis, and a fragment of the expected length (160 bp) was extracted from the gel (Favre, 1992) and subcloned into SacI-KpnI sites of plasmid vector pBluescript KS+ (Stratagene). Both strands of the inserts of several clones were sequenced using M13 reverse and -20 primers.
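The degenerate primers can be inspected programmatically. The sketch below, which uses hypothetical helper names, counts how many distinct oligonucleotides each bracketed primer encodes and looks for the KpnI (GGTACC) and SacI (GAGCTC) recognition sequences in the non-degenerate stretches. The primer strings are copied from the text, with a bracketed group read as "any one of the listed bases"; that reading of the notation is an assumption.

```python
# Illustrative sketch: helper names are hypothetical; primer strings are taken
# from the text, with [XY] read as "any one of the listed bases".
import re

SENSE = "CTGGGTACCATACTTACCTGGA[CT]GGGTC[AGT][AT]T[GAT]G"
ANTISENSE = "TTGGAGCTCG[AG][ACT][AG]GGGGC[CT]GCGCGAAC[AG]CAGG"

# Recognition sequences of the two enzymes used for subcloning
SITES = {"KpnI": "GGTACC", "SacI": "GAGCTC"}

# A token is either a bracketed set of alternatives or a single fixed base
TOKEN = re.compile(r"\[([ACGT]+)\]|([ACGT])")


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a bracketed degenerate primer."""
    total = 1
    for choices, _single in TOKEN.findall(primer):
        total *= len(choices) if choices else 1
    return total


def sites_in_fixed_bases(primer: str) -> list:
    """Enzymes whose site lies within an uninterrupted run of fixed bases."""
    fixed_runs = re.split(r"\[[ACGT]+\]", primer)
    return [name for name, site in SITES.items()
            if any(site in run for run in fixed_runs)]


if __name__ == "__main__":
    for label, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
        print(label, degeneracy(primer), sites_in_fixed_bases(primer))
```

Read this way, the sense primer is a pool of 36 sequences carrying a GGTACC site and the antisense primer a pool of 48 sequences carrying a GAGCTC site, both located in the 5' extension ahead of the U1-derived sequence.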

In Vitro Transcription of U1 snRNA

One of the U1 snRNA clones was digested with SacI, the ends were blunted by T4 polymerase, and the cut plasmid was separated on an agarose gel. The plasmid DNA was purified and used as a template for an in vitro transcription reaction. U1 snRNA transcripts were synthesized with T3 RNA polymerase by using an in vitro transcription kit (Boehringer Mannheim). Fifty microcuries of 32P-UTP (Du Pont) was used to label U1 snRNA. The reaction product was purified by passing it through a Sephadex G25 column (Pharmacia) and precipitating the RNA with ethanol (Sambrook et al., 1989). The labeled U1 snRNA precipitate was dissolved in 100 µL of RNase-free water containing 2 µM of yeast tRNA and 20 to 40 units of RNasin (RNase inhibitor) and either used immediately or stored at -80°C until use.

Mobility Shift Assay

Total protein extract of induced and uninduced cultures of E. coli cells containing the short (clone 23) or long (clone 24) cDNA was prepared as given above. Binding of U1 snRNA to protein was performed in a reaction volume of 20 µL containing KHN buffer (20 mM Hepes, pH 7.9, 100 mM KCl, 0.05% Nonidet P-40), 1 µL of 32P-labeled U1 snRNA, and total protein extract from E. coli cells or the solubilized U1-70K protein fraction. The reaction mixture was incubated for 1 hr at 27°C, and the RNA-protein complex was separated on native polyacrylamide gels (5%) by using either glycine or TBE (45 mM Tris-borate, 1 mM EDTA) buffer. The gel was dried and exposed to x-ray film.

Protein-RNA Gel Blot Analysis

Proteins expressed in E. coli were fractionated in 10% SDS-polyacrylamide gels and transferred to a nitrocellulose membrane. The filter was blocked overnight at 4°C in blocking buffer (5× Denhardt's solution [1× Denhardt's solution is 0.02% Ficoll, 0.02% PVP, 0.02% BSA] in 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM EDTA) and then incubated in 10 mL of KHN buffer containing labeled U1 snRNA (2 × 10^5 cpm) for 1 hr at 28°C with gentle agitation (30 rpm). The membrane was washed several times (20 min each time) with KHN buffer in blocking buffer without Denhardt's solution. The wet filter was covered with Saran Wrap and exposed to x-ray film.

ACKNOWLEDGMENTS

We are grateful to Elliot Meyerowitz and Leonard Meldrano (California Institute of Technology, Pasadena, CA) for restriction fragment length polymorphism mapping of the U1-70K gene and to Dr. Gary Zieve (State University of New York at Stony Brook) for U1 RNP monoclonal antibodies. We thank Patricia Bedinger, Irene Day, and Jonathan Bowser for critical reading of the manuscript and Eugenya Guseynova for help with manuscript preparation. Data base searches were performed at the National Center for Biotechnology Information by using the BLAST network service. This work was supported by a U.S. Department of Agriculture National Research Initiative Competitive grant (No. 9400849) to A.S.N.R.

Received March 25, 1996; accepted May 30, 1996.

REFERENCES

Abel, S., Kiss, T., and Solymosy, F. (1989). Molecular analysis of eight U1 RNA gene candidates from tomato that could potentially be transcribed into U1 RNA sequence variants differing from each other in similar regions of secondary structure. Nucleic Acids Res. 17, 6319-6337.

Andersen, J., and Zieve, G.W. (1991). Assembly and intracellular transport of snRNP particles. Bioessays 13, 57-64.

Bandziulis, R.J., Swanson, M.S., and Dreyfuss, G. (1989). RNA-binding proteins as developmental regulators. Genes Dev. 3, 431-437.

Black, D.L., Chabot, B., and Steitz, J.A. (1985). U2 as well as U1 small nuclear ribonucleoproteins are involved in premessenger RNA splicing. Cell 42, 737-750.

Boggs, R.T., Gregor, S., Idriss, S., Belote, J.M., and McKeown, M. (1987). Regulation of sexual differentiation in Drosophila melanogaster via alternative splicing of RNA from the transformer gene. Cell 50, 739-747.

Brown, J.W.S., Feix, G., and Frendewey, D. (1986). Accurate in vitro splicing of two pre-mRNA plant introns in a HeLa cell nuclear extract. EMBO J. 5, 2749-2758.

Cavener, D.R., and Ray, S.C. (1991). Eukaryotic start and stop translation sites. Nucleic Acids Res. 19, 3185-3192.

Chang, C., Bowman, J.L., DeJohn, A.W., Lander, E.S., and Meyerowitz, E.M. (1988). Restriction fragment length polymorphism linkage map for Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 85, 6856-6860.

Collins, T., Bonthron, D.T., and Orkin, S.H. (1987). Alternative RNA splicing affects function of encoded platelet derived growth factor A chain. Nature 328, 621-624.

Day, I., Reddy, A.S.N., and Golovkin, M. (1996). Isolation of a new mitotic cyclin from Arabidopsis: Complementation of a yeast cyclin mutant with a plant cyclin. Plant Mol. Biol. 30, 565-575.

Dietrich, M.A., Prenger, J.P., and Guilfoyle, T.J. (1990). Analysis of the genes encoding the largest subunit of RNA polymerase II in Arabidopsis and soybean. Plant Mol. Biol. 15, 207-223.

Egeland, D.B., Sturtevant, A.P., and Schuler, M.A. (1989). Molecular analysis of dicot and monocot small nuclear RNA populations. Plant Cell 1, 633-643.

Etzerodt, M., Vignali, R., Ciliberto, G., Scherly, D., Mattaj, I.W., and Philipson, L. (1988). Structure and expression of a Xenopus gene encoding an snRNP protein (U1 70K). EMBO J. 7, 4311-4321.

Favre, D. (1992). Improved phenol-based method for the isolation of DNA fragments from low melting temperature agarose gels. BioTechniques 13, 25-26.

Fu, X.-D., and Maniatis, T. (1992). Isolation of a complementary DNA that encodes the mammalian splicing factor SC35. Science 256, 535-538.

Ge, H., and Manley, J.L. (1991). A protein factor, ASF, controls cell-specific alternative splicing of SV40 early pre-mRNA in vitro. Cell 62, 25-34.

Ge, H., Zuo, P., and Manley, J.L. (1991). Primary structure of the human splicing factor ASF reveals similarities with Drosophila regulators. Cell 66, 373-382.

Goodall, G.J., and Filipowicz, W. (1989). The AU-rich sequences present in the introns of plant nuclear pre-mRNAs are required for splicing. Cell 58, 473-483.

Goodall, G.J., and Filipowicz, W. (1991). Different effects of intron nucleotide composition and secondary structure on pre-mRNA splicing in monocot and dicot plants. EMBO J. 10, 2635-2644.

Goodall, G.J., Kiss, T., and Filipowicz, W. (1991). Nuclear RNA splicing and small nuclear RNAs and their genes in higher plants. Oxf. Surv. Plant Mol. Cell Biol. 7, 255-296.

Goralski, T.J., Edström, J.-E., and Baker, B.S. (1989). The sex determination locus transformer-2 of Drosophila encodes a polypeptide with similarity to RNA binding proteins. Cell 56, 1011-1018.

Görlach, J., Raesecke, H.-R., Abel, G., Wehrli, R., Amrhein, N., and Schmid, J. (1995). Organ-specific differences in the ratio of alternatively spliced chorismate synthase (LeCS2) transcripts in tomato. Plant J. 8, 451-456.

Green, M.R. (1991). Biochemical mechanisms of constitutive and regulated pre-mRNA splicing. Annu. Rev. Cell Biol. 7, 559-599.

Grotewold, E., Athma, P., and Peterson, T. (1991). Alternatively spliced products of the maize P gene encode proteins with homology to the DNA-binding domain of Myb-like transcription factors. Proc. Natl. Acad. Sci. USA 88, 4587-4591.

Guthrie, C. (1991). Messenger RNA splicing in yeast: Clues to why the spliceosome is a ribonucleoprotein. Science 253, 157-163.

Hanley, B.A., and Schuler, M.A. (1991a). cDNA cloning of U1, U2, U4 and U5 snRNA families expressed in pea nuclei. Nucleic Acids Res. 19, 1861-1869.

Hanley, B.A., and Schuler, M.A. (1991b). Developmental expression of plant snRNAs. Nucleic Acids Res. 19, 6319-6325.

Hartmuth, K., and Barta, A. (1986). In vitro processing of a plant pre-mRNA in HeLa cell nuclear extract. Nucleic Acids Res. 14, 7513-7528.

Heinrichs, V., Bach, M., Winkelmann, G., and Lührmann, R. (1990). U1-specific protein C needed for efficient complex formation of U1 snRNP with a 5' splice site. Science 247, 69-72.

Hilleren, P.J., Kao, H.Y., and Siliciano, P.G. (1995). The amino-terminal domain of yeast U1-70K is necessary and sufficient for function. Mol. Cell. Biol. 15, 6341-6350.

Hirose, T., Sugita, M., and Sugiura, M. (1993). cDNA structure, expression and nucleic acid-binding properties of three RNA-binding proteins in tobacco: Occurrence of tissue specific alternative splicing. Nucleic Acids Res. 21, 3981-3987.

Hornig, H., Fischer, U., Costas, M., Rauh, A., and Lührmann, R. (1989). Analysis of genomic clones of the murine U1 RNA-associated 70-kDa protein reveals a high evolutionary conservation of the protein between human and mouse. Eur. J. Biochem. 182, 45-50.

Hunt, A.G. (1994). Messenger RNA 3' end formation in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 45, 47-60.

Kenan, D.J., Query, C.C., and Keene, J.D. (1991). RNA recognition: Towards identifying determinants of specificity. Trends Biochem. Sci. 16, 214-220.

Kohtz, J.D., Jamison, S.F., Will, C.L., Zuo, P., Lührmann, R., Garcia-Blanco, M.A., and Manley, J.L. (1994). Protein-protein interactions and 5'-splice-site recognition in mammalian mRNA precursors. Nature 368, 119-124.

Kopriva, S., Cossu, R., and Bauwe, H. (1995). Alternative splicing results in two different transcripts for H-protein of the glycine cleavage system in the C4 species Flaveria trinervia. Plant J. 8, 435-441.

Kozak, M. (1991). An analysis of vertebrate mRNA sequences: Intimations of translational control. J. Cell Biol. 115, 887-903.

Krainer, A.R., Conway, G.C., and Kozak, D. (1990). Purification and characterization of pre-mRNA splicing factor SF2 from HeLa Cells. Genes Dev. 4, 1158-1171.

Krainer, A.R., Mayeda, A., Kozak, D., and Binns, G. (1991). Functional expression of cloned human splicing factor SF2: Homology to RNA-binding proteins, U1 70K and Drosophila splicing regulators. Cell 66, 383-394.

Kuo, H., Nasim, F.H., and Grabowski, P.J. (1991). Control of alternative splicing by differential binding of U1 small nuclear ribonucleoprotein particles. Science 251, 1045-1050.

Laski, F.A., Rio, D.C., and Rubin, G.M. (1986). Tissue specificity of Drosophila P element transposition is regulated at the level of mRNA splicing. Cell 44, 7-19.

Lippuner, V., Chou, I.T., Scott, S.V., Ettinger, W.F., Theg, S.M., and Gasser, C.S. (1994). Cloning and characterization of chloroplast and cytosolic forms of cyclophilin from Arabidopsis thaliana. J. Biol. Chem. 269, 7863-7868.

Lou, H., McCullough, A.J., and Schuler, M.A. (1993). 3' Splice site selection in dicot plant nuclei is position dependent. Mol. Cell. Biol. 13, 4485-4493.

Luehrsen, K.R., and Walbot, V. (1994a). Addition of A- and U-rich sequences increases the splicing efficiency of a deleted form of a maize intron. Plant Mol. Biol. 24, 449-463.

Luehrsen, K.R., and Walbot, V. (1994b). Intron creation and polyadenylation in maize are directed by AU-rich RNA. Genes Dev. 8, 1117-1130.

Luehrsen, K.R., Taha, S., and Walbot, V. (1994). Nuclear pre-mRNA processing in higher plants. Prog. Nucleic Acids Res. 47, 149-193.

Lührmann, R. (1988). snRNP proteins. In Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles, M.L. Birnstiel, ed (Berlin: Springer-Verlag), pp. 71-99.

Lührmann, R., Kastner, B., and Bach, M. (1990). Structure of spliceosomal snRNPs and their role in pre-mRNA splicing. Biochim. Biophys. Acta 1087, 265-292.

Mancebo, R., Lo, P.C.H., and Mount, S.M. (1990). Structure and expression of the Drosophila melanogaster gene for the U1 small nuclear ribonucleoprotein particle 70K protein. Mol. Cell. Biol. 10, 2492-2502.

McCullough, A.J., Lou, H., and Schuler, M.A. (1993). Factors affecting authentic 5' splice site selection in plant nuclei. Mol. Cell. Biol. 13, 1323-1331.

Mount, S.M., Pettersson, I., Hinterberger, M., Karmas, A., and Steitz, J.A. (1983). The U1 small nuclear RNA-protein complex selectively binds a 5' splice site in vitro. Cell 33, 509-518.

Musci, M.A., Egeland, D.B., and Schuler, M.A. (1992). Molecular comparison of monocot and dicot U1 and U2 snRNAs. Plant J. 2, 589-599.

Nelissen, R.L.H., Will, C.L., van Venrooij, W.J., and Lührmann, R. (1994). The association of the U1-specific 70K and C proteins with U1 snRNPs is mediated in part by common U snRNP proteins. EMBO J. 13, 4113-4125.

Proudfoot, N.J., and Brownlee, G.G. (1976). 3' Non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214.

Query, C.C., and Keene, J.D. (1987). A human autoimmune protein associated with U1 RNA contains a region of homology that is cross-reactive with retroviral p30gag antigen. Cell 51, 211-220.

Query, C.C., Bentley, R.C., and Keene, J.D. (1989). A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein. Cell 57, 89-101.

Reddy, A.S.N., Czernik, A., An, G., and Poovaiah, B.W. (1992). Cloning of the cDNA for U1 small nuclear ribonucleoprotein particle 70K protein from Arabidopsis thaliana. Biochim. Biophys. Acta 1171, 88-92.

Reddy, A.S.N., Safadi, F., Beyette, J., and Mykles, D.L. (1994). Calcium-dependent proteinase activity in root cultures of Arabidopsis. Biochem. Biophys. Res. Commun. 199, 1089-1095.

Reddy, A.S.N., Safadi, F., Narasimhulu, S.B., Golovkin, M., and Hu, X. (1996a). A novel plant calmodulin-binding protein with a kinesin heavy chain motor domain. J. Biol. Chem. 271, 7052-7060.

Reddy, A.S.N., Narasimhulu, S.B., Safadi, F., and Golovkin, M. (1996b). A plant kinesin heavy chain-like protein is a calmodulin-binding protein. Plant J. 6, in press.

Romac, J.M.J., and Keene, J.D. (1995). Overexpression of the arginine-rich carboxy-terminal region of U1 snRNP 70K inhibits both splicing and nucleocytoplasmic transport of mRNA. Genes Dev. 9, 1400-1410.

Romac, J.M.J., Graff, D.H., and Keene, J.D. (1994). The U1 small nuclear ribonucleoprotein (snRNP) 70K protein is transported independently of U1 snRNP particles via a nuclear localization signal in the RNA-binding domain. Mol. Cell. Biol. 14, 4662-4670.

Rosbash, M., and Séraphin, B. (1991). Who's on first? The U1 snRNP-5' splice site interaction and splicing. Trends Biochem. Sci. 16, 187-190.

Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd ed. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory).

Schwarzbauer, J.E., Patel, R.S., Fonda, D., and Hynes, R.O. (1987). Multiple sites of alternative splicing of the rat fibronectin gene transcript. EMBO J. 6, 2573-2580.

Sharp, P.A. (1994). Split genes and RNA splicing. Cell 77, 805-815.

Simpson, G.G., Clark, G.P., Rothnie, H.M., Boelens, W., van Venrooij, W., and Brown, J.W.S. (1995). Molecular characterization of spliceosomal proteins U1A and U2B from higher plants. EMBO J. 14, 4540-4550.

Smith, C.W.J., Patton, J.G., and Nadal-Ginard, B. (1989). Alternative splicing in the control of gene expression. Annu. Rev. Genet. 23, 527-577.

Smith, V., and Barrell, B.G. (1991). Cloning of a yeast U1 snRNP 70K protein homologue: Functional conservation of an RNA-binding domain between humans and yeast. EMBO J. 10, 2627-2634.

Solymosy, F., and Pollák, T. (1993). Uridylate-rich small nuclear RNAs (UsnRNAs), their genes and pseudogenes, and UsnRNPs in plants: Structure and function. A comparative approach. CRC Crit. Rev. Plant Sci. 12, 275-369.

Spritz, R.A., Strunk, K., Surowy, C.S., Hoch, S.O., Barton, D.E., and Francke, U. (1987). The human U1-70K snRNP protein: cDNA cloning, chromosomal localization, expression, alternative splicing and RNA-binding. Nucleic Acids Res. 15, 10373-10391.

Tamaoki, M., Tsugawa, H., Minami, E., Kayano, T., Yamamoto, N., Kano-Murakami, Y., and Matsuoka, M. (1995). Alternative RNA products from a rice homeobox. Plant J. 7, 927-938.

Tazi, J., Kornstadt, U., Rossi, F., Jeanteur, P., Cathala, G., Brunel, C., and Lührmann, R. (1993). Thiophosphorylation of U1-70K protein inhibits pre-mRNA splicing. Nature 363, 283-286.

Theissen, H., Etzerodt, M., Reuter, R., Schneider, C., Lottspeich, F., Argos, P., Lührmann, R., and Philipson, L. (1986). Cloning of the human cDNA for the U1 RNA-associated 70K protein. EMBO J. 5, 3209-3217.

Tian, M., and Maniatis, T. (1992). Positive control of pre-mRNA splicing in vitro. Science 256, 237-240.

van Santen, V.L., and Spritz, R.A. (1987a). Splicing of plant pre-mRNAs in animal systems and vice versa. Gene 56, 253-265.

van Santen, V.L., and Spritz, R.A. (1987b). Nucleotide sequence of a bean (Phaseolus vulgaris) U1 small nuclear RNA gene: Implications for plant pre-mRNA splicing. Proc. Natl. Acad. Sci. USA 84, 9094-9098.


Vaux, P., Guerineau, F., Waugh, R., and Brown, J.W.S. (1992). Characterization and expression of U1 snRNA genes from potato. Plant Mol. Biol. 19, 959-971.

Werneke, J.M., Chatfield, J.M., and Ogren, W.L. (1989). Alternative mRNA splicing generates the two ribulosebisphosphate carboxylase/oxygenase activase polypeptides in spinach and Arabidopsis. Plant Cell 1, 815-825.

Wiebauer, K., Herrero, J.J., and Filipowicz, W. (1987). Nuclear pre-mRNA processing in plants: Distinct modes of 3' splice site selection in plants and animals. Mol. Cell. Biol. 8, 2042-2051.

Woppmann, A., Will, C.L., Kornstadt, U., Zuo, P., Manley, J., and Lührmann, R. (1993). Identification of an snRNP-associated kinase activity that phosphorylates arginine/serine-rich domains typical of splicing factors. Nucleic Acids Res. 21, 2815-2822.

Wu, J.Y., and Maniatis, T. (1993). Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75, 1061-1070.

Wu, L., Ueda, T., and Messing, J. (1995). The formation of mRNA 3'-ends in plants. Plant J. 8, 323-329.

Yuo, C.Y., and Weiner, A.M. (1990). Genetic analysis of the role of human U1 snRNA in mRNA splicing: I. Effect of mutations in the highly conserved stem-loop I of U1. Genes Dev. 3, 697-707.

Zamore, P.D., Patton, J.G., and Green, M.R. (1992). Cloning and domain structure of the mammalian splicing factor U2AF. Nature 355, 609-614.

Zhang, M., Zamore, P.D., Carmo-Fonseca, M., Lamond, A.I., and Green, M.R. (1992). Cloning and intracellular localization of the U2 small nuclear ribonucleoprotein auxiliary factor small subunit. Proc. Natl. Acad. Sci. USA 89, 8769-8773.
