
cDNA sequence of the pregnancy-specific β1-glycoprotein-11s (PSG-11s)


Biochimica et Biophysica Acta, 1131 (1992) 119-121 © 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$05.00

BBAEXP 90353 Short Sequence-Paper


Brigid K. Brophy, Rachel E. MacDonald, Patricia A. McLenachan, Brian C. Mansfield Department of Microbiology and Genetics, Massey University, Palmerston North (New Zealand)

(Received 24 February 1992)

Key words: Pregnancy-specific β1-glycoprotein-11s; cDNA sequence; RGD peptide motif

Four cDNA clones representing the human pregnancy-specific β1-glycoprotein-11 (PSG-11) gene have been characterised. All encoded a splice variant of the PSG-11 gene designated PSG-11s, which can encode a secreted protein of 426 amino acids, containing six potential N-linked glycosylation sites, with a domain structure L-N-AI-AII-BII-C. Minor differences between the four clones sequenced included a restriction site polymorphic for ApaI that may differentiate between alleles of the PSG-11 gene.

The human pregnancy-specific β1-glycoproteins (PSG) form a family of glycoproteins which are expressed in the placenta throughout gestation [1]. Early biochemical studies demonstrated the presence of at least three distinct but immunologically related proteins of 54-72 kDa, which had carbohydrate contents ranging from 28 to 32% [2]. The isolation of cDNA and genomic clones representing the PSG is now revealing the complexity of the PSG family [3]. There are at least 17 transcribed genes clustered on chromosome 19, region q13.2-13.3 [3,4]. Each gene is composed of at least five exons. The first exons encode the leader sequence (L) and N-terminal domain (N) of the protein. Subsequent exons encode the central protein domains designated AI, BI, AII and BII, while the final exon(s) encode the C-terminal domain (C). While the N-terminal and central domains are highly conserved and share over 90% nucleotide identity between genes, the C-terminal domains are of variable length, encoding protein domains of 2-26 amino acids, and are of marked sequence diversity [5,12]. The complexity of the protein family is increased further by alternative splicing of the transcripts of each gene [12].

The biological roles of the PSG are not known. The high maternal serum levels during normal pregnancy and the correlation of reduced serum levels with threatened abortion do, however, imply an important role during pregnancy [6]. Characterisation of the biological activity of the PSG is complicated by the high percent identity between proteins, which prevents the biochemical purification of individual proteins. As a step towards determining the biological role of individual PSG, we have sought to clone and characterise the genes.

We have previously reported the isolation of eight classes of cDNA clones related to the PSG genes [7]. Sequence analysis of representative clones from two of these classes, hL6 and hL19, has now identified both as encoding a splice variant of the PSG-11 gene, designated PSG-11s [3]. Neither of these clones represented the entire open reading frame of the transcript, each lacking 5' terminal sequences. Both had, however, been fused, via an EcoRI linker used in cloning, to an unrelated cDNA clone [8], leading to their initial assignment as different PSG cDNAs. To isolate the full-length clone, an oligonucleotide unique to the 3' end of the PSG-11s cDNA (Fig. 1) was used to rescreen the human first trimester placenta cDNA library. Two longer clones, hL6R3 and hBB5, were identified and sequenced. The clone hBB5 contained a complete open reading frame and 71 bp of 5' untranslated sequence, while hL6R3 had a slightly truncated open reading frame but a longer poly(A) tract.

The sequence of hBB5, PSG-11s (Fig. 1), predicts a 426 amino acid protein containing six potential sites for N-linked glycosylation. However, the protein domain structure differs from that predicted for another reported PSG-11 splice variant, PSG-11w [10]. For PSG-11s, a leader sequence (L) is followed by an N, AI, AII, BII and Cs domain.

The sequence of clone hBB5 has been submitted to the EMBL/GenBank Data Libraries under accession number M58591.

Correspondence: B.C. Mansfield, Department of Microbiology and Genetics, Massey University, Palmerston North, New Zealand.

Fig. 1. Nucleotide and predicted amino acid sequence of the cDNA clone hBB5 encoding PSG-11s. Variations in the nucleotide and predicted amino acid sequence found in three other cDNA clones, hL6R3, hL6 and hL19, are shown above and below the sequence, respectively. The ends of the three incomplete cDNA clones are bracketed. The oligonucleotide sequence used to isolate hBB5 and the poly(A) addition consensus sequences (AATAAA) are underlined. Potential N-linked glycosylation sites (consensus N-X-S/T) and the RGD motif are underlined in bold.

Predicted amino acid sequence of hBB5 (PSG-11s), 426 residues; domains labelled in the figure, in order: Signal, N, AI, AII, BII, C.

MGPFPAPSCT
QRITWKGLLLTASLLNFWNPPTTAEVTIEAQPP
KVSEGKDVLLLVHNLPQNLPGYFWYKGEMTDLYH
YIISYIVDGKIIIYGPAYSGRETVYSNASLLIQ
NVTRKDAGTYTLHIIKRGDETREEIRHFTFTLY
LETPKPYISSSNLNPREAMEAVRLICDPETLDAS
YLWWMNGQSLPVTHRLQLSKTNRTLYLFGVTKY
IAGPYECEIRNPVSASRSDPVTLNLLPKLPIPY
ITINNLNPRENKDVLAFTCEPKSENYTYIWWLNG
QSLPVSPGVKRPIENRILILPSVTRNETGPYQC
EIRDRYGGLRSNPVILNVLYGPDLPRIYPSFTY
YRSGENLDLSCFTESNPPAEYFWTINGKFQQSGQ
KLFIPQITRNHSGLYACSVHNSATGKEISKSMT
VKVSGPCHGDLTESQS

For PSG-11w the structure is L, N, AI, BII, Cw. Within the N domain an RGD peptide motif, characteristic of many adhesive proteins in the extracellular matrix (reviewed in Ref. 9), is flanked by a markedly acidic sequence, ETREE. In the integrin receptor family, the RGD motif has been shown to form an essential part of the recognition site and the acidic environment of this motif in PSG-11s would suggest that PSG-11s may also participate in protein-protein interactions.
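The sequence features described for PSG-11s (six N-X-S/T sequons, and an RGD motif followed directly by ETREE) can be checked with a short scan of the predicted protein. The following is a minimal sketch, not the authors' method: the sequence is transcribed from the Fig. 1 translation of hBB5 and may carry transcription errors, the function names are hypothetical, and the common convention that X is any residue except proline is an assumption not stated in the text.

```python
import re

# Predicted PSG-11s protein transcribed from the Fig. 1 translation of hBB5.
# Illustrative only; the deposited record (accession M58591) is authoritative.
PSG11S = (
    "MGPFPAPSCT"
    "QRITWKGLLLTASLLNFWNPPTTAEVTIEAQPP"
    "KVSEGKDVLLLVHNLPQNLPGYFWYKGEMTDLYH"
    "YIISYIVDGKIIIYGPAYSGRETVYSNASLLIQ"
    "NVTRKDAGTYTLHIIKRGDETREEIRHFTFTLY"
    "LETPKPYISSSNLNPREAMEAVRLICDPETLDAS"
    "YLWWMNGQSLPVTHRLQLSKTNRTLYLFGVTKY"
    "IAGPYECEIRNPVSASRSDPVTLNLLPKLPIPY"
    "ITINNLNPRENKDVLAFTCEPKSENYTYIWWLNG"
    "QSLPVSPGVKRPIENRILILPSVTRNETGPYQC"
    "EIRDRYGGLRSNPVILNVLYGPDLPRIYPSFTY"
    "YRSGENLDLSCFTESNPPAEYFWTINGKFQQSGQ"
    "KLFIPQITRNHSGLYACSVHNSATGKEISKSMT"
    "VKVSGPCHGDLTESQS"
)


def find_sequons(protein):
    """Return (1-based position, tripeptide) for each N-X-S/T sequon.

    Assumption: X may be any residue except proline. The lookahead keeps
    the match zero-width after N, so overlapping sequons are not missed.
    """
    return [
        (m.start() + 1, protein[m.start():m.start() + 3])
        for m in re.finditer(r"N(?=[^P][ST])", protein)
    ]


def find_motif(protein, motif):
    """Return the 1-based start positions of every occurrence of motif."""
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() + 1 for m in re.finditer(pattern, protein)]


if __name__ == "__main__":
    print("length:", len(PSG11S))
    for position, tripeptide in find_sequons(PSG11S):
        print("sequon", tripeptide, "at residue", position)
    print("RGD at residue(s):", find_motif(PSG11S, "RGD"))
```

Positions are counted from the initiator methionine, so they include the leader sequence. On this transcription the scan finds six sequons and a single RGD, in agreement with the counts given in the text.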

The sequence of PSG-11s is identical to that of PSG-11w for the L, N, AI and BII domains, but the C-domains differ markedly. In PSG-11s an 11 amino acid sequence, similar in length to most other reported PSG C-domains, is predicted. In PSG-11w, alternative splicing selects a C-domain exon predicted to encode 80 amino acids. The lengths of the 3' untranslated regions also differ. In PSG-11w this region extends 658 bp and contains a weak poly(A) addition consensus sequence, ATTAAA, 14 bp upstream of the poly(A) addition site. For three of the four PSG-11s cDNA isolates, the 3' untranslated region extends 335 bp, with a poly(A) addition consensus sequence 28 bp from the poly(A) tract. In one isolate, hL6R, the region is truncated to 80 bp, with a poor poly(A) addition sequence, ATGAAA, 25 bp from the poly(A) tract. This may reflect alternative poly(A) addition sites or internal priming by the oligo(dT) primer during cDNA synthesis.

The DNA sequences of the four PSG-11s isolates differ at just three positions from each other. At position 710, a transversion in the hL6R sequence creates a restriction polymorphism for ApaI, but does not affect the predicted protein sequence. A transversion at position 1009 in hL6R3 creates a change in the predicted sequence of the AI protein domain identical to that seen in the closely related PSβGA [11]. While the sequence of clone hBB5 stands apart from the other three clones at position 758, it does match the sequence of PSG-11w, suggesting that at least some of the differences observed may reflect allelic variation in the PSG-11 gene.
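The position 710 polymorphism can be illustrated by showing how a single base change creates an ApaI site (GGGCCC) while leaving the encoded residues unchanged. The sketch below rests on assumptions: the 18-nucleotide fragment is transcribed from positions 702-719 of the hBB5 sequence in Fig. 1, the substituted base (G) is read from the annotation above position 710 in that figure rather than stated in the text, and all function names are hypothetical.

```python
APAI_SITE = "GGGCCC"  # ApaI recognition sequence (palindromic)

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: nucleotides 702-719 of hBB5 as transcribed from Fig. 1,
# starting in frame at the first base of the isoleucine codon.
FRAGMENT_START = 702
HBB5_FRAGMENT = "ATTGCAGGACCCTATGAA"


def substitute(seq, position, base, start=FRAGMENT_START):
    """Return seq with the base at the given cDNA position replaced."""
    i = position - start
    return seq[:i] + base + seq[i + 1:]


def apai_sites(seq, start=FRAGMENT_START):
    """Return the cDNA positions at which an ApaI site begins."""
    last = len(seq) - len(APAI_SITE)
    return [start + i for i in range(last + 1) if seq.startswith(APAI_SITE, i)]


def translate(seq):
    """Translate an in-frame sequence, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    variant = substitute(HBB5_FRAGMENT, 710, "G")  # assumed hL6R allele
    print("hBB5   :", HBB5_FRAGMENT, apai_sites(HBB5_FRAGMENT), translate(HBB5_FRAGMENT))
    print("variant:", variant, apai_sites(variant), translate(variant))
```

Under these assumptions the change falls in the third position of a glycine codon (GGA to GGG), which is why the restriction site appears without altering the predicted protein.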

We are currently expressing PSG-11s to obtain sufficient protein for studies of potential biological activities. A genomic clone encoding PSG-11 has also been isolated. Part of this sequence (GenBank accession number M69025) demonstrates that a splice donor site at the end of the BII domain is used to splice to a separate Cs exon in PSG-11s, but is read through and not utilised in PSG-11w.

This research was supported in part by a project grant from the New Zealand Medical Research Council.

References

1 Oikawa, S., Inuzuka, C., Kuroki, M., Matsuoka, Y., Kosaki, G. and Nakazato, H. (1989) Biochem. Biophys. Res. Commun. 163, 1021-1031.

2 Watanabe, S. and Chou, J.Y. (1988) J. Biol. Chem. 263, 2049-2054.

3 Barnett, T. and Zimmerman, W. (1990) Tumor Biol. 11, 59-63.

4 Thompson, J., Koumari, R., Wagner, K., Barnet, S., Schleussner, C., Schrewe, H., Zimmerman, W., Muller, G., Schemp, W., Zaninetta, D., Ammaturo, D. and Hardman, N. (1990) Biochem. Biophys. Res. Commun. 167, 848-859.

5 Thompson, J.A., Mauch, E-M., Chen, F-S., Hinoda, Y., Schrewe, H., Berling, B., Barnert, S., Von Kleist, S., Shiveley, J.E. and Zimmerman, W. (1989) Biochem. Biophys. Res. Commun. 158, 996-1004.

6 Bischof, P. (1984) Contrib. Gyn. 12, 1-96.

7 McLenachan, T. and Mansfield, B. (1989) Biochem. Biophys. Res. Commun. 162, 1486-1493.

8 Hale, T.K. and Mansfield, B.C. (1989) Nucleic Acids Res. 17, 10112.

9 Springer, T.A. (1990) Nature 346, 425-434.

10 Zheng, Q-X., Tease, L.A., Shupert, W.L. and Chan, W-Y. (1990) Biochemistry 29, 2845-2852.

11 Streydio, C., Swillens, S., Georges, M., Szpirer, C. and Vassart, G. (1990) Genomics 6, 579-592.

12 Streydio, C., Lacka, K., Swillens, S. and Vassart, G. (1988) Biochem. Biophys. Res. Commun. 154, 130-137.