JOURNAL OF VIROLOGY, May 1990, p. 2073-2081
0022-538X/90/052073-09$02.00/0
Copyright © 1990, American Society for Microbiology
Vol. 64, No. 5

Multigene Families in African Swine Fever Virus: Family 360

ANTONIO GONZALEZ,† VICTOR CALVO,‡ FERNANDO ALMAZAN, JOSE M. ALMENDRAL, JUAN CARLOS RAMIREZ, INMACULADA DE LA VEGA, RAFAEL BLASCO,§ AND ELADIO VIÑUELA*

Centro de Biologia Molecular (Consejo Superior de Investigaciones Cientificas-UAM), Facultad de Ciencias, Universidad Autónoma, Canto Blanco, 28049 Madrid, Spain

Received 15 September 1989/Accepted 2 January 1990

* Corresponding author.
† Present address: Instituto de Parasitologia "López Neyra," CSIC c/ Ventanilla, 11, 18001 Granada, Spain.
‡ Present address: Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02115.
§ Present address: Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892.

A group of cross-hybridizing DNA segments contained within the restriction fragments RK', RL, RJ, and RD' of African swine fever virus DNA were mapped and sequenced. Analysis of these sequences revealed the presence of a family of homologous open reading frames in regions close to the DNA ends. The whole family is composed of six open reading frames with an average length of 360 coding triplets (multigene family 360), four of which are located in the left part of the genome and two of which are in the right terminal EcoRI fragment. In close proximity to the right terminal inverted repeat, we found an additional small open reading frame which was homologous to the 5'-terminal portion of the other open reading frames, suggesting that most of that open reading frame has been deleted. These repeated sequences account for the previously described inverted internal repetitions (J. M. Sogo, J. M. Almendral, A. Talavera, and E. Viñuela, Virology 133:271-275, 1984). Most of the genes of multigene family 360 are transcribed in African swine fever virus-infected cells. A comparison of the predicted protein sequences of family 360 indicated that several residues are conserved, suggesting that an overall structure is maintained for every member of the family. The transcription direction of each open reading frame, as well as the evolutionary relationships among the genes, suggests that the family originated by gene duplication and translocation of sequences between the DNA ends.

African swine fever (ASF) virus DNA is a double-stranded molecule of about 170 kilobase pairs. This molecule shares several structural features with the DNA of poxviruses, such as the presence of hairpin loop structures at the DNA ends (13) and terminal inverted repetitions (TIR) (2).

Sogo et al. (26) have described the presence of internal inverted repetitions which consist of about 0.13 kilobases (kb) of sequence located 2.5 kb from the left TIR and 0.4 kb from the right TIR in the terminal fragments RK' and RD'. ASF virus DNA can vary as much as 20 kb in length by means of deletions or additions in regions located close to the DNA ends (4).

In the accompanying paper (1), we described both a number of cross-hybridizations between ASF virus DNA restriction fragments and the DNA sequence of a multigene family (family 110) located in several small EcoRI fragments. In this paper, we describe a second group of cross-hybridizing fragments which includes restriction fragments RK', RL, RJ, and RD'. The DNA sequences obtained from these fragments revealed the existence in ASF virus DNA of a multigene family (family 360) which is not homologous to multigene family 110.

MATERIALS AND METHODS

Cells, viruses, and recombinant clones. The Vero cell line (CCL 81) was obtained from the American Type Culture Collection (Rockville, Md.). BA71V strain of ASF virus, derived from the BA71 field isolate by adaptation to Vero cell cultures, has been previously described (7). Recombinant clones containing restriction fragments of BA71V DNA (15) were used.

Oligonucleotide synthesis. Oligodeoxynucleotides were synthesized by the phosphoramidite method in an Applied Biosystems 381A DNA synthesizer. The oligonucleotides were subsequently subjected to polyacrylamide gel electrophoresis, and the bands corresponding to full-size products were excised and eluted.

DNA sequencing. DNA sequencing was performed by either the chemical degradation procedure (17) or the chain termination procedure (24). Nucleotide sequences were obtained from both strands of ASF virus DNA restriction fragments cloned in pUC plasmids or M13mp phages (18, 19).

Northern (RNA) blotting. RNA from Vero cells infected with the BA71 strain of ASF virus was purified and probed with 5'-end-labeled oligonucleotides as described previously (1). The oligonucleotides were specific for genes K'360 (GACTTATACTTTTCTTCATCTAGTAAGGCG), K'362 (ATCTCGCCCACCGTATTATTTTCGGACACA), L356 (ACAATGATAGAATAACCGTATATATCTGCT), J319 (ACTCTCGATGAATCTCCTCTCCCATTTCCT), D'311 (CAAGGTATCTGAAAGATGTATAAGAGCATC), and D'363 (CCCTTTTGTTTCGCCAGGGTCTTACCCTCT).

Computer analysis. Compilation and routine analysis of DNA sequences were performed with the software package of the University of Wisconsin genetics computer group (5) in a DEC VAX 730 computer running under the VMS operating system. The programs COMPARE, GAP, and BESTFIT were used to carry out dot matrix comparisons (16), global alignments of amino acid sequences (21), and alignment of homologous DNA sequences (25). Multiple alignments were carried out by the progressive alignment method of Feng and Doolittle (10). Searches of the protein data bank of the National Biomedical Research Foundation were done with the WORDSEARCH program of the University of Wisconsin genetics computer group package (29) and the FASTA program (22).
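The gene-specific probes listed under Northern blotting can be sanity-checked with a few lines of code. The sketch below is illustrative only and is not part of the original methods: it takes the six probe sequences as printed (with line-break hyphens and spaces removed), confirms that each is 30 nucleotides long, and computes the reverse complement, that is, the strand each oligonucleotide would base-pair with. The function names are hypothetical, and whether the probes are antisense to the mRNA is an assumption that the text does not state explicitly.

```python
# Probe sequences copied from the Northern blotting paragraph
# (line-break hyphens and spaces removed).
PROBES = {
    "K'360": "GACTTATACTTTTCTTCATCTAGTAAGGCG",
    "K'362": "ATCTCGCCCACCGTATTATTTTCGGACACA",
    "L356": "ACAATGATAGAATAACCGTATATATCTGCT",
    "J319": "ACTCTCGATGAATCTCCTCTCCCATTTCCT",
    "D'311": "CAAGGTATCTGAAAGATGTATAAGAGCATC",
    "D'363": "CCCTTTTGTTTCGCCAGGGTCTTACCCTCT",
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_report(probes):
    """Map each probe name to (length, reverse complement)."""
    return {name: (len(seq), reverse_complement(seq))
            for name, seq in probes.items()}


if __name__ == "__main__":
    for name, (length, target) in probe_report(PROBES).items():
        print(f"{name}: {length} nt, pairs with 5'-{target}-3'")
```

A length other than 30 for any probe would point to a transcription error in the sequence rather than to a property of the probe itself.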


[Figure 1: map of the two ends of the ASF virus genome (scale 0 to 170 kb) showing the TIRs, the EcoRI fragments including K', L, J, and D', and the ORFs K'360, K'362, L356, J319, D'311, D'363, and D'42.]

FIG. 1. Arrangement of multigene family 360 in ASF virus DNA. Only the terminal regions of the ASF virus genome are shown. The EcoRI restriction fragments are named as described previously (2). Regions containing cross-hybridizing sequences which were fully sequenced are expanded. Positions of different ORFs in the sequences are indicated.

Genealogic tree construction. A genealogic tree based on the multiple alignment of amino acid sequences was derived by the method of Feng and Doolittle (10). This and alternative tree topologies were tested by both distance matrix programs (12) and the protein parsimony method (6, 11) with the programs of the PHYLIP phylogeny inference package (9).

RESULTS

Mapping of cross-hybridizing DNA sequences. Hybridization studies with cloned ASF virus DNA fragments revealed a complex pattern of repeated sequences in the proximities of the TIR at both ends of the ASF virus genome (1). One group of cross-hybridizing fragments included the restriction fragments RK', RL, RJ, and RD'.

To determine the precise location of these repeated sequences within the fragments, hybridization experiments were carried out with subfragments RJ and RD' (data not shown), and results indicated that the repeated sequence in fragment RJ (5.3 kb) was located in the rightmost 1.5 kb of the fragment. Repeated sequences in fragment RD' (10.7 kb) were found in the leftmost 1 kb of the fragment and in the 2.5 kb immediately adjacent to the TIR.

Sequencing of cross-hybridizing segments. To determine the nucleotide sequence of the repeated sequences, restriction fragments RK', RL, RJ, and RD' were sequenced in the regions indicated in Fig. 1. The total length of the sequence determined was 9,540 base pairs. This sequence was distributed in four noncontiguous stretches (Fig. 2). The G+C content was 32%, significantly lower than the 41% reported for ASF virus DNA (7). DNA sequences obtained from the central part of the genome of ASF virus have a higher G+C content than the terminal regions (C. Simón-Mateo, personal communication).
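The G+C content quoted above is the percentage of G and C residues among all bases of a sequence. A minimal sketch of that calculation follows; the function name and the decision to skip ambiguous characters such as N are assumptions made for illustration, not details taken from the paper.

```python
def gc_content(seq):
    """Return the percentage of G and C among the A/C/G/T bases of seq.

    Characters other than A, C, G, and T (for example N) are ignored,
    which is an assumption of this sketch.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy input, not a sequence from the paper.
    print(gc_content("ATGCATTTAA"))
```

Applied to the four sequenced stretches taken together, a calculation of this kind would be expected to return a value near the 32% reported in the text.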

Multigene family 360. A search for open reading frames (ORFs) showed the presence of six ORFs with lengths greater than 300 codons. Figure 1 shows the locations of these ORFs in the restriction map. Each ORF was named according to the restriction fragment in which it was contained and the length of the ORF in coding triplets.

Four ORFs were located in the left part of the genome, three of them (K'360, K'362, and L356) in the 5 kb adjacent to the left TIR (Fig. 2a). The right terminal EcoRI fragment (RD') contained two more ORFs (D'311 and D'363) and a small one (D'42) which is discussed below. ORFs located in the left part of the ASF virus restriction map were transcribed to the left, whereas ORFs located in the right part of the restriction map were transcribed to the right.

An additional ORF (177 codons in length) was detected in the region between K'362 and L356 and will be described elsewhere (A. Camacho and E. Viñuela, unpublished results).

Homology between ORFs. Figure 3 shows the result of a dot matrix analysis (16). A number of diagonal lines are evident, showing that these ORFs were homologous. ORF D'42 was clearly homologous to the 5' ends of the six remaining ORFs. The regions of similarity defined by the diagonal lines in the dot matrix were aligned by using the algorithm of Smith and Waterman (25). The degree of similarity varied from 52 to 81% (approximately). The percentage of similarity observed is in good agreement with the hybridization results (with the exception of the results from fragments RL and RD', which did not hybridize under stringent conditions), despite the 70.4% similarity between L356 and D'363. The observed internal inverted repetitions previously detected by electron microscopy at 2.5 and 0.4 kb from the left and right TIR, respectively (26), are in agreement with the positions of the reading frames K'362 and D'42. The high similarity of this repetition (81.3%) probably favored the appearance of the derived electron microscopy structures.
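A dot matrix comparison of the kind described above can be sketched in a few lines. The version below is a simplified illustration, not the COMPARE program itself: it assumes that the "window" and "stringency" given in the Fig. 3 legend (20 and 15) mean that a dot is placed wherever at least 15 of 20 aligned bases match. The function name is hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=20, stringency=15):
    """Return (i, j) window start positions that pass the match threshold.

    A dot is recorded when at least `stringency` of the `window` bases
    starting at seq_a[i] and seq_b[j] are identical. Runs of dots along a
    diagonal (i - j constant) indicate a region of similarity.
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(1 for k in range(window)
                          if seq_a[i + k] == seq_b[j + k])
            if matches >= stringency:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Toy sequences; a short window keeps the output readable.
    for dot in dot_matrix("ACGTACGT", "ACGTACGT", window=4, stringency=4):
        print(dot)
```

This brute-force loop costs on the order of len(seq_a) × len(seq_b) × window comparisons, which is acceptable for ORFs of about 1 kb but would need a sliding-count optimization for much longer sequences.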

Transcription of family 360. To determine whether all of the genes are expressed and, if so, the time of expression during infection, probes specific for each gene were required to avoid cross-hybridization with homologous transcripts. For this purpose, 30-nucleotide-long sequences which were unique for each gene were chosen, and the corresponding oligonucleotides were synthesized and used to probe Northern blots containing RNA from ASF virus-infected cells. Transcripts for genes K'360, K'362, J319, D'311, and D'363 were present at early times in infected cells (Fig. 4). Some of the probes hybridized to multiple species of RNA, showing the existence of heterogeneity of the transcripts. The lengths of the transcripts (1.4 to 1.5 kb) are in good agreement with the lengths of the ORFs (0.9 to 1.1 kb). We were not able to detect an RNA for gene L356. Transcripts for genes K'360, K'362, and D'363 were also present at late times, although with sizes different from those of the early transcripts.

Comparison of putative protein products. The predicted


[Figure 2, panel (a): sequence listing, nucleotides 1 to 3480, with the single-letter translations of ORF L356 and the first part of ORF K'362.]

FIG. 2. Nucleotide sequence of four cross-hybridizing segments. Predicted ORFs are shown with amino acids in single-letter code. The sequences correspond precisely with the expanded regions in Fig. 1. (a) Partial sequence of fragments RL and RK', displayed from right to left according to the restriction map. (b) Partial sequence of the RJ fragment, displayed from right to left according to the restriction map. (c) DNA sequence of the leftmost part of RD' fragment, displayed from left to right according to the restriction map. (d) Partial sequence of the RD' fragment adjacent to the right TIR, displayed from left to right. Sequence continues on the following pages.


[Figure 2, continued: sequence listing for panel (a), continuing to nucleotide 5010, with the remainder of ORF K'362 and ORF K'360; and for panel (b), nucleotides 1 to 1137, with ORF J319.]

FIG. 2-Continued.

amino acid sequences translated from the ORFs were aligned by the progressive alignment procedure of Feng and Doolittle (10). Two conserved stretches are apparent in this alignment (Fig. 5): GLWW at position 40 and LFTEWG at position 94. This, along with the striking conservation of some residues, probably reflects structural or functional constraints (or both) in the evolution of these sequences. The motif CXXLGA is found three times in the sequence and probably reflects an ancient internal duplication.

A search of the National Biomedical Research Foundation protein sequence data base was carried out by using the algorithm of Wilbur and Lipman (29) and the FASTA program (22). Both methods failed to identify any entry in the protein data bank with significant similarity to the sequences derived from the ORFs of family 360. Also, a search with the sequence motifs conserved among the sequences did not render any significant results.

Evolutionary relationships among different genes of multi-


0D311 M L S L Q T I A K M A V A T N T

CTAr,GACAAAGAAAG;TATATATAGCCAATAATTATTCACTAAATTATTTCATACTGATGGGTATG,GAGCCATGITGTCTCTIGCAGACAATCGCGAAAAT'GGCCGTAGCAACAAACAC 12 0

y S K Y H Y P I L K V F G L W W K N S T L N G P I K I C N H C NN I M VG E Y P

MNCY NH G MS L D I A L I R AV K ER N I S L VQ L FT ENWG GN I D YGA L

TATGTGTTACAATICATG,GAAT'GAGCCTGGATATAGCTTTGATTCGGGCGGTAAAAGAGCGCAATATATCCTTAGTCCAGCTTTTCACCGAATGGGGGGGAAATATTGACTATGiGGGCACT 360

C R N T P S M Q R L C K S L G A K P P K G R M Y M D A L I H L S GD T L NwAD NADACL 8

I R G Y E I F D D N S V L D C V N L I R L K I M L T L K A R I P L M E Q L D Q I

A LK QL L Q R YWYAM AV Q HN L TTA I HY F DN H I PN IKP F SLR C

TIGCCTTAAAACMACTTCTGCCAGCGATACT'GGTAT,GCCAT,GGCT,GTACAACACAACTTAACAkACAGCTATCCACTATTTTATAATCATATT'CCTAATATAAAGCCATTTAGTCTGCGCTG 72 0

A L YF N DPFK I H DA C RT VNM DP N EMM N I ACQ Q DL NFQ S I Y Y

TGCTTGTTTTAATATCCCTTTAAAATCCATG,ATGCTTCAGGACTGTAAATAT'GGATCCTAATG;AGATGATGAACATT,GCTTGTCAACAGGATTAAACTrrCAAAGCATTITACTA 840

S Y I L GA D IN Q AM L MS LK Y G NL S N M WFC I D L GA DA FK EA GATAGTATATTTTAGGGGCTGATATITAATCAGGCTATGCTAAT,GTCTTAAAGTATGGAATCTTTCTAATATGTGTTTTCATAGATTTGGGCGGATGCCTTTKAAAGAGAGCAGGGGC 960

GCTTA GGGAAAAAAAAAGAGTGTTACAGCACATATTAGGTCTAATATCTTAAGCGGGAGTTGATTCCCCCCTGTAAAGATCCTATCCTTATCAAATCCAAATTCTGTITAAAAA 1080

ACTACATTCTAAAAAATGTC 1100

D 363HM P S T L Q V L A K K

TTTAATAATTGATGACTAAAATCATATTATAATGCCGTGCAAAAAATAATTATTTTTGGTITAAAGGATACCTTAAATAAAAAAcGAT{;CATCcACTCTAcAAGTGCITGCTAAAAA 120

V L A L G E H K E N E H I S R E Y Y Y H I L K C C G L N W H E A P I I L C F D GG=CINGCCTAGGGAACATAAAGAAAATIGAACATATATCTAGAGAATATTATrTCTATTAAGG7C GTAD,GCATGAGGCTCCGATTATACTGTTGATGG 24 0

S E Q M M I K T P I F E E G I L L N T A L M K A V Q E N N Y E L I N L F T E W GGAGTGAGCAAATGATIGATAAAGACTCCAATlS=--lAAGAAGGCATATITACTTAATA=CAGCITAATGAAAG CTGTACAGGAGAACAATTGATAT,AlCl:TCACTGAATG=G 360

A N I N Y G L I S I N T E H A R D L C R K L G A K E M L E R N E V I Q I V F K TAGCAAACATCAATTATIGGATTAATTTCATTAATACTAGCATGCCCGGGACCTAT'GTCGAAAATTAGGAGCTAAAGAAATGCTTAAAGAAATGAAGTTATACAA,ATTGAMAAAAAC 48 0

L D D I T S S N I I L C H E L F T N N P L L E N V N M G E M R M I I H W R M K NATTAGATGATATCACCAGTAGTAATATAATTTTATGTCATAATITATTACCAACAATCCTCTTTTAGAGAATGTAAATAIrGGGGGAAATGAGGATIGATAATTCATTIGGAGGATGAAAAA 60 0

L T N L L L N N D S I S E I L T K F W Y G I A V K Y N L K D A I Q Y F Y Q R F MTIAACGAACCTATITATTAAATAATGACTCTATITAGTGAAATATTAACTAAATITCTGGTATGGTATAGCAGTAAAATATAATCTTAAGGATGCAATCCAATATTTTACCAGAGATTCAT 72 0

D F N E W R V T C A L S F N N V N D L H K M Y I T E K V H T N N D E M M N L A CGGACTrCAACGAGT~GGGAGTAACAT IcITIT AATAATGTGAATrGATCTTCATAAGA13T'GAAACAGAGAAGGTITCATACGAATAATGACGAAATGATIGAATCTAGCCTG 840

S I Q D R N L 5 T I Y Y C F L L G A N I N Q A M L T S V L N Y N I F N L F F C ICAGCATTCAAGACAGAAATTAmACAACCA mAIrA IAT11 GCTAACATCAATCAAGCAATGTTACCTCAGATATTAATTACTTClT 960

D L G A D A F E E G K T L A K Q K G Y N E I V E I L S L D I I Y S P N T D F S SAGACTTADGGGGCTGATGCCTTTiGAAGAGGGTAAGACCCTGGAACAAAGGAATGAAACA IGAAATCTTATCATTAATACATTTATAGTCCAAATACTGACTTCTC-lATC 1080

K I E P E H I S S L L K N F Y P K N L F A F D R C N P G L Y Y SAAAAATAGAACCTGAACATATTAGTTClTTTGTAAAAAACTTTATC CAAAAAATCTGTTCGCTTTTGATCGTTGCAATCClWMATATTAI-L-lAGAGGACCGCTACAAAAATTAT 1200

-TI-I-I-.-vLii,TCAAAGCTCCAAAATAATTAT rAGATTAAAGTCGCCTATAGCAGCTGCCCACTCClAAAAAGTATTTTATAGTACAAAAAACACGAAAAATAATTTGCGGCCGGCG 1320

GCAACTATGTGrI AACTTAAT17AIIATA'=AA,CAACCATGGATTGTGACATCAGGGAGAAGAACTATAGCTACATCATATT GTCAATACTGGTAATA 1440

CTATTAATATGGTATCTTATACTTAACTATTCTCGATCGAAAATGCAGTTACAAACAACATGCCGCCACCAGCGTACACGGTGTCAAGTAGC-lU-lT-lAATAATAGGGTTGATCG 1560

A GA'AAATCGGGAACTATCCGITAATATATC;TTGATTGACGCCCACTTATGAATGGAAGTAATGATrAAATCGGGT 1680

:IaGCTAGGGCl AAACAAAGCGTGTATITTAAGGCCTATAGCAAGAGTAIG'TT AATM ACACCTACAACAGTAATATTTAAGGCCAGTAAAATAATGTTAA 1800

D 42 M P T P L S L Q A P A K K V L A T Q H I S K D H L Y7TAAGGCCTGACCACTAAAACTTAAACGATTTTGTAAAAAAA TATIG CCTACTCCAC= ACAGGCTCCCGCTAAAA AAGTACTGGCCACACAGCACATATTAGTACrA 1920

F E I L W F M V A F F D A Y S LGAATTTTGTTATGTGATTCTITATCGCCTACAGTCTTTAACGCCTGCAGTAATAATTGATATCTCCAGCG TAATATATACCCACAGCGGTAliTAATAATTG 20a4 0

ATCCCACAGCGTAT-CCAGC TGCGCCCC AAAAAAGTATTTC TWCALTGrGGTATACCGGCGGCGTAACACCAGrrATGGTTll TGGCCGCCGCCCAGCCG 2160

CAAAAAATCAATTACAGCCGCAAAAAAATATTTCCGGCCGcGGAGTllAcAAAAAAAATTAGCT=ATTGrAGAccAGGTcrGA GGCTTATCCTTTT 2 280

CTCGAATAAAAAA 2293

FIG. 2-Continued.


[FIG. 3 graphic: dot matrix panels comparing ORFs K'360, K'362, L356, J319, D'311, D'363, and D'42.]

FIG. 3. Dot matrix comparison of ORFs. The sense strand of each ORF was compared with all remaining ORFs with a window comparison of 20 and a stringency of 15. The percent identity of each diagonal was obtained by aligning the involved DNA segments by using the algorithm of Smith and Waterman (25).
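The window and stringency rule named in the Fig. 3 caption can be sketched as follows. This is only a minimal illustration, assuming that a dot is recorded wherever at least `stringency` of `window` consecutive positions are identical; the function name `dot_matrix` and the toy sequences are hypothetical and are not taken from the program used to draw Fig. 3.

```python
def dot_matrix(seq_a, seq_b, window=20, stringency=15):
    """Return (i, j) start positions where two windows share enough identical bases.

    Assumption: a dot is placed when at least `stringency` of the `window`
    aligned positions match. The defaults follow the Fig. 3 caption (20 and 15).
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Toy example (hypothetical sequence): an 8-base string compared with itself,
# using a small window so the result is easy to inspect by hand.
toy_hits = dot_matrix("ACGTACGT", "ACGTACGT", window=4, stringency=4)
print(toy_hits)
```

Runs of hits along a diagonal correspond to the diagonals seen in a dot matrix plot. The brute-force double loop is quadratic in sequence length, which is acceptable for ORFs of roughly 1 kb.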

gene family 360. Table 1 shows the percent identity derived from the multiple alignment shown in Fig. 5. Pairs K'360-K'362 and L356-D'363 were the ones most related. This can also be seen at the DNA level (Fig. 3).


TABLE 1. Percent identity between predicted protein products derived from multiple alignment

                   % Identity with:
Protein product    K'362    L356    J319    D'311    D'363
K'360              61.8     55.8    44.4    33.5     48.3
K'362                       57.1    47.6    34.8     52.8
L356                                48.1    36.1     61.4
J319                                        37.5     45.1
D'311                                                37.2
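A percent identity such as those in Table 1 can be computed from two rows of a multiple alignment roughly as sketched below. The paper does not state how gapped columns were treated, so this sketch assumes that columns containing a gap in either sequence are excluded from the count; the function name `percent_identity` and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues between two equal-length aligned sequences.

    Assumption: columns with a gap in either sequence are skipped, so the
    denominator is the number of columns where both sequences have a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy example (hypothetical aligned fragments, not taken from Fig. 5).
print(percent_identity("ACDE", "ACDF"))
```

Other conventions, such as dividing by the length of the shorter sequence, would give somewhat different values for the same alignment.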

Figure 6 shows a genealogic tree derived from the multiple alignment of the predicted protein products of each gene. Clearly, there was no grouping of the sequences according to their location at the left or right part of the genome. Instead, the branching pattern of the tree suggests that genes located at opposite positions in the restriction map have a common origin. To confirm this further, we used the sequence of the ORF D'42, which was not included in the genealogic tree because of its very different size. Pairwise alignments of D'42 with all the other protein sequences clearly showed that D'42 is very similar to K'362 (data not shown). This relationship can also be seen in the dot matrix comparison of the respective DNA segments (Fig. 3).
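The lack of grouping by genome side can also be checked directly from the Table 1 values, as in the sketch below. It uses percent identities rather than the evolutionary distances of Fig. 6, so it is an illustration and not a reconstruction of the tree. The names `IDENTITY`, `SIDE`, `identity`, and `nearest_relative` are hypothetical; the left or right assignment follows the restriction fragments named in the paper (RK', RL, and RJ on the left, RD' on the right).

```python
# Percent identities copied from Table 1 (upper triangle only).
IDENTITY = {
    ("K'360", "K'362"): 61.8, ("K'360", "L356"): 55.8, ("K'360", "J319"): 44.4,
    ("K'360", "D'311"): 33.5, ("K'360", "D'363"): 48.3,
    ("K'362", "L356"): 57.1, ("K'362", "J319"): 47.6,
    ("K'362", "D'311"): 34.8, ("K'362", "D'363"): 52.8,
    ("L356", "J319"): 48.1, ("L356", "D'311"): 36.1, ("L356", "D'363"): 61.4,
    ("J319", "D'311"): 37.5, ("J319", "D'363"): 45.1,
    ("D'311", "D'363"): 37.2,
}

# Genome side of each ORF, inferred from the restriction fragment in its name.
SIDE = {
    "K'360": "left", "K'362": "left", "L356": "left", "J319": "left",
    "D'311": "right", "D'363": "right",
}


def identity(a, b):
    """Look up the Table 1 value for an unordered pair of genes."""
    return IDENTITY[(a, b)] if (a, b) in IDENTITY else IDENTITY[(b, a)]


def nearest_relative(gene):
    """Return the family member with the highest percent identity to `gene`."""
    others = [g for g in SIDE if g != gene]
    return max(others, key=lambda g: identity(gene, g))


for gene in SIDE:
    best = nearest_relative(gene)
    same_side = SIDE[gene] == SIDE[best]
    print(gene, "->", best, identity(gene, best), "same side" if same_side else "opposite side")
```

With these values, the closest relative of each right-end gene lies at the left end of the genome, which is consistent with the statement that genes at opposite positions share a common origin.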

DISCUSSION

A search for repetitive sequences in ASF virus DNA has led us to determine the nucleotide sequence of several regions located close to the ends of the DNA molecule. The analysis of these sequences indicated that the repetitions were related to six genes with lengths of 311 to 363 coding triplets. These genes were homologous, as is shown by comparisons both at the DNA level (Fig. 3) and at the

FIG. 4. Transcription of the members of multigene family 360. RNA from infected cells at early (lanes 1) or late (lanes 2) times was hybridized with oligonucleotide probes specific for each gene. Sizes are indicated in kilobases. V, RNA from uninfected Vero cells. The map symbols mark the positions of the different ORFs in the sequences and the locations of the sequences of each probe within the genes.


70
K'362 MSTPL ALA KIL TQH I SKNHY F K YC HGA X MFSTNEDNQLMIKSAIFK
K'360 MPST ALT KVL TQPVFKDDYC mERC HEA TTIHHTCIDKQILIKTASFK
L356  MQPST ALA RAL TQHVSKDDY Y iERC HEA SIY IDDDNQIMIRTLCFK
D'363 MPST VLA KVL LGEHKENEHISREYYY KCC HEA ILCFDGSEQMMIKTPIFE
J319  MLS TLA KAV KQSVPEEY HY KYC QNK iSLCHYC NYVILSSTPFK
D'311 MLS TIA MAB tNI'YSKYlHYP KVF KNSTLNG KICNHCNNIMVGEYPMCYN
D'42  MPTPLS APA KVt. lQl I FKDJH1.YFEL1W FMVAFFE)AVFL

140
K'362 DC LE MNL MK VQ N YDLIE AD NSSLN/TVNTCHTWNF RE K'KILNEMDIVQIFYKIH
K'360 HG LT NV MK VQ N HGLIE AD SFGLVTVNMECTQDL QK 'RKALSENKILEIFYNVQ
L356  EG Ik MNTVL VK N EDLIM AN NYGLLFINNEHTRNL RK 'KEELETSEILRFFFETK
D'363 EG IL NT' MK VQ N YELIN AN NYGLISINTEHARDL RK KEMLERNEVIQIVFKTL
J319  GELLH DV IM IK N YDVIR AN YYGLTCARTEQTQEL RK 'KDGLNNKEI FAGLM
D'311 HG M DI IR VK R ISLVQ GN DYGALCRNTPSMQRL KS 'KPP KGRMYMDALIHLS

210
K'362 RIKTSSNI LCHKLLSN PLF Q IEELKI IICCFLEKISINFI NEITLNEMLARL SM VRYH TE
K'360 YVKTSSNI LCHELLSD PLF L NAQLKLRIFGELDTLSINFT DNISFNEMLTRYSM ILYK TE
L356  CKITSSNV LCHELFSN PFLQNV MVDLRM IIYWELKDI-PTNSM NEISFSEMLTKY GI VKYN KE
D'363 DDITSSNI LCHELFTN PLLENNMGEMRMIIHWRMKNL TNLL NNDSISEILTKF GI VKYN KD
J319  RHKTSNNI LCHEIFDK PMLEAL VQEMGEEIHRELK LFIFYI DNVPM NIFVKY AI VKYK KR
D'311 DTLNDNDL RGYEIFDD SVLDCV LIRLKI MLTLKAR IPLMEQ DQIALKQLLQRYCAM VQHN TT

280
K'362 2QY YQRYRHFKDWRLI GLSF NVSDL E IYHIKKVDMNIDE YL' MR S FLT F CFV NI
K'360 *QY YQPYSHFKDWRLI GVA NVFDL E IYNKEKTNIDIDE QL' MYI YTT 1lYCCM DI
L356  EQ CQEYRHFDEWRLI ALSF NVFDL E ICNTTKIHMSINK EL' MRIN FLTDCFA NA
D'363 QYYQRFMDFNEWRVT ALSF NVNDl KMYITEKVHTNNDEMNL' SIQIR LS1TfYCFL NI
J319  aFF YQTYGHLSMWRLM AIYF NVFDL E IYEQKIVHMDIDK QL' MQC FLTUYCFVDI
D'311 XHV DNHIPNIKPFSLR ALYF DPFKI DAC RTVNMDPNE NI' QQ L FQS3 SYI DI

350
K'362 NR MVTSVKNFYTN LFF IEGANAFEESLELAKQKNHDILVEILSFKDFYNSNVSLLSLKTTDPEK N
K'360 NR MITSVMNFCEG LFL M GADAFEESMEIASQTNNWILINILLFKN YSPDSSLLSIKTTDPEK N
L356  NR MLISVKNFCIE MFF M GANVIEHSKTLADIYGYSIIVNILSLKI YKANPILLS KETNPEK N
D'363 NQ MLTSVLNYNIF LFF I GADAFEEGKTLAKQKGYNEIvEILSLDIIySPNTDFSS KIEPEH S
J319  DQ ITVIQWHYHTN LYF KE KDLKQNT LTARPLLLP NITDPKK Y
D'311 N MLMSLKYGNLS MWF I G.ADAFKEAGALAGKK KKSVTAH R

381
K'362 ALL KNYRSKNIMRYKKLCPK I IRWARF II
K'360 ALLDEEKYKSKNMLIYEE SLFHIYGVNI
L356  TLL KNYYSKNMLAYD ICCIDNYL
D'363 SLL KNFYPKNLFAFD RCNPGLYYS
J319  TML KNYLPTSSNSL
D'311 S

FIG. 5. Multiple alignment of predicted protein products for multigene family 360. Residues which are conserved among all the proteins of multigene family 360 are boxed in black.

protein level (Fig. 5 and Table 1). In addition, a small ORF (D'42) homologous to the 5' end of the remaining members of the family was present. The sequences obtained account for both the cross-hybridization between restriction fragments RK', RL, RJ, and RD' (1) and the previously reported internal inverted repetitions (26).

FIG. 6. Genealogic tree for proteins of multigene family 360. Tree construction was carried out as described in Materials and Methods. Evolutionary distance for each branch is indicated.

The hybridization of RNA of infected cells with specific probes for each gene indicated that most members of multigene family 360 are transcribed during infection, although to different extents (Fig. 4). In addition, the different sizes and intensities of the bands obtained after hybridization of RNA isolated at late times suggest that early and late transcription are independently controlled for each gene.

Comparison of the putative protein products of the genes


(Fig. 5) revealed a striking conservation of some amino acid stretches. This suggests the existence of selective pressure to maintain a common overall structure in each ORF of the family since the time of their divergence from a common ancestor.

The arrangement of the genes in the viral genome as well as their transcription direction is compatible with the hypothesis that they once constituted an enlarged TIR. It has been well documented that poxvirus genomes undergo deletions at one end and duplication of sequences from the other end, thus generating TIR of larger sizes (3, 8, 14, 20, 23). The evolutionary relationships among members of multigene family 360 also support the idea that duplication and translocation of sequences between ASF virus DNA ends have occurred.

A simplified model for the evolution of family 360 (Fig. 7) implies gene duplication within one DNA end and translocation of sequences between both DNA ends. Also, a single deletion event accounts for the presence of the truncated ORF D'42. This model may be rather simplistic, since the larger evolutionary distance between D'311 and J319, compared with the evolutionary distance between L356 and D'363, indicates that multiple translocation events may have taken place.

Multigene families have been described before for other large DNA-containing viruses such as human cytomegalovirus (28) and Shope fibroma virus (27). The multigene family of Shope fibroma virus is organized similarly to multigene family 360 in ASF virus, since its members are located near the ends of the genome and are transcribed towards the DNA ends. In fact, part of the multigene family in Shope fibroma virus is contained within the inverted terminal repetitions and therefore could represent one intermediate step in our evolutionary model (Fig. 7).

The role of multigene family 360 in ASF virus biology is unknown. Experiments to detect the protein products in infected cells are in progress.

FIG. 7. Model for evolution of multigene family 360. Several steps in the evolution are shown schematically. Arrows represent genes, and parentheses indicate deletion events. The first steps involve gene duplication at one DNA end, followed by sequence divergence of duplicated genes. In a second step, translocation of sequences from one DNA end to the opposite one would take place. Present genes and the evolutionary relationships among them can be accounted for by additional sequence divergence and a single deletion event.

ACKNOWLEDGMENTS

We are grateful to R. F. Doolittle and J. Felsenstein for providing computer programs for genealogic analysis. We thank J. A. Baquedano for advice in designing computer graphic programs and R. Sanchez and A. Zazo for technical assistance.

This work was supported by grants from the Comisión Interministerial de Ciencia y Tecnología, European Economic Community, and Fundación Ramón Areces.

LITERATURE CITED

1. Almendral, J. M., F. Almazán, R. Blasco, and E. Viñuela. 1990. Multigene families in African swine fever virus: family 110. J. Virol. 64:2064-2072.

2. Almendral, J. M., R. Blasco, V. Ley, A. Beloso, A. Talavera, and E. Viñuela. 1984. Restriction site map of African swine fever virus DNA. Virology 133:258-270.

3. Archard, L. C., M. Mackett, D. E. Barnes, and K. R. Dumbell. 1984. The genome structure of cowpox virus white pock variants. J. Gen. Virol. 65:875-886.

4. Blasco, R., M. Aguero, J. M. Almendral, and E. Viñuela. 1989. Variable and constant regions in African swine fever virus DNA. Virology 168:330-338.

5. Devereux, J., P. Haeberli, and O. Smithies. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12:387-395.

6. Eck, R. V., and M. O. Dayhoff. 1966. Atlas of protein sequence and structure. National Biomedical Research Foundation, Silver Spring, Md.

7. Enjuanes, L., A. L. Carrascosa, M. A. Moreno, and E. Viñuela. 1976. Titration of African swine fever (ASF) virus. J. Gen. Virol. 32:471-477.

8. Esposito, J. J., C. D. Cabradilla, J. H. Nakano, and J. F. Obijeski. 1981. Intragenomic sequence transposition in monkeypox virus. Virology 109:231-243.

9. Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.

10. Feng, D. F., and R. F. Doolittle. 1987. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25:351-360.

11. Fitch, W. M. 1971. Toward defining the course of evolution: minimum change for a specified tree topology. Syst. Zool. 20:406-416.

12. Fitch, W. M., and E. Margoliash. 1967. Construction of phylogenetic trees. Science 155:279-284.

13. Gonzalez, A., A. Talavera, J. M. Almendral, and E. Viñuela. 1986. Hairpin loop structure of African swine fever virus DNA. Nucleic Acids Res. 14:6835-6844.

14. Kotwal, G. J., and B. Moss. 1988. Analysis of large cluster of nonessential genes deleted from a vaccinia virus terminal transposition mutant. Virology 167:524-537.

15. Ley, V., J. M. Almendral, P. Carbonero, A. Beloso, E. Viñuela, and A. Talavera. 1984. Molecular cloning of African swine fever virus DNA. Virology 133:249-257.

16. Maizel, J. V., and R. P. Lenk. 1981. Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc. Natl. Acad. Sci. USA 78:7665-7669.

17. Maxam, A. M., and W. Gilbert. 1977. A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74:560-564.

18. Messing, J. 1983. New M13 vectors for cloning. Methods Enzymol. 101:20-78.

19. Messing, J., and J. Vieira. 1982. The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene 19:259-268.

20. Moyer, R. W., R. L. Graves, and C. T. Rothe. 1980. The white pock (μ) mutants of rabbit poxvirus. III. Terminal DNA sequence duplication and transposition in rabbit poxvirus. Cell 22:545-553.

21. Needleman, S. B., and C. D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443-453.

22. Pearson, W. R., and D. J. Lipman. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85:2444-2448.

23. Pickup, D. J., B. S. Ink, B. L. Parsons, W. Hu, and W. J. Joklik. 1984. Spontaneous deletions and duplications of sequences in the genome of cowpox virus. Proc. Natl. Acad. Sci. USA 81:6817-6821.

24. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

25. Smith, T. F., and M. S. Waterman. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195-197.

26. Sogo, J. M., J. M. Almendral, A. Talavera, and E. Viñuela. 1984. Terminal and internal inverted repetitions in African swine fever virus DNA. Virology 133:271-275.

27. Upton, C., and G. McFadden. 1986. Tumorigenic poxviruses: analysis of viral DNA sequences implicated in the tumorigenicity of Shope fibroma virus and malignant rabbit virus. Virology 152:308-321.

28. Weston, K., and B. G. Barrell. 1986. Sequence of the short unique region, short repeats, and part of the long repeats of human cytomegalovirus. J. Mol. Biol. 192:177-208.

29. Wilbur, W. J., and D. J. Lipman. 1983. Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80:726-730.