

Indian Journal of Chemistry Vol. 41B, November 2002, pp. 2346-2356

Information-bearing base-pair sets involving hydrogen-bonded nitrogen heterocycles: A theoretical modelling study

Deborah L Buam & R H Duncan Lyngdoh*

Department of Chemistry, North Eastern Hill University, Shillong 793 022, India

Received 10 December 2001; accepted (revised) 11 June 2002

This study represents a search for information-bearing sets of hydrogen-bonded base pairs constructed from nitrogenous heterocyclic bases. The PM3 SCF-MO theoretical method is used here to model 36 different H-bonded base pairs, built up from substituted pyridine, pyrimidine and pyrazine bases and grouped into numerous sets. Out of these, only six sets are predicted to successfully reproduce the information-bearing characteristics of the DNA base pair set. These are termed as "DNA base pair mimic sets", providing clues to the design of synthetic information-bearing macromolecules.

Biological macromolecules like DNA, mRNA and proteins all bear information, encoded or expressed. Information-bearing macromolecules may also be artificially designed or synthesised, yielding man-made alternatives which could have useful applications in this age of information technology. In the world of life, DNA is the primary information-bearing macromolecule. This study takes clues from its structure and function to initiate the design of other alternatives. Using semi-empirical MO theory, we here conduct a search for well-defined sets of hydrogen-bonded nitrogenous base pairs which mimic the Watson-Crick base-pairs of DNA in their information-bearing and self-replicating aspects. These unique aspects arise from the hydrogen-bonded complementary base-pairing¹,² which provides the structural and functional basis for DNA. The two Watson-Crick base pairs, A:T and G:C, constitute a definite set possessing the following distinctive features associated with their information-bearing and self-replicating roles.

(a) The component bases are aromatic nitrogenous heterocycles which hydrogen-bond among themselves forming base pairs.

(b) All pairs in the set possess similar pairing configurations, retained even upon reversal of the pair, e.g. G:C to C:G.

(c) Within this configuration, each base can pair only with its single complementary base, and not with any other component base.

(d) Each pair contains at least two hydrogen bonds for maintaining fixedness in configuration (less likely for base pairs bonded by a single H-bond).

In this paper, any set of different bases which fulfils the above criteria is designated here as a "DNA base pair mimic set", or simply a "DNA mimic set". Such a set where the two constituent bases belong to the same general type of molecular ring system may be termed a "Type I DNA mimic set". When the constituent bases of a pair belong to different molecular ring systems, the set is termed a "Type II DNA mimic set". The set of the four DNA bases constitutes a Type II set, purines pairing only with pyrimidines and vice versa. This theoretical modelling study seeks to design such sets of nitrogenous bases which qualitatively fulfil the essential criteria for a DNA mimic set (whether Type I or II). This is a prelude to the design of artificial information-bearing macro-duplexes which possess, at least in principle, the information-bearing and self-replicating characteristics of natural DNA. While an effective backbone for the macromolecule is yet to be designed, we assume tentatively that even a deoxyribose phosphate backbone may function adequately, since the backbone structure would not affect the information-bearing and self-replicating character of the putative macro-duplex.

We must distinguish between "DNA base analogues" and "DNA base pair mimic sets". DNA base analogues are molecules whose topography and hydrogen-bonding properties resemble those of normal DNA bases, thus allowing for comfortable accommodation into the DNA double-helical configuration when substituting for the normal bases. The term "DNA base pair mimic set" is here used to define a set of bases which pair among themselves in a specific and non-ambiguous manner such as to reproduce the essential qualitative features of the DNA base pairs outlined above. The actual configuration of the mimic pairs, of course, may be quite dissimilar to that present in DNA.

An earlier paper³ focussed on how H-bonded self-association of nitrogenous bases could furnish a basis for designing information-bearing duplexes. In this study we allow for hetero-associative pairing as well. Three different nitrogenous ring systems are studied here, viz., the pyridine, pyrimidine and pyrazine families. H-bonded pairing between different bases of the same family (pyridine-pyridine, pyrimidine-pyrimidine and pyrazine-pyrazine pairing schemes) can lead to Type I DNA mimic sets. Pairing between bases of different families (pyridine-pyrimidine, pyrimidine-pyrazine and pyrazine-pyridine) would lead to Type II DNA mimic sets. N-glycoside formation is not yet explicitly considered here, the would-be glycoside N-C linkage being replaced simply by an N-H bond.

Most theoretical work on hydrogen-bonded base pairing between nitrogenous bases has restricted itself to the study of DNA and RNA systems, the Watson-Crick base pairs, A:T and G:C, receiving maximum attention⁴⁻¹⁰. Chemical modification of these pairs has also been studied in the context of mutagenesis and carcinogenesis¹¹⁻¹⁴. Incorporation of backbones other than ribofuranose phosphate (like pyranose phosphate) has also been modelled¹⁵⁻¹⁷. Different types of nucleic acids (like homo-DNA, p-RNA, Z-DNA and B-DNA) have been studied experimentally and theoretically. However, the H-bonded pairing properties of aromatic nitrogenous bases other than the natural nucleic acid bases are still relatively unstudied theoretically, especially in the context of designing novel information-bearing macromolecules.

This study also reports various structural factors affecting the facility of hydrogen-bonded base-pairing. We suggest that H-bond linearity and participation of oxygen would enhance H-bonding, while H-bond non-linearity and participation of fluorine would diminish H-bonding capacity here.

Theoretical Methodology

Structural elements associated with H-bonding in nucleic acid bases include the amino and carbonyl groups and ring nitrogens. Other non-biological possibilities include the N-oxide and halide groups. Such elements were incorporated into the structures of various pyridines, pyrimidines and pyrazines to create novel hydrogen-bonded pairing situations. Simple planar cut-outs served initially as crude models to explore pairing possibilities, using conceptual aids like steric crowding, the distance requirements for hydrogen-bonding, and repulsive close-range interactions between electronegative atoms or between hydrogen atoms. Certain well-defined base pairing motifs were observed to arise out of these initial trials.

For designing Type I DNA mimic sets, carefully chosen sets of n different bases each, all of a single ring system type, were paired exhaustively among themselves within the constraints of a single appropriately chosen pairing configuration. All the ⁿC₂ possible different pairs were screened to disallow any pairs incorporating steric crowding, close-range lone-pair repulsions or hydrogen-hydrogen interactions, the allowed pairing motifs being retained. For designing Type II DNA mimic sets, n₁ bases of one ring type were exhaustively paired with n₂ bases of another ring type, leading to n₁n₂ different pairs, all again subjected to some particular configurational constraint adopted for the set. Screening of these n₁n₂ pairs led to retention of a few allowed pairing motifs not possessing the disqualifying factors described above. These sketched pairing schemes assumed co-planarity of the bases in each pair, which may be subject to deviation when actually worked out by theoretical calculations.
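The exhaustive pairing step described above can be illustrated with a minimal sketch. The base labels and function names below are hypothetical and are not taken from the paper. Whether self-pairs are counted is an assumption exposed as a flag: without them the count is ⁿC₂, and with them it is n(n+1)/2.

```python
from itertools import combinations, combinations_with_replacement, product

def type1_pairs(bases, include_self=False):
    """Candidate pairs within one ring family (Type I).

    include_self=False gives the nC2 distinct hetero-pairs;
    include_self=True also counts self-associative pairs (an assumption).
    """
    chooser = combinations_with_replacement if include_self else combinations
    return list(chooser(bases, 2))

def type2_pairs(bases_a, bases_b):
    """Candidate pairs between two ring families (Type II): n1 * n2 pairs."""
    return list(product(bases_a, bases_b))

# Hypothetical labels standing in for four pyrimidines and two pyridines.
pyrimidines = ["Pm1", "Pm2", "Pm3", "Pm4"]
pyridines = ["Py1", "Py2"]

print(len(type1_pairs(pyrimidines)))                     # 6
print(len(type1_pairs(pyrimidines, include_self=True)))  # 10
print(len(type2_pairs(pyridines, pyrimidines[:2])))      # 4
```

Each candidate pair produced this way would then still have to pass the steric and repulsion screening, which is a manual, structure-based step in the study and is not modelled here.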

All component bases and their allowed pairing motifs were subjected to molecular orbital calculations using the semi-empirical PM3 SCF-MO method¹⁸ and unconstrained geometry optimisation by the Davidon-Fletcher-Powell algorithm¹⁹,²⁰. The PM3 method is useful and reliable among semi-empirical methods for the study of H-bonded interactions between nitrogenous bases like DNA bases²¹,²². The PM3 data for hydrogen-bonding geometry and facility as well as the intermolecular configuration of each pair studied are defined below.

Within any pair, a hydrogen bond between a hydrogen atom H (covalently attached to an electronegative atom A of one base) and an electronegative atom B of the other base may be represented as A-H...B. H-bond geometry is defined here in terms of Rhb (the distance between atoms H and B), Rab (the distance between the two electronegative atoms A and B), and θhb (the angle of the hydrogen bond A-H...B). Similar definitions apply to an H-bond represented as A...H-B. Atom numbering follows the convention of designating the would-be N-glycoside nitrogen as 1, each exocyclic group and its hydrogens being numbered according to the ring atom to which it is attached. An H-bond is identified in terms of the atoms actually H-bonded together, viz., H and B for the A-H...B case, or A and H for the A...H-B case. Each H-bond is reckoned starting from the base on the left going to the base on the right as represented later in Figures 1-7.
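The three H-bond descriptors defined above follow directly from Cartesian coordinates of the atoms A, H and B. The sketch below is illustrative only: the function name and the example coordinates are assumptions, and θhb is taken as the angle at H between the H-A and H-B directions.

```python
import math

def hbond_geometry(a, h, b):
    """Return (Rhb, Rab, theta_hb) for a hydrogen bond A-H...B.

    a, h, b are (x, y, z) coordinates in angstroms.
    Rhb is the H...B distance, Rab the A...B distance, and
    theta_hb the A-H...B angle in degrees (180 for a linear bond).
    """
    r_hb = math.dist(h, b)
    r_ab = math.dist(a, b)
    ha = [ai - hi for ai, hi in zip(a, h)]   # vector H -> A
    hb = [bi - hi for bi, hi in zip(b, h)]   # vector H -> B
    cos_t = sum(x * y for x, y in zip(ha, hb)) / (math.hypot(*ha) * math.hypot(*hb))
    cos_t = max(-1.0, min(1.0, cos_t))       # guard against rounding error
    return r_hb, r_ab, math.degrees(math.acos(cos_t))

# Hypothetical, perfectly linear N-H...O arrangement along the x axis.
print(hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
```

For a bent bond the same call returns an angle below 180°, which is the non-linearity discussed for several pairs in the tables.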

The pairing enthalpy Ep is calculated from the PM3 heats of formation of the pair and its component bases. We propose that participation of oxygen in H-bonding, shorter H-bond lengths and H-bond linearity (θhb close to 180°) would enhance H-bonding.

Three configurational markers define the configuration of a pair, viz., (i) Rnn, the distance between the nitrogens of each base which would form the glycoside bonds, (ii) the angles θ1 and θ2 between the Rnn vector and the two would-be glycoside bonds, and (iii) φnh, the dihedral angle incorporating the two would-be glycoside bonds through the N-N vector. These are portrayed (Figure 1) for a pyridine-pyrimidine pair being reckoned from left to right. Desired configurational criteria for a DNA base-pair mimic set (whether Type I or II) are: (a) similar values of Rnn for all pairs, (b) similar values of θ1 and θ2 within all pairs to allow for reversibility of each pair, and (c) values of φnh near zero or 180°, indicating co-planarity of the two base components, as generally occurs for Watson-Crick pairs.
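The configurational markers can be computed from four points: the two would-be glycosidic nitrogens and the atom at the far end of each would-be glycoside bond (an N-H hydrogen in this study). The sketch below is one possible reading, not the authors' code: taking θ1 and θ2 as the bond angles X1-N1-N2 and X2-N2-N1, and φnh as the unsigned dihedral X1-N1-N2-X2, are assumptions, as are all names and coordinates.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(_dot(u, u))

def _angle(p, q, r):
    """Angle p-q-r in degrees."""
    u, v = _sub(p, q), _sub(r, q)
    c = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def _dihedral(p0, p1, p2, p3):
    """Signed dihedral p0-p1-p2-p3 in degrees."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    n = _norm(b1)
    b1 = tuple(x / n for x in b1)
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))  # b0 projected off the axis
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))  # b2 projected off the axis
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def configuration_markers(x1, n1, n2, x2):
    """Return (Rnn, theta1, theta2, phi_nh) for one base pair."""
    r_nn = _norm(_sub(n2, n1))
    return r_nn, _angle(x1, n1, n2), _angle(x2, n2, n1), abs(_dihedral(x1, n1, n2, x2))

# Hypothetical planar arrangement with both would-be bonds on the same side.
print(configuration_markers((-1.0, 1.0, 0.0), (0.0, 0.0, 0.0),
                            (8.0, 0.0, 0.0), (9.0, 1.0, 0.0)))
```

With this convention a planar pair gives φnh of 0° or 180° depending on whether the two would-be bonds lie on the same or opposite sides of the N-N axis, which matches criterion (c) above.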

Figure 1 - Configurational markers Rnn, θ1, θ2 and φnh illustrated for a pyrimidine-pyrimidine pair

Results and Discussion

Initial arbitrary design led to the selection of 35 different nitrogenous bases (monomers) for study here (9 pyridines, 11 pyrimidines and 15 pyrazines), which were paired among themselves giving a total of 35 unique base pairs which survived the screening process. The 15 Type I pairs were grouped into 5 sets, each set having its own distinct configuration, plus miscellaneous sets of 2 pyridine-pyridine pairs and 2 pyrimidine-pyrimidine pairs. Likewise, the 20 Type II pairs comprised 5 distinct configurational sets, plus a miscellaneous set of 3 pyridine-pyrimidine pairs. The miscellaneous configurations were studied here purely for comparison.

Figure 2 - Configurational schemes for pyridine-pyridine pairs (A1, A2)

Figure 3 - Configurational schemes for pyrimidine-pyrimidine pairs (Set 1: B1, B2; Set 2: B3, B4; Miscellaneous: B5, B6)

We assume the base pair configuration itself serves as a major determinant of structure and stability for the duplex macropolymer, occurring if the backbone is flexible enough. Stacking interactions and base sequence are deemed here as of secondary importance. Since all pairs are studied without explicit incorporation of the N-glycoside bonds, these bonds are represented in Figures 1-7 simply by a pointed arrow.

Search for Type I DNA mimic sets

These include the pyridine-pyridine, the pyrimidine-pyrimidine and the pyrazine-pyrazine pairs, discussed below.

Pyridine-pyridine pairs. The pairs A1 and A2 (Figure 2) built up from 4 different pyridine rings have no ring nitrogens available for H-bonding (being termini for the would-be N-glycoside bonds). Their H-bond data (Table I) predict rather small pairing energies, with the smaller Ep value for A2 associated with H-bond non-linearity. Participation of fluorine in H-bonding is another possible diminishing factor. The dihedral φnh (Table II) predicts essential planarity for A2, but not A1. Table II also predicts dissimilar configurations for these pairs, thus disqualifying them from constituting a DNA mimic set.
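The pairing energies quoted here are the Ep values defined in the methodology from PM3 heats of formation. A minimal sketch follows, assuming the usual supermolecule difference, i.e. the heat of formation of the pair minus the sum for the two separate bases; the paper does not print the formula explicitly, so this form is an assumption, and the numerical inputs below are invented for illustration only.

```python
def pairing_enthalpy(hf_pair, hf_base_left, hf_base_right):
    """Ep = Hf(pair) - [Hf(left base) + Hf(right base)], all in kcal/mol.

    A negative value means the H-bonded pair is more stable than the
    two isolated bases. The formula itself is an assumed convention.
    """
    return hf_pair - (hf_base_left + hf_base_right)

# Invented heats of formation (kcal/mol), not values from the paper.
print(pairing_enthalpy(-50.0, -20.0, -27.5))  # -2.5
```

Under this convention the small negative Ep values reported for A1 and A2 correspond to only weakly stabilised pairs.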

Pyrimidine-pyrimidine pairs. The 6 pyrimidine-pyrimidine pairs B1 to B6 (Figure 3) are grouped into the sets 1 and 2, with two miscellaneous pairs. Set 1 comprises four pyrimidine bases with 10 pairing possibilities, where only pairs B1 and B2 survive screening. Set 2 comprises four different pyrimidines with 10 pairing possibilities, of which only B3 and B4 survive screening. Pairs B5 and B6 have miscellaneous configurations. All bases here involve one ring nitrogen in H-bonding, the other being the would-be N-glycoside bond terminus.

Figure 4 - Configurational schemes for pyrazine-pyrazine pairs (panels C1 to C7, labelled Set 3, Set 4 and Miscellaneous)

Figure 5 - Configurational schemes for pyridine-pyrimidine pairs (Set 5: D1, D2; Miscellaneous: D3)

Figure 6 - Configurational schemes for pyridine-pyrazine pairs (panels labelled Set 6, Set 7 and Set 8)

The appreciable values of Ep (Table III) predict facile H-bonding for most pairs. Absence of fluorine H-bonds in all 7 pairs is one facilitating factor, besides participation of carbonyl oxygens. Pair B5 with only two H-bonds (neither very linear) has a pairing energy of -4.95 kcal mol⁻¹, due to the presence of only oxygen H-bonds. Pairs B2 and B4 have more linear H-bonds. B3 and B4 (three H-bonds each) also display appreciable stability (Ep values of -5.11 and -3.84 kcal mol⁻¹ respectively).

Table IV predicts good planar configurations for pairs B1, B3, B4 and B6. Set 1 and set 2 are distinguished in configuration primarily by their different value ranges for the φnh marker. Set 1 (B1 and B2) presents dissimilar θ1 and θ2 values, disallowing pair reversibility. The two pairs of set 2 (B3 and B4) furnish a likely DNA Type I mimic set, judging from their rather similar Rnn values (7.319 and 7.440 Å), similar θ1 and θ2 values within the two pairs (130.4° to 139.6°), and φnh values close to 180°.
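The configurational comparison made here can be expressed as a simple check against the three criteria (a) to (c) of the methodology. The sketch below uses the Table IV values for pairs B1 to B4. The numerical tolerances are assumptions introduced only for illustration, since the paper judges similarity qualitatively, and the function name is hypothetical.

```python
def is_configurational_mimic_set(pairs, r_tol=0.2, theta_tol=10.0, phi_tol=20.0):
    """Apply criteria (a)-(c) to a list of pairs.

    Each pair is a dict with Rnn (angstrom) and theta1, theta2, phi (degrees).
    The tolerances are illustrative assumptions, not values from the paper.
    """
    r = [p["Rnn"] for p in pairs]
    thetas = [t for p in pairs for t in (p["theta1"], p["theta2"])]
    similar_r = max(r) - min(r) <= r_tol                     # criterion (a)
    similar_theta = max(thetas) - min(thetas) <= theta_tol   # criterion (b)
    planar = all(min(abs(p["phi"]), abs(180.0 - abs(p["phi"]))) <= phi_tol
                 for p in pairs)                             # criterion (c)
    return similar_r and similar_theta and planar

# Values as read from Table IV.
set1 = [
    {"Rnn": 7.343, "theta1": 100.38, "theta2": 137.98, "phi": 8.13},    # B1
    {"Rnn": 7.304, "theta1": 103.37, "theta2": 133.59, "phi": 19.76},   # B2
]
set2 = [
    {"Rnn": 7.440, "theta1": 130.69, "theta2": 139.60, "phi": 173.05},  # B3
    {"Rnn": 7.319, "theta1": 130.35, "theta2": 135.68, "phi": 178.63},  # B4
]

print(is_configurational_mimic_set(set1))  # False
print(is_configurational_mimic_set(set2))  # True
```

With these tolerances set 1 fails on criterion (b) because θ1 and θ2 differ by more than 30° within each pair, while set 2 passes all three checks, in line with the discussion above. Pairing enthalpy is a separate consideration and is not part of this check.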

Pyrazine-pyrazine pairs. The 7 pyrazine-pyrazine pairs C1 to C7 (Figure 4) are grouped into sets 3 and 4. Set 4 has two subsets 4a and 4b characterised by three and by two H-bonds respectively. Set 3 comprises three pyrazine bases, with pairs C1 and C2 retained among 10 pairing possibilities, pair C2 being self-associative but unique. Likewise, only C3, C4 and C5 were retained for subset 4a, and only C6 and C7 for subset 4b. In all these pairs, one ring nitrogen is a terminus of the would-be glycoside bond, and the other (para to the first) is available for H-bonding. The H-bond data (Table V) predict rather small values of the pairing enthalpy Ep (-0.41 to -1.35 kcal mol⁻¹), linked to the complete absence of oxygen H-bonds in all these pairs, besides participation of fluorine H-bonds in all pairs except C2. Pairs C1, C3, C5, C6 and C7 also possess markedly non-linear H-bonds. Pair C4 (three linear H-bonds) displays the largest Ep value here (-1.35 kcal mol⁻¹).

Figure 7 - Configurational schemes for pyrazine-pyrimidine pairs (Set 9: F1 to F4; Set 10: F5 to F8)

Table VI predicts non-planarity for pairs C1, C2, C4 and especially C7, demonstrating that the co-planarity initially assumed while screening may be lost during geometry optimisation. Set 3 and set 4 are distinguished by values of the Rnn and φnh markers. The θ1 and θ2 values (172° to 179°) for set 4 indicate near co-linearity of the would-be N-glycoside bonds.

A possible Type I mimic set is represented by pairs C1 and C2 of set 3, having similar values for Rnn (8.737 and 8.814 Å), for θ1 and θ2 (163° to 166°), and for φnh (161.25° and 150.13°). Pairs C3 and C5 of subset 4a represent another choice, with Rnn values of 8.371 and 8.461 Å, θ1 and θ2 values all close to 180°, and φnh values close to zero; the deviant φnh value of 55.05° excludes C4 from this subset. Pairs C6 and C7 of subset 4b are disqualified by their dissimilar φnh values (13.6° and 83.3°). Suitability of the DNA mimic sets C1/C2 and C3/C5 is, however, marred by their small pairing energy values.

Table I - PM3 H-bond data for the substituted pyridine-pyridine pairs

Pair   Ep (kcal mol⁻¹)   H-bond   Rhb (Å)   Rab (Å)   θhb (°)
A1     -1.43             O4-H2    2.017     3.016     175.16
                         H3-F3    1.962     2.951     169.05
A2     -0.61             H4-O2    1.867     2.880     137.83
                         F3-H3    1.912     2.703     165.40

Table II - Configurational data for the substituted pyridine-pyridine pairs

Pair   Rnn (Å)   θ1 (°)   θ2 (°)   φnh (°)
A1     9.271     164.50   107.65   119.07
A2     8.484     155.00    99.55   176.19

Table III - PM3 H-bond data for the substituted pyrimidine-pyrimidine pairs

Pair   Ep (kcal mol⁻¹)   H-bond   Rhb (Å)   Rab (Å)   θhb (°)
Set 1
B1     -3.68             N3-H4    1.854     2.807     155.78
                         O2-H3    1.846     2.851     165.93
B2     -3.88             N3-H4    1.831     2.852     176.65
                         H2-N3    1.813     2.828     176.41
Set 2
B3     -5.11             O2-H4    1.922     2.894     169.37
                         H3-N3    1.969     2.955     164.29
                         O4-H2    2.008     2.948     159.75
B4     -3.84             O2-H4    1.867     2.872     166.14
                         H3-N3    1.877     2.905     173.06
                         H4-O2    1.964     2.826     149.99
Miscellaneous
B5     -1.29             N3-H3    1.898     2.974     170.39
                         O2-H4    1.860     2.869     169.89
B6     -4.95             H3-O4    1.955     2.844     148.26
                         O2-H3    1.856     2.712     148.23
B7     -4.16             N3-H3    1.945     2.917     157.19
                         H4-O4    1.983     2.893     159.74

Selected Type I DNA mimic sets. This search for Type I DNA base pair mimic sets leads to the choice of the set 2 (pairs B3 and B4), set 3 (pairs C1 and C2) and set 4a (pairs C3 and C5) as likely candidates. Set 2 constitutes a better choice than the others, having larger pairing enthalpies and better co-planarity.

Table IV - Configurational data for the substituted pyrimidine-pyrimidine pairs

Pair   Rnn (Å)   θ1 (°)   θ2 (°)   φnh (°)
Set 1
B1     7.343     100.38   137.98     8.13
B2     7.304     103.37   133.59    19.76
Set 2
B3     7.440     130.69   139.60   173.05
B4     7.319     130.35   135.68   178.63
Miscellaneous
B5     7.499     139.08   136.24   163.59
B6     7.548     140.61    99.68    25.22
B7     7.334     113.03   120.54     2.97

Table V - PM3 H-bond data for the substituted pyrazine-pyrazine pairs

Pair   Ep (kcal mol⁻¹)   H-bond   Rhb (Å)       Rab (Å)   θhb (°)
Set 3
C1     -0.91             N4-H3    1.880         2.972     174.29
                         F5-H4    2.002         2.896     153.91
C2     -0.68             N4-H5    1.969         3.023     169.25
                         H5-N4    1.974         2.962     168.98
Subset 4a
C3     -0.41             F3-H3    1.831         2.602     127.10
                         N4-H4    1.763         2.742     167.52
                         F5-H5    1.925         2.606     123.12
C4     -1.35             F3-H3    1.854         2.832     178.88
                         H4-N4    1.841         2.961     179.18
                         F5-H5    2.066         3.021     178.58
C5     -0.51             H3-F3    1.843         2.926     174.02
                         H4-N4    1.834         2.883     178.54
                         F5-H5    [illegible]   2.790     155.61
Subset 4b
C6     -0.60             H4-N4    1.862         2.746     157.46
                         F5-H5    2.069         3.024     143.35
C7     -0.69             H4-N4    1.963         2.882     159.53
                         H5-F5    1.953         2.913     157.15

Table VI - Configurational data for the substituted pyrazine-pyrazine pairs

Pair   Rnn (Å)   θ1 (°)   θ2 (°)   φnh (°)
Set 3
C1     8.737     165.16   163.30   161.25
C2     8.814     166.03   162.91   150.13
Subset 4a
C3     8.371     171.79   172.69     9.33
C4     8.378     177.92   178.96    55.05
C5     8.461     178.39   174.37     9.47
Subset 4b
C6     8.376     177.85   170.14    13.60
C7     8.465     174.02   174.28    83.26

Search for Type II DNA mimic sets

This category incorporates pyridine-pyrimidine pairs, pyridine-pyrazine pairs and pyrimidine-pyrazine pairs as described below.

Pyridine-pyrimidine pairs. Set 5 comprises two pyridines and two pyrimidines, giving 4 Type II pairing possibilities, where pairs D1 and D2 (Figure 5) survive the screening. Pairs D3, D4 and D5 represent miscellaneous configurations, all involving pyrimidine N-oxides. Pyridine ring nitrogens in all pairs are not H-bonded. One pyrimidine ring nitrogen is available for N-glycoside formation and another for H-bonding. Pyrimidine ring nitrogens participate in H-bonding through attached hydrogens (D1 and D2) or through N-oxide oxygens (D3, D4 and D5).

The larger Ep values (Table VII) for D3, D4 and D5 (compare D1 and D2) predict that N-oxide groups enhance H-bonding facility, the negative charge on the N-oxide oxygen augmenting electronegativity. Pairs D1, D2 and D3 involving fluorine H-bonds have relatively smaller Ep values than D4 and D5 with only oxygen H-bonds. Pairs D1 and D3 also possess fairly non-linear H-bonds.

Table VIII predicts approximate co-planarity for all these pairs except D5. Pairs D3, D4 and D5 all adopt different configurations, and so cannot constitute a DNA mimic set. Pairs D1 and D2 (set 5) have similar configurations, with comparable Rnn values (8.520 and 8.608 Å), similar θ1 and θ2 values (118.8° to 124.3°), and φnh values near 0° (indicating approximate co-planarity). Set 5 thus qualifies as a Type II DNA mimic set, having, however, small Ep values (-0.81 and -0.68 kcal mol⁻¹).

Table VII - PM3 H-bond data for the substituted pyridine-pyrimidine pairs

Pair   Ep (kcal mol⁻¹)   H-bond   Rhb (Å)   Rab (Å)   θhb (°)
Set 5
D1     -0.81             F5-H3    2.034     2.931     150.41
                         H4-O4    1.954     2.960     158.78
D2     -0.68             F5-H3    2.079     3.057     169.61
                         O4-H4    2.097     3.051     164.37
Miscellaneous
D3     -2.03             F3-H2    2.025     3.094     152.30
                         H4-O3    1.937     2.829     149.19
D4     -2.58             O4-H2    1.824     2.818     165.68
                         H5-O3    1.823     2.934     169.69
D5     -3.39             H5-O3    1.884     3.099     161.23
                         O3-H5    1.846     2.930     173.71

Table VIII - Configurational data for the substituted pyridine-pyrimidine pairs

Pair   Rnn (Å)   θ1 (°)   θ2 (°)   φnh (°)
Set 5
D1     8.520     124.25   119.72     5.98
D2     8.608     122.83   118.84     7.29
Miscellaneous
D3     8.945     135.50   103.61    15.55
D4     9.021     174.10   112.10   168.25
D5     9.735     174.83   177.68    78.46

Pyridine-pyrazine pairs. The 7 pyridine-pyrazine pairs E1 to E7 (Figure 6) are grouped into sets 6, 7 and 8. Set 6 (three pyridines and three pyrazines) generates 9 Type II pairing possibilities, of which E1, E2 and E3 survive screening. Set 7 (two pyridines and two pyrazines) has four pairing possibilities, of which E4 and E5 are retained. Set 8 (two pyridines and two pyrazines) also has four possibilities, of which E6 and E7 survive screening. The H-bond data (Table IX) predict that none of these pairs has strong H-bonds (Ep ranging between -0.34 and -1.46 kcal mol⁻¹). For E4, E5, E6 and E7, the small Ep values may be linked to the non-linear fluorine H-bonds present.
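The pairing counts quoted for each set follow from taking one base from each ring family. A minimal sketch of that enumeration is given here; the base labels (Py1, Pz1 and so on) and the function name are placeholders introduced for illustration, not names used in this study.

```python
from itertools import product


def cross_pairings(first_family, second_family):
    """Return every Type II candidate pair: one base from each ring family."""
    return list(product(first_family, second_family))


# Hypothetical labels standing in for the substituted pyridines and pyrazines.
set6_candidates = cross_pairings(["Py1", "Py2", "Py3"], ["Pz1", "Pz2", "Pz3"])
set7_candidates = cross_pairings(["Py1", "Py2"], ["Pz1", "Pz2"])

print(len(set6_candidates), len(set7_candidates))  # 9 4
```

Only the count is reproduced here; deciding which candidates survive screening still requires the PM3 geometries and pairing energies.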

Table X predicts non-planarity for all these pairs (φnh ranging from 36.2° to 96.2°). Pairs E1, E2 and E3 (set 6) have broadly similar values of Rnn, but E2 has dissimilar θ1 and θ2 values (unlike E1 and E3). These pairs also exhibit divergent φnh values (36.2° to 96.2°), indicating dissimilar configurations. However, pairs E4 and E5 (set 7) have similar Rnn values (9.648 and 9.701 Å), somewhat comparable θ1 and θ2 values (132.7° to 143.8°) and φnh values differing by only 5.2°. Set 7 thus constitutes a plausible candidate for a Type II DNA mimic set, apart from its rather small pairing energies. Pairs E6 and E7 (set 8) have comparable Rnn and φnh values, but the dissimilar values of θ1 and θ2 for E7 (127.8° and 147.6°) count against the suitability of this set.
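The comparison of configurations made here can be sketched as a descriptor-by-descriptor difference between two pairs. The class and function names below are illustrative assumptions, and no numerical cut-off is implied, since the screening in this study is qualitative. The values are those listed for E4 to E7 in Table X.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairConfig:
    """Configurational descriptors of one base pair (illustrative names)."""
    r_nn: float    # Rnn, in angstrom
    theta1: float  # theta-1, in degrees
    theta2: float  # theta-2, in degrees
    phi_nh: float  # phi-nh, in degrees


def config_differences(a: PairConfig, b: PairConfig) -> dict:
    """Absolute difference in each descriptor between two pairs."""
    return {
        "d_r_nn": round(abs(a.r_nn - b.r_nn), 3),
        "d_theta1": round(abs(a.theta1 - b.theta1), 2),
        "d_theta2": round(abs(a.theta2 - b.theta2), 2),
        "d_phi_nh": round(abs(a.phi_nh - b.phi_nh), 2),
    }


# Values as listed in Table X.
e4 = PairConfig(9.648, 143.75, 132.70, 66.58)
e5 = PairConfig(9.701, 142.92, 133.72, 61.34)
e6 = PairConfig(9.165, 134.77, 136.05, 78.54)
e7 = PairConfig(8.937, 127.76, 147.62, 85.15)

print("set 7:", config_differences(e4, e5))
print("set 8:", config_differences(e6, e7))
```

For E4 and E5 the φnh difference comes out at 5.24°, in line with the 5.2° quoted in the text, while E6 and E7 differ by several degrees in both θ1 and θ2.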

Pyrimidine-pyrazine pairs. The pairs F1 to F8 (Figure 7) are grouped into sets 9 and 10, each


Table IX - PM3 H-bond data for the substituted pyridine-pyrazine pairsᵃ

Pair   Ep      H-bond   Rhb     Rab     θhb
Set 6
E1     -0.44   H5-O2    1.975   2.957   170.23
               O4-H1    1.799   2.799   172.78
               H3-O6    2.075   3.069   169.90
E2     -0.34   O3-H2    1.863   2.866   171.14
               O4-H1    1.865   2.834   160.63
               O5-H6    1.767   2.774   176.00
E3     -1.07   O6-H3    1.799   2.770   166.23
               H1-O4    1.711   2.713   164.84
               H2-O5    1.869   2.853   164.84
Set 7
E4     -0.95   F5-H3    1.910   2.832   152.23
               O4-H4    1.923   2.872   156.31
E5     -1.12   F5-H5    1.856   2.841   159.35
               H4-O4    1.814   2.791   158.91
Set 8
E6     -1.06   F5-H4    1.970   2.843   149.58
               O4-H5    1.846   2.682   140.13
E7     -1.46   F5-H4    1.796   2.560   132.08
               H4-O5    1.777   2.826   173.21

Table X - Configurational data for the substituted pyridine-pyrazine pairsᵇ

Pair   Rnn     θ1       θ2       φnh
Set 6
E1     9.786   175.98   179.33   36.15
E2     9.616   144.80   162.61   96.19
E3     9.737   177.19   178.10   62.52
Set 7
E4     9.648   143.75   132.70   66.58
E5     9.701   142.92   133.72   61.34
Set 8
E6     9.165   134.77   136.05   78.54
E7     8.937   127.76   147.62   85.15

comprising 4 pyrimidines and 4 pyrazines (16 pairing possibilities each). Only pairs F1 to F4 are retained for set 9, and pairs F5 to F8 for set 10. H-bond data (Table XI) predict rather poor H-bonding for most pairs, the exception being F6 with three H-bonds and an Ep value of -7.64 kcal mol⁻¹. Carbonyl oxygen H-bonds are absent in all pairs except F6, while fluorine H-bonds are present in all pairs except F4.

Table XII predicts good co-planarity for most pairs. Among the 4 pairs of set 9, F1 and F3 have the

Table XI - PM3 H-bond data for the substituted pyrimidine-pyrazine pairsᵃ

Pair   Ep      H-bond   Rhb     Rab     θhb
Set 9
F1     -1.18   H3-N3    1.971   2.903   172.68
               H4-F4    1.876   2.823   174.09
F2     -1.94   F3-H3    1.847   2.910   172.60
               H4-F4    1.838   2.930   169.23
F3     -1.24   F5-H3    1.798   2.722   179.16
               N4-H4    1.851   2.850   170.44
F4     -0.75   H5-N3    1.794   2.872   173.57
               N4-H4    1.690   2.885   177.44
Set 10
F5     -1.13   H4-N3    1.894   2.914   169.72
               H2-F4    1.841   2.925   176.56
F6     -7.64   H3-O2    1.812   2.823   179.18
               N4-H3    1.807   2.832   178.49
               F5-H4    1.814   2.808   168.58
F7     -2.08   N4-H3    1.796   2.826   178.58
               H5-F4    1.819   2.837   165.02
F8     -1.12   H4-N3    1.867   2.913   177.87
               F3-H4    1.845   2.830   178.84

Table XII - Configurational data for the substituted pyrimidine-pyrazine pairsᵇ

Pair   Rnn     θ1       θ2       φnh
Set 9
F1     8.318   141.38   154.56    5.06
F2     8.655   148.15   156.37    5.44
F3     8.489   141.88   151.07    4.65
F4     8.182   139.59   155.82    5.51
Set 10
F5     7.802   173.20   122.61    4.35
F6     7.845   172.93   127.82   18.28
F7     7.856   172.11   128.27    8.41
F8     7.563   170.27   122.73    5.13

closest values for Rnn (8.318 and 8.489 Å), and for θ1 and θ2 (141.4° to 154.6°), besides φnh values differing by only 0.5°. This prompts the choice of F1 and F3 as a Type II DNA mimic set. In set 10, F5 to F7 have comparable Rnn values, but dissimilar θ1 and θ2, while F8 shows a comparatively shorter Rnn value (7.56 Å).

Selected Type II DNA mimic sets. This search for Type II DNA mimic sets leads to the choice of the pyridine-pyrimidine pairs D1 and D2, the pyridine-pyrazine pairs E4 and E5, and the pyrimidine-pyrazine pairs F1 and F3 as constituting three candidate sets, but all have pairing energies on the small side.

Relevance for encoding information

We assume three-letter code words here (as in nature). The Type I set comprising pairs B3 and B4 is built up from four different pyrimidine bases (a four-letter alphabet), which leads to a lexicon of 4³ = 64 code words. The Type I set comprising C1 and C2 is built up from three different pyrazine bases, implying a three-letter alphabet and a lexicon of 3³ = 27 code words. The Type I set B3/B4 and the Type II sets C3/C5, D1/D2, E4/E5 and F1/F3 all have four bases each (four-letter alphabets), leading to 64-word lexicons.
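The lexicon sizes quoted here are simply the number of three-letter words over the alphabet of bases. A short sketch of that count follows; the letters used are placeholders for the bases of a mimic set, not labels from this study.

```python
from itertools import product


def lexicon(alphabet, word_length=3):
    """All code words of the given length over the given alphabet."""
    return ["".join(word) for word in product(alphabet, repeat=word_length)]


# Placeholder letters standing in for the bases of a mimic set (hypothetical).
four_letter = lexicon("WXYZ")
three_letter = lexicon("PQR")

print(len(four_letter), len(three_letter))  # 64 27
```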

The information-bearing role of DNA is a complex biological phenomenon, the details of which would be impossible for science to simulate artificially in the near future. It is imaginable, though, that information-bearing macromolecular duplexes like those envisaged here, built up from base pairs like those designed in this study, may well find application some day in the world of information technology, where storage and retrieval of information even at the molecular level may one day become a reality.

Conclusions

Factors predicted to favour H-bonding within the nitrogenous base pairs studied here include H-bond linearity as well as participation of carbonyl oxygens and N-oxide groups. Participation of fluorine in H-bonding is predicted to diminish H-bond strength.

This study advances some sets of base pairs as likely candidates for "DNA base pair mimic sets" as per our definition. The two pyrimidine-pyrimidine pairs B3 and B4 constitute the best candidate set here because of their good pairing energies and base pair co-planarity. The two pyrazine-pyrazine pairs C1 and C2 furnish a less suitable Type I DNA mimic set. Candidates for Type II DNA mimic sets include the pyridine-pyrimidine pairs D1 and D2, the pyridine-pyrazine pairs E4 and E5, and the pyrimidine-pyrazine pairs F1 and F3, all deemed less suitable than the Type I set B3 and B4. These sets can all potentially encode information, leading to triplet word dictionaries of either 27 or 64 words. Further theoretical work will involve constructing polymeric duplexes out of these selected sets or pairs to design artificial macromolecules which simulate the information-bearing feature of DNA.

Acknowledgement

The authors thank the Computer Centre, North-Eastern Hill University, Shillong, for computer time on the VAX 3100-40 system. One of the authors (DLB) thanks the UGC, New Delhi, for financial assistance.

References

1 Watson J D & Crick F H C, Nature, 171, 1953, 737
2 Watson J D & Crick F H C, Nature, 171, 1953, 964
3 Buam D L & Lyngdoh R H D, J Mol Struct (THEOCHEM), 505, 2000, 149
4 Gould I R & Kollman P A, J Am Chem Soc, 116, 1994, 2493
5 Hobza P, Sponer J & Polasek M, J Am Chem Soc, 117, 1995, 792
6 Hobza P & Sandorfy C, J Am Chem Soc, 109, 1987, 1302
7 Leach R & Kollman P A, J Am Chem Soc, 114, 1992, 3675
8 Sponer J & Hobza P, J Phys Chem, 98, 1994, 3161
9 Roben G A & Maclagan R, Aust J Chem, 32, 1979, 1635
10 Braender R & Eneki J, Int J Quantum Chem Quantum Biol Symp, 46, 1993, 499
11 Venkateswarlu D & Lyngdoh R H D, J Chem Soc Perkin Trans 2, 1995, 839
12 Venkateswarlu D, Bansal M & Lyngdoh R H D, J Chem Soc Perkin Trans 2, 1997, 621
13 Ford G P & Wang B, Int J Quantum Chem, 41, 1995, 102
14 Ford G P & Wang B, J Mol Struct (THEOCHEM), 283, 1992, 49
15 Eschenmoser A & Lowenthal E, Chem Soc Rev, 21, 1992, 1
16 Eschenmoser A & Dobler M, Helv Chim Acta, 75, 1991, 218
17 Bohringer M, Roth H-J, Hunziker J, Gobel M, Krishnan R, Giger A, Schweizer B, Schreiber J, Leumann C & Eschenmoser A, Helv Chim Acta, 76, 1992, 1416
18 Stewart J J P, J Comput Chem, 10, 1989, 406
19 Fletcher R & Powell M J D, Computer J, 6, 1963, 163
20 Davidon W C, Computer J, 10, 1968, 406
21 Lively T N, Jurema M W & Shields G C, Int J Quantum Chem Quantum Biol Symp, 21, 1994, 95
22 Jurema M W & Shields G C, J Comput Chem, 14, 1993, 89