Base stacking classification via automated clustering method
Eli Hershkovits¹, Xavier Le Faucheur¹, Neocles Leontis², Allen Tannenbaum¹
¹Georgia Institute of Technology, ²BGSU
Data Classification
• Coordinate system and parameterization
• Clustering of the data (“by eye” or automated clustering)
Base stacking: ring coordinate system
• The three orthogonal directions of each ring are calculated with the Cremer and Pople method.
• The coordinates y1 and y2 can be used to define the face of each ring (up or down).
[Figure: two rings with their local axes (X1, Y1, Z1) and (X2, Y2, Z2), joined by the vector r12]
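The ring frame described above can be sketched as follows. This is a minimal illustration, assuming the ring atoms are supplied in ring order as Cartesian coordinates; the function name `ring_frame` is hypothetical, and the choice of in-plane axis (through the projection of the first atom) and the labeling of the normal as the third axis are assumptions of this sketch, not necessarily the convention used in the slides. The normal follows the Cremer and Pople mean-plane construction.

```python
import numpy as np

def ring_frame(coords):
    """Return (center, axes) for a ring given its atoms in ring order.

    axes is a 3x3 array whose rows are two in-plane unit vectors
    followed by the unit normal of the Cremer-Pople mean plane.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    center = coords.mean(axis=0)
    rel = coords - center                      # positions relative to the ring center
    angles = 2.0 * np.pi * np.arange(n) / n
    r_sin = (rel * np.sin(angles)[:, None]).sum(axis=0)
    r_cos = (rel * np.cos(angles)[:, None]).sum(axis=0)
    normal = np.cross(r_sin, r_cos)            # Cremer-Pople mean-plane normal
    normal /= np.linalg.norm(normal)
    # Assumption: one in-plane axis passes through the projection of atom 1.
    in_plane = rel[0] - np.dot(rel[0], normal) * normal
    in_plane /= np.linalg.norm(in_plane)
    third = np.cross(in_plane, normal)         # completes a right-handed frame
    return center, np.vstack([third, in_plane, normal])
```

Reversing the atom order flips the sign of the normal, which is why a consistent atom numbering is needed before the normal can be used to tell one face of the ring from the other.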
Base stacking: relative coordinate system
• The relative coordinates of the two rings are defined by the spherical coordinates r, θ and φ.
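As an illustration of the relative coordinates, the sketch below expresses the center of the second ring in the local frame of the first ring and converts it to spherical coordinates. The function name `relative_spherical` is hypothetical, and measuring the polar angle from the third row of the frame (taken here to be the ring normal) is an assumption of this sketch.

```python
import math
import numpy as np

def relative_spherical(center1, axes1, center2):
    """Spherical coordinates (r, polar, azimuth) of ring 2 in the frame of ring 1.

    axes1 is a 3x3 array whose rows are the unit axes of ring 1;
    the third row is assumed to be the ring normal.
    """
    d = np.asarray(center2, dtype=float) - np.asarray(center1, dtype=float)
    local = np.asarray(axes1, dtype=float) @ d   # components along the ring-1 axes
    r = float(np.linalg.norm(local))
    if r == 0.0:
        raise ValueError("ring centers coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    polar = math.acos(max(-1.0, min(1.0, local[2] / r)))
    azimuth = math.atan2(local[1], local[0])
    return r, polar, azimuth
```

A ring sitting directly above the first ring along its normal gives a polar angle of zero, while a ring lying in the plane of the first ring gives a polar angle of π/2.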
Primary Classification
• For each base-stacking candidate, the two closest rings are chosen to represent the pair. This choice gives a classification into four groups: pyrimidine-pyrimidine, pyrimidine-imidazole, imidazole-pyrimidine and imidazole-imidazole.
• There are four possible combinations of face-face interactions: up-up, up-down, down-up and down-down.
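The primary classification can be sketched as below. This is only an illustration: the helper names `closest_ring_pair` and `face_label` are hypothetical, each base is assumed to be given as a mapping from ring type ("Pyr" or "Im") to ring center, and the face rule (which side of the ring normal the partner ring lies on) is an assumption of this sketch rather than the rule stated in the slides.

```python
import itertools
import numpy as np

def closest_ring_pair(rings_a, rings_b):
    """Return the ring types (type_a, type_b) of the two closest rings.

    rings_a, rings_b: dicts mapping a ring type ('Pyr' or 'Im')
    to the Cartesian coordinates of that ring's center.
    """
    def distance(pair):
        (_, center_a), (_, center_b) = pair
        return np.linalg.norm(np.asarray(center_a, float) - np.asarray(center_b, float))

    (type_a, _), (type_b, _) = min(
        itertools.product(rings_a.items(), rings_b.items()), key=distance
    )
    return type_a, type_b

def face_label(center, normal, other_center):
    """'U' if the partner ring lies on the side the normal points to, else 'D'.

    Assumption: this sign test stands in for the face definition of the slides.
    """
    offset = np.asarray(other_center, float) - np.asarray(center, float)
    return "U" if np.dot(offset, np.asarray(normal, float)) > 0 else "D"
```

For a purine stacked on a pyrimidine, `closest_ring_pair` picks whichever of the purine's two rings is nearer to the partner, and the two `face_label` results (one per ring) concatenate into a label such as "UD".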
Parameters relevant for clustering
[Figure: two histogram panels, one with axis label r]
Secondary classification
• The coordinates “r”, “θ” and “φ” are correlated and separate into two clusters: “proper stacking” and “improper stacking”.
• Together, these classifications give 4 (ring pairs) × 4 (face combinations) × 2 (proper or improper) = 32 classes.
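The class count can be checked by enumerating the combinations directly. The label format below (ring pair, face pair and stacking type joined by slashes) is a hypothetical naming chosen for this sketch.

```python
from itertools import product

RING_PAIRS = ("Pyr-Pyr", "Pyr-Im", "Im-Pyr", "Im-Im")
FACE_PAIRS = ("UU", "UD", "DU", "DD")
STACKING = ("proper", "improper")

def all_classes():
    """Enumerate every ring-pair / face-pair / stacking-type combination."""
    return [
        f"{ring}/{face}/{kind}"
        for ring, face, kind in product(RING_PAIRS, FACE_PAIRS, STACKING)
    ]

print(len(all_classes()))  # 4 * 4 * 2 combinations
```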
Pyr-Pyr
Relative orientation   proper   improper
UU 143C:G142 155C:C154
DD 511A:A509 743G:C699
UD 144A:G135 172U:G164
DU 147G:U146 897A:G765
Im-Pyr
Relative orientation   proper   improper
UU 132A:A131 231G:C230
DD 2813A:A2811 2792A:U2791
UD 226A:A215 273G:C271
DU 174A:C173
Pyr-Im
Relative orientation   proper   improper
UU 129A:A128 1360C:A1358
DD 129A:A116 2058G:G636
UD 176U:A174 922A:G921
DU 893G:G892 866U:A776
Im-Im
Relative orientation   proper   improper
UU 159G:G158 223G:G222
DD 2564G:A2513 1190G:A1189
UD 1626A:A1624
DU 1664A:G1663
Possible problems
• For stacking of residues that are not neighbors the distribution of is broad.
• Possible overlap between clusters.