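Intensive Lecture (Faculty of Mathematics, Kyushu University): Kernel Methods and Graph Theory in Bioinformatics (4) Comparison and Prediction of Protein 3D Structures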

DESCRIPTION

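Intensive Lecture (Faculty of Mathematics, Kyushu University): Kernel Methods and Graph Theory in Bioinformatics (4) Comparison and Prediction of Protein 3D Structures. Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University. Lecture contents. Structure comparison (structure alignment): RMSD; stralign; heuristic methods: SSAP/DALI/CE/…; the Contact Map Overlap problem; the difficulty of multiple structure alignment. Structure prediction: classification of prediction methods; threading; threading with profiles; threading with potential-type score functions; CASP.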

Citation preview

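  • Kernel Methods and Graph Theory in Bioinformatics (4): Comparison and Prediction of Protein 3D Structures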

  • Lecture contents: structure comparison (structure alignment): RMSD; stralign; heuristic methods: SSAP/DALI/CE/…; Contact Map Overlap problem

  • Structure classification databases: SCOP; FSSP (DALI); CATH (SSAP)

  • (1)

  • (2) vs.

  • RMSD (Root Mean Square Deviation): computed over corresponding points (e.g., Cα atoms); given the correspondence, computable in O(n) time
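As a minimal sketch of the definition, the function below computes the RMSD of two point lists that are paired by index. It assumes the two structures have already been superposed; finding the rigid motion that minimizes the RMSD is a separate step that is not shown here. The function name `rmsd` is chosen for illustration only.

```python
import math


def rmsd(points_a, points_b):
    """RMSD of two equally long lists of (x, y, z) points, paired by index.

    Runs in O(n) time. No superposition is performed: the caller is
    assumed to have placed both structures in a common frame already.
    """
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("need two non-empty point lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(points_a, points_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(points_a))
```

For example, `rmsd([(0, 0, 0)], [(3, 4, 0)])` is the plain distance between the two points.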

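  • Structure alignment: stralign (HGC) [Akutsu 1996]. Input: point sequences P = (p1, …, pm), Q = (q1, …, qn), m ≤ n. Output: an alignment M between P and Q, and a rigid motion T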

  • stralign
    M0 ← {}
    for all triplets PP=(pi1,pi2,pi3) from P do
      for all triplets QQ=(qj1,qj2,qj3) from Q do
        Compute rigid motion T_PP,QQ from PP to QQ
        Compute alignment M between T_PP,QQ(P) and Q
        if |M| > |M0| then M0 ← M
    Output M0

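  • Rigid motion T_PP,QQ for PP=(p1,p2,p3), QQ=(q1,q2,q3): T_PP,QQ translates PP so that p1 coincides with q1, rotates PP so that the line p1p2 lies on the line q1q2, and then rotates PP around the axis p1p2 so that the plane of PP coincides with the plane of QQ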
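One way to realize such a triplet-to-triplet rigid motion is through local orthonormal frames, sketched below. This is an illustrative construction, not necessarily the one used in stralign; the helper names (`frame`, `rigid_motion`, and the small vector functions) are hypothetical. The first axis of each frame runs along the line from the first to the second point, the second axis lies in the plane of the triplet, and the third is the plane normal.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    if n < 1e-12:
        raise ValueError("degenerate triplet (coincident or collinear points)")
    return tuple(x / n for x in a)


def frame(p1, p2, p3):
    """Orthonormal frame: e1 along p1->p2, e2 in the triplet's plane, e3 normal."""
    e1 = unit(sub(p2, p1))
    v = sub(p3, p1)
    e2 = unit(sub(v, tuple(dot(v, e1) * x for x in e1)))  # Gram-Schmidt step
    return e1, e2, cross(e1, e2)


def rigid_motion(pp, qq):
    """Return a function T with T(p1) = q1 that maps the line p1p2 onto the
    line q1q2 and the plane of PP onto the plane of QQ."""
    frame_p, frame_q = frame(*pp), frame(*qq)
    p1, q1 = pp[0], qq[0]

    def transform(x):
        d = sub(x, p1)
        coords = [dot(d, e) for e in frame_p]  # coordinates of x in PP's frame
        out = q1
        for c, e in zip(coords, frame_q):      # same coordinates in QQ's frame
            out = add(out, tuple(c * t for t in e))
        return out

    return transform
```

When PP and QQ are congruent triangles, all three points are mapped exactly; otherwise only p1 lands on q1, while p2 lands on the line q1q2 and p3 in the plane of QQ.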

  • Computing the alignment M between T(P) and Q
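A simple way to compute an alignment between an already transformed point sequence and Q is a longest-common-subsequence style DP, sketched below. It assumes that two points may be matched when their distance is at most a threshold `eps` and that the matching must preserve sequence order; both the threshold parameter and the function name `align` are assumptions made for this illustration.

```python
import math


def align(tp, q, eps):
    """Largest order-preserving matching between point lists tp and q in
    which every matched pair is within distance eps. Returns index pairs."""
    m, n = len(tp), len(q)
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            if math.dist(tp[i - 1], q[j - 1]) <= eps:
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + 1)
    # Trace back from the bottom-right corner to recover the matched pairs.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        close = math.dist(tp[i - 1], q[j - 1]) <= eps
        if close and best[i][j] == best[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

The DP takes O(mn) time per call, so running it for every pair of triplets gives the O(n^8) total of the triplet enumeration.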

  • (1) Let PP=(p1,p2,p3) and QQ=(q1,q2,q3), and let T be a rigid motion with |T(pi) - qi| ≤ ε (i=1,2,3). Then, for any point p in reg(p1,p2,p3) with |T(p) - q| ≤ ε, we have |T_PP,QQ(p) - q| ≤ 8ε

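  • (2) Let M_OPT be an optimal alignment. In O(n^8) time, an alignment M and a rigid motion T can be computed as follows: for P, Q, choose a triplet PP from the points of P in M_OPT such that those points lie in reg(PP), and let QQ be the points of Q that M_OPT assigns to PP. Aligning T_PP,QQ(P) with Q under the relaxed distance bound 8ε gives |M| ≥ |M_OPT|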

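  • stralign: O(n^8) time; with sparse DP: O(n^5)

    Practical version: instead of triplets PP, QQ, use fragments of length 10 to 20; compute T_PP,QQ by rmsd fitting; O(n^2) fragment pairs × O(n^2) DP = O(n^4); then refine by repeating DP and rmsd fitting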

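  • SSAP (Double Dynamic Programming): two-level DP [Taylor & Orengo 1989]. High-level DP: uses the scores S_ij obtained from the low-level DP. Low-level DP: for each pair pi ∈ P, qj ∈ Q, run a DP under the condition that (i,j) is aligned (with S_ij[i,j] = 0); the score for pairing pk with ql (k ≠ i, l ≠ j) is $s_{kl} = \dfrac{a}{\bigl|\,|p_k - p_i| - |q_l - q_j|\,\bigr| + b}$, where a, b are constants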
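The two DP levels can be sketched as follows. This is a simplified illustration rather than the published SSAP procedure: it uses no gap penalties, takes the value of the best low-level path through (i, j) directly as S_ij, and uses default constants a = b = 1; all function names are hypothetical.

```python
import math


def pair_score(p, q, i, j, k, l, a=1.0, b=1.0):
    """s_kl = a / (| |p_k - p_i| - |q_l - q_j| | + b)."""
    return a / (abs(math.dist(p[k], p[i]) - math.dist(q[l], q[j])) + b)


def best_path(score, rows, cols):
    """Maximum total score of an order-preserving matching of rows to cols."""
    dp = [[0.0] * (len(cols) + 1) for _ in range(len(rows) + 1)]
    for x, k in enumerate(rows, 1):
        for y, l in enumerate(cols, 1):
            dp[x][y] = max(dp[x - 1][y], dp[x][y - 1],
                           dp[x - 1][y - 1] + score(k, l))
    return dp[-1][-1]


def low_level(p, q, i, j, a=1.0, b=1.0):
    """Low-level DP: best path forced through (i, j); (i, j) itself scores 0."""
    def s(k, l):
        return pair_score(p, q, i, j, k, l, a, b)
    before = best_path(s, range(i), range(j))
    after = best_path(s, range(i + 1, len(p)), range(j + 1, len(q)))
    return before + after


def double_dp(p, q, a=1.0, b=1.0):
    """High-level DP over the table S[i][j] produced by the low-level DP."""
    S = [[low_level(p, q, i, j, a, b) for j in range(len(q))]
         for i in range(len(p))]
    return best_path(lambda i, j: S[i][j], range(len(p)), range(len(q)))
```

Each low-level DP costs O(n^2) and there are O(n^2) pairs (i, j), so this sketch runs in O(n^4) time for structures of length n.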

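  • DALI (Alignment of Distance Matrices) [Holm & Sander 1993]: alignment based on distance matrices. The distance matrix of a protein P holds the distances between all pairs of its residues; the distance matrices of P and Q are compared, and the alignment is optimized by simulated annealing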
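The following sketch shows why distance matrices are convenient for comparison: they do not change under rigid motion, so two structures can be compared without superposing them. The comparison function is a plain sum of absolute differences used only as a stand-in; it is not DALI's actual similarity score, and both function names are hypothetical.

```python
import math


def distance_matrix(points):
    """Matrix of pairwise Euclidean distances within one structure."""
    return [[math.dist(a, b) for b in points] for a in points]


def submatrix_difference(dist_p, dist_q, idx_p, idx_q):
    """Sum of |dP - dQ| over all residue pairs selected by two equally long
    index lists (idx_p[x] is taken to correspond to idx_q[x])."""
    if len(idx_p) != len(idx_q):
        raise ValueError("index lists must have equal length")
    total = 0.0
    for x in range(len(idx_p)):
        for y in range(x + 1, len(idx_p)):
            total += abs(dist_p[idx_p[x]][idx_p[y]] - dist_q[idx_q[x]][idx_q[y]])
    return total
```

A stochastic search such as simulated annealing would then look for index lists that keep this kind of difference small while covering many residues.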

  • CE (Combinatorial Extension) [Shindyalov & Bourne 1998]
    VAST (Vector Alignment Search Tool) [Gibrat et al. 1996]
    DP + Iterative Improvement [Gerstein & Levitt 1998]
    StrMul [Daiyasu & Toh 2000]

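  • Contact Map Overlap (CMO) problem (1): in a contact map, {vi,vj} ∈ E when residues vi and vj are in contact. Find an order-preserving alignment of the two residue sequences, pairing (vi,uk) and (vj,ul), such that {vi,vj} ∈ E if and only if {uk,ul} ∈ E, with as many contacts as possible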
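To make the objective concrete, the sketch below builds a contact map from coordinates and counts the contacts shared by two maps under a given alignment. It assumes the common formulation in which a contact counts when it is present in both maps, and it assumes a simple distance threshold for defining contacts; the function names and the threshold parameter are illustrative. It only evaluates a given alignment and does not search for the best one.

```python
import math


def contact_map(points, threshold):
    """Set of index pairs (i, j), i < j, whose points lie within threshold."""
    n = len(points)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(points[i], points[j]) <= threshold}


def overlap(contacts_p, contacts_q, alignment):
    """Number of contacts shared under an alignment [(i, k), ...], where
    residue i of P is paired with residue k of Q."""
    for (i1, k1), (i2, k2) in zip(alignment, alignment[1:]):
        if i1 >= i2 or k1 >= k2:
            raise ValueError("alignment must be strictly increasing in both sequences")
    mapping = dict(alignment)
    return sum(1 for (i, j) in contacts_p
               if i in mapping and j in mapping
               and (mapping[i], mapping[j]) in contacts_q)
```

Because the alignment preserves order, `mapping[i] < mapping[j]` whenever `i < j`, so the mapped pair can be looked up directly in the second contact set.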

  • Contact Map Overlap (CMO) problem (2): CMO is NP-hard [Goldman et al. 1999]; see also [Caprara et al. 2004]; related results for RNA structures [G-H. Lin et al. 2002]; [Akutsu & Miyano 1999]

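  • Multiple structure alignment: heuristic methods (CE-MC, StrMul, …); the problem is hard (NP-hard). NP-hardness of the largest common point set (LCP) problem [Akutsu & Halldórsson 2000]: given d-dimensional point sets S1, S2, …, SN, find a largest d-dimensional point set C and, for each Si, a rigid motion Ti such that T1(S1) ∩ T2(S2) ∩ … ∩ TN(SN) = C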

  • (ab initio)33.3%80%

  • NP

  • String Folding: approximation ratios 1/4 and 3/8 (Hart & Istrail, 1995); NP-hard (Berger & Leighton, 1998); NP-hard (Crescenzi et al., 1998); approximation ratio 1/3 (Newman, 2002)
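The objective behind these results can be illustrated with the HP lattice model studied by Hart & Istrail, in which a string over {H, P} is placed on a lattice and the score is the number of H-H pairs that are lattice neighbors but not consecutive in the string. The sketch below evaluates a given fold on the 2D square lattice; the move encoding (`U`, `D`, `L`, `R`) and the function name are assumptions made for this example.

```python
def hp_contacts(sequence, moves):
    """Number of H-H contacts of a self-avoiding fold on the 2D square lattice.

    sequence: string over 'H' and 'P'.
    moves: string over 'U', 'D', 'L', 'R' with len(sequence) - 1 characters,
           giving the direction from each residue to the next.
    """
    steps = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly len(sequence) - 1 moves")
    positions = [(0, 0)]
    for move in moves:
        dx, dy = steps[move]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    if len(set(positions)) != len(positions):
        raise ValueError("fold is not self-avoiding")
    index = {pos: i for i, pos in enumerate(positions)}
    count = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                count += 1
    return count
```

Evaluating one fold is easy; the hardness and approximation results concern finding a fold that maximizes this count.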

  • Fold recognition: the number of distinct folds is estimated to be about 1000 (Chothia, 1992)

  • 3D-1D profiles (Bowie et al., 1991)
    PSI-BLAST, profile-profile alignment, etc.
    Frozen approximation, double dynamic programming
    (Lathrop & Smith, 1996), (Xu et al., 2000), (Xu et al., 2003)

  • ()

  • (DP)

  • fd(X,Y)

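  • MAX CUT: given a graph G(V,E), find a subset U of V that maximizes the number of edges between U and V-U. Reduction from MAX CUT, using the sequence ALALAL…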
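For reference, the MAX CUT objective can be written down directly. The sketch below evaluates the size of a cut and finds a maximum cut by exhaustive search, which takes exponential time and is meant only for tiny example graphs; it does not implement the reduction itself, and the function names are illustrative.

```python
from itertools import combinations


def cut_size(edges, u):
    """Number of edges with exactly one endpoint in the vertex set u."""
    return sum(1 for a, b in edges if (a in u) != (b in u))


def max_cut(vertices, edges):
    """Try every subset U of the vertices; return (best U, its cut size)."""
    vertices = list(vertices)
    best_u, best = set(), 0
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            size = cut_size(edges, set(subset))
            if size > best:
                best_u, best = set(subset), size
    return best_u, best
```

On a triangle no cut can separate all three edges, while on a 4-cycle taking alternate vertices cuts every edge.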

  • RAPTOR (1): Ming Li et al. (Xu et al., 2003); CASP5, CASP6

  • RAPTOR (2): g(i,j,l,k): score when i is aligned to l and j is aligned to k (interaction between i and j)

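  • RAPTOR (3): variables and sets
    y_(i,l)(j,k): i is aligned to l and j is aligned to k
    x_i,l: i is aligned to l
    D[i]: positions to which i can be aligned
    R[i,j,l]: positions to which j can be aligned when i is aligned to l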

  • CASP (Critical Assessment of Techniques for Protein Structure Prediction)

  • CASP: CASP1 (1994), CASP2 (1996), CASP3 (1998), CASP4 (2000), CASP5 (2002), CASP6 (2004); CAFASP (1998, 2000, 2002, 2004); http://predictioncenter.llnl.gov/; published in Proteins (E.E. Lattman et al. 2003)

  • T. Akutsu, IEICE Trans. Information and Systems, E79-D, 1629-1636, 1996.
    T. Akutsu & S. Miyano, Theoretical Computer Science, 210, 261-275, 1999.
    T. Akutsu & M. M. Halldórsson, Theoretical Computer Science, 233, 33-55, 2000.
    B. Berger & F.T. Leighton, J. Comp. Biol., 5, 27-40, 1998.
    J.U. Bowie et al., Science, 253, 164-179, 1991.
    A. Caprara et al., J. Comp. Biol., 11, 27-52, 2004.
    P. Crescenzi et al., J. Comp. Biol., 5, 423-466, 1998.
    H. Daiyasu & H. Toh, J. Mol. Evol., 5, 433-445, 2000.
    M. Gerstein & M. Levitt, Protein Sci., 7, 445-456, 1998.
    J-F. Gibrat et al., Curr. Opin. Struct. Biol., 6, 377-385, 1996.
    D. Goldman et al., Proc. IEEE Symp. Found. Comp. Sci. (FOCS), 512-522, 1999.
    W.E. Hart & S. Istrail, Proc. 27th ACM Symp. Theory of Computing, 157-168, 1995.
    L. Holm & C. Sander, J. Mol. Biol., 233, 123-138, 1993.
    R.H. Lathrop & T.F. Smith, J. Mol. Biol., 255, 641-665, 1996.
    E.E. Lattman et al., Proteins: Structure, Function, and Genetics, 53(S6), 2003.
    G-H. Lin et al., J. Comput. Syst. Sci., 65, 465-480, 2002.
    A. Newman, Proc. 13th ACM-SIAM Symp. Disc. Alg., 876-884, 2002.
    I.N. Shindyalov & P.E. Bourne, Protein Engineering, 11, 739-747, 1998.
    W.R. Taylor & C.A. Orengo, J. Mol. Biol., 208, 1-22, 1989.
    J. Xu et al., J. Bioinform. Comput. Biol., 1, 95-117, 2003.
    Y. Xu et al., J. Comp. Biol., 5, 597-614, 1998.