Clustering the Temporal Sequences of 3D Protein Structure
Mayumi Kamada+*, Sachi Kimura, Mikito Toda‡, Masami Takata+, Kazuki Joe+
+: Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women’s University
‡: Department of Physics, Nara Women’s University

  • Outline

    Motivation; Flexibility Docking; Feature Extraction Using Motion; Analysis; Conclusions and Future Work

  • Motivation

    Proteins are biological molecules. In docking, a protein transforms itself and combines with other materials.

    Predicting docking leads to predicting the resulting functions.

  • Existing Docking Simulation

    Diagram: structure A and structure B from the PDB* (rigid structures) enter the docking simulation, which outputs the predicted structures from docking.

    * Protein Data Bank

    Proteins fluctuate in living cells, so rigid structures give low prediction accuracy.

    Goal: a docking simulation that considers fluctuations.

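  • Flexibility Docking

    Diagram: structure A and structure B from the PDB pass through flexibility handling and then the docking simulation, which outputs the predicted structures from docking.

    Flexibility handling: considers the fluctuation of proteins in living cells by extracting fluctuated structures.

    The docking simulation then takes the structural fluctuation of proteins into account.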

  • Flexibility Handling

    Diagram: MD produces output files, and a filter reduces them to representative structures.

    Molecular dynamics (MD) simulation: simulation of the motion of molecules in a polyatomic system.

    Filtering: selection of representative structures from similar structures.

  • Filters Using RMSD

    RMSD (root mean square deviation): a comparison of the similarity of two structures.

    Two filtering algorithms are proposed: the maximum RMSD selection filter and the below RMSD 1 deletion filter.

    Result: useful for the heat fluctuation condition.

    Limitation of RMSD: it unifies topology information into one value, so information is lost.

    Approach: feature extraction focusing on protein motion, not structure.
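The RMSD comparison named on this slide can be sketched as follows. This is a minimal illustration, not the filters from the slides: the function name `rmsd` is hypothetical, and the sketch assumes the two structures have the same atoms in the same order and have already been superimposed (the slides do not say how the structures were aligned).

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two (N, 3) coordinate arrays.

    Assumption: same atom order in both arrays and structures already
    superimposed; no alignment is performed here.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("structures must have the same shape")
    # Squared displacement per atom, averaged over atoms, then square root.
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Toy example with two atoms; the second atom is displaced by 2 along z.
structure_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
structure_b = [[0.0, 0.0, 0.0], [1.0, 0.0, 2.0]]
value = rmsd(structure_a, structure_b)
```

Because the result is a single number per pair of structures, two quite different deformations can give the same RMSD, which is the loss of information the slide points to.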

  • Capture Protein Motion

    Pipeline: MD, then wavelet transform (feature extraction), then clustering (selection of representative motions).

    Continuous wavelet transform: Morlet wavelet

    Clustering algorithm: affinity propagation

    The frequency may change momentarily!
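A continuous wavelet transform with a Morlet wavelet can be sketched directly with NumPy. The slides do not give the wavelet parameters or the software used, so everything here is an assumption for illustration: the function names `morlet` and `cwt_morlet`, the center frequency `w0 = 6`, the choice of scales, and the synthetic sine signal.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet wavelet at unit scale (small correction term omitted)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt_morlet(x, scales, dt=1.0, w0=6.0):
    """Continuous wavelet transform of a 1-D signal by direct convolution.

    Returns a complex array of shape (len(scales), len(x)).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    t = (np.arange(n) - n // 2) * dt          # time axis centered on zero
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        psi = morlet(t / s, w0) / np.sqrt(s)   # scaled, normalized wavelet
        # Correlation with the conjugate wavelet, written as a convolution.
        out[i] = np.convolve(x, np.conj(psi)[::-1], mode="same") * dt
    return out

# Synthetic signal: a sine with a period of 20 samples.
time = np.arange(400)
signal = np.sin(2 * np.pi * time / 20)
scales = [5, 10, 20, 40]
coeffs = cwt_morlet(signal, scales)
power = np.abs(coeffs) ** 2                    # time-frequency power map
```

Unlike a single Fourier transform of the whole trajectory, each row of `power` is still a function of time, so a frequency that appears only briefly remains visible.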

  • Target Protein

    Protein: 1TIB (residue length: 269)

    MD simulation: software AMBER; simulation run time 2 ns; result data files: 200

    Data: space coordinates of Cα atoms

  • Singular Value Decomposition (SVD)

    Definition: A = U Σ V*, where Σ is the diagonal matrix of singular values.

    Unitary matrix U: left-singular vectors (spatial motion)

    Unitary matrix V: right-singular vectors (frequency fluctuation)

    Matrix size of A: 807 × 199
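The decomposition can be sketched with `numpy.linalg.svd`. The matrix below is a random placeholder with the 807 × 199 shape stated on the slide (807 = 3 coordinates × 269 residues); it is not the authors' data, and the slides do not say which software computed the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((807, 199))    # placeholder for the real data matrix

# Thin SVD: A = U @ diag(s) @ Vh, singular values sorted in descending order.
U, s, Vh = np.linalg.svd(A, full_matrices=False)

left_vectors = U          # columns: left-singular vectors, shape (807, 199)
right_vectors = Vh.T      # columns: right-singular vectors, shape (199, 199)

# Check that the factors reproduce A up to rounding error.
max_abs_error = np.max(np.abs(U @ np.diag(s) @ Vh - A))
```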

  • Verification of Reproducibility

    Singular values and principal components

    Left singular vectors (spatial motion); right singular vectors (frequency fluctuation)

  • Reproducibility

    Using the eight principal components, the motion expressed by all 199 components can be reproduced.

    The reconstruction almost matches the original.
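Reproducing the motion from a few principal components corresponds to a truncated SVD reconstruction. The sketch below uses a synthetic matrix of exact rank 8 so that eight components suffice by construction; the function names and the error measure (relative Frobenius norm) are assumptions, since the slides compare the motions visually.

```python
import numpy as np

def truncated_reconstruction(A, k):
    """Rebuild A from its k largest singular values and vectors."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

def relative_error(A, A_k):
    """Frobenius-norm error of the reconstruction relative to A."""
    return float(np.linalg.norm(A - A_k) / np.linalg.norm(A))

# Synthetic 807 x 199 matrix of rank 8 (not the authors' data).
rng = np.random.default_rng(1)
B = rng.standard_normal((807, 8)) @ rng.standard_normal((8, 199))

error_k4 = relative_error(B, truncated_reconstruction(B, 4))
error_k8 = relative_error(B, truncated_reconstruction(B, 8))
```

On real trajectory data the error at k = 8 would be small rather than zero; how small counts as "almost matches" is a judgment the slides leave to the figure.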

  • Examination

    (1) Each of the singular values

    (2) The first singular value accounts for over about 30%. The original motion can be expressed by six singular values. The first singular value is useful.
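The share of each singular value, and the number of components needed to reach a given cumulative share, can be computed as below. Whether the slide's percentage refers to singular values or to squared singular values is not stated; this sketch assumes plain singular values, and the function names and toy values are hypothetical.

```python
import numpy as np

def singular_value_shares(s):
    """Fraction of the total contributed by each singular value.

    Assumption: shares of the singular values themselves, not their squares.
    """
    s = np.asarray(s, dtype=float)
    return s / s.sum()

def components_needed(s, threshold):
    """Smallest k whose cumulative share reaches the threshold."""
    cumulative = np.cumsum(singular_value_shares(s))
    return int(np.searchsorted(cumulative, threshold) + 1)

# Toy singular values (not from the slides), sorted in descending order.
toy_values = [4.0, 2.0, 1.0, 1.0]
shares = singular_value_shares(toy_values)
k = components_needed(toy_values, 0.8)
```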

  • Clustering Analysis

    Focus on the first principal component.

    Definition: similarities and preference

    Clustering is performed using these values.

  • Similarities (1)

    For left singular vectors: the difference in spatial direction is measured with inner products.

    Similarity : C
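An inner-product similarity between direction vectors can be sketched as a matrix of normalized dot products. The exact formula on the slide did not survive extraction, so this is an assumed form: the function name, the normalization, and the optional `sign_invariant` flag (a singular vector and its negation describe the same direction) are illustrative choices.

```python
import numpy as np

def inner_product_similarity(vectors, sign_invariant=False):
    """Pairwise normalized inner products between the rows of `vectors`.

    Assumption: each row is one left singular vector (one time window).
    With sign_invariant=True the absolute value is taken, so u and -u
    count as the same direction.
    """
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    S = V @ V.T
    return np.abs(S) if sign_invariant else S

# Three toy direction vectors: x, y, and minus x.
toy = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]]
similarity = inner_product_similarity(toy)
similarity_abs = inner_product_similarity(toy, sign_invariant=True)
```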

  • Similarities (2)

    For right singular vectors: the difference between spectrum distributions is measured with the Hellinger distance.

    Similarity:
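The Hellinger distance between two discrete distributions can be sketched as below. How the slides turn a right singular vector into a spectrum distribution, and how the distance becomes a similarity, is not recoverable from the text; normalizing non-negative weights to sum to one and using the negated distance as the similarity are assumptions, as is the function name.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two non-negative weight vectors.

    Both inputs are normalized to sum to 1 first. The result lies in
    [0, 1]: 0 for identical distributions, 1 for disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

# Assumed conversion to a similarity: larger means more alike.
spectrum_a = [0.2, 0.8]
spectrum_b = [0.5, 0.5]
similarity_ab = -hellinger(spectrum_a, spectrum_b)
```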

  • Clustering Method

    Affinity propagation (AP): Brendan J. Frey and Delbert Dueck, "Clustering by Passing Messages Between Data Points," Science 315, 972-976 (2007).

    Obtains exemplars: cluster centers

    Preference: for left singular vectors, the average of similarities; for right singular vectors, minimum of similarities, maximum of similarities, minimum
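The message-passing updates of affinity propagation (responsibilities and availabilities, following Frey and Dueck) can be sketched in plain NumPy. This is a compact illustration rather than the authors' code: the function name, the damping of 0.5, the fixed 200 iterations, and the one-dimensional toy data are assumptions. The preference is set to the average of the similarities, the choice the slide names for left singular vectors.

```python
import numpy as np

def affinity_propagation(S, preference, damping=0.5, n_iter=200):
    """Cluster from a similarity matrix; returns (exemplars, labels).

    labels[i] is the index of the exemplar chosen for point i.
    """
    S = np.array(S, dtype=float)            # copy; caller's matrix untouched
    n = S.shape[0]
    np.fill_diagonal(S, preference)          # self-similarity = preference
    R = np.zeros((n, n))                     # responsibilities r(i, k)
    A = np.zeros((n, n))                     # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()       # a(k,k) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:                  # did not settle on any exemplar
        return exemplars, np.full(n, -1)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy data: two well-separated groups on a line (not the authors' data).
points = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
S = -(points[:, None] - points[None, :]) ** 2
preference = S[~np.eye(len(points), dtype=bool)].mean()
exemplars, labels = affinity_propagation(S, preference)
```

A lower preference yields fewer exemplars and a higher one yields more, which is why the slide specifies the preference separately for each kind of vector.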

  • Similarities between Left Singular Vectors

  • Clustering of Left Singular Vectors

  • Similarities between Right Singular Vectors

  • Clustering of Right Singular Vectors

  • Discussions

    Each of the motions:

    Spatial motion: repetition of several similar spatial motions over time.

    Frequency fluctuation: repetition of similar frequency patterns over time.

    Relationship: characteristic frequency fluctuation and group transition in spatial motion.

  • Conclusions and Future Work

    Flexibility docking. Flexibility handling: MD and filter.

    Feature extraction based on motion: wavelet analysis.

    Analysis of motions: clustering.

    Future work: collective motion; the relationship; performing the docking simulation.
