Aligning Kinases: Applying MSA Analysis to the CDK family
Potential Uses of A Multiple Sequence Alignment?
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
      * : .* . :
Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques • Extrapolation • Phylogeny • Motifs/Patterns • Structure Prediction • Profiles
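The "*" marks in the conservation line flag columns where every sequence carries the same residue. A minimal Python sketch of that check, applied to the first block of the alignment above, could look like the following. The function name `identity_line` is hypothetical, and the sketch only reproduces the "*" marks; the ":" and "." marks depend on residue similarity groups that are not modelled here.

```python
# First block of the four-sequence alignment shown on the slide.
ALIGNMENT = {
    "chite": "---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD",
    "wheat": "--DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE",
    "trybr": "KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP",
    "mouse": "-----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP",
}


def identity_line(sequences):
    """Return a string with '*' under fully identical, gap-free columns.

    Hypothetical helper: similarity marks (':' and '.') are not computed.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*sequences):
        residues = set(column)
        identical = len(residues) == 1 and "-" not in residues
        marks.append("*" if identical else " ")
    return "".join(marks)


IDENTITY = identity_line(list(ALIGNMENT.values()))
IDENTICAL_COLUMNS = [i for i, mark in enumerate(IDENTITY) if mark == "*"]

for name, seq in ALIGNMENT.items():
    print(name, seq)
print(" " * 5, IDENTITY)
```

`IDENTICAL_COLUMNS` holds the 0-based indices of the identical columns, which is a convenient starting point for motif or profile work.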
1 Organizing a Family Gathering: The CDK example
Choosing the Right Sequences • Swiss-Prot • Literature • Other Databases
Organizing the Data [Diagram labels: Public Data, Automatic, SRS, IGS Data, CDK Genecard, Manual, Aventis]
Accessing the Data: The Fischer Server • Fischer will Contain • A collection of Flat files • A secure SRS server • File Formats • The server is a Technology Pipeline • Can be adapted in real time • Can be Transferred
Our CDK Data • CDKs and CDK-like • Protein Information • Functional Features • Structural Information • Genomic Information • Genes • Variant • SNPs
Our MSA dataset • 29 amino acid sequences (CDK and Aurora families, derived from primary transcripts) • 2 isoforms of a CDK member • 4 PDB structures: • 1MUO (Aurora A) • 1BLX (CDK6) • 1B38 (CDK2) • 1H4L (CDK5) • T-Coffee release 1.78, integrating the structural information contained in the PDB files
2 Aligning The Sequences
Building A Multiple Sequence Alignment • ClustalW • T-Coffee • Muscle • Hand Editing • Combination • Comparison
Using Structural Information: 3D-Coffee • Seq vs Seq: Local, Global • Seq vs Struct: Thread • Struct vs Struct: Superpose
Accessing the Methods: Fischer • Public 3D-Coffee server • igs-server.cnrs-mrs.fr/TCoffee/ • Fischer • Latest version of T-Coffee • Customised parameters • Cocktails of MSA methods
Feature Dressing • -25 Binding site • -20 Phospho • -40 nsSNP • -50 Splice Site • … … … … • ESPript
T-Coffee CORE Evaluation [Plot: specificity and sensitivity against the CORE index]
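To make the specificity and sensitivity curves concrete, here is a minimal Python sketch of how both values could be computed at one CORE index cutoff. Everything in it is an assumption for illustration: the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)), the 0 to 9 integer CORE scale, the function name, and the toy data, which is not taken from the CDK alignment.

```python
def sensitivity_specificity(core_scores, is_correct, threshold):
    """Classify residues with CORE index >= threshold as 'predicted correct'
    and compare that prediction with a reference labelling.

    Assumes the textbook definitions of sensitivity and specificity; the
    slide itself does not spell out which definitions were used.
    """
    tp = fp = tn = fn = 0
    for score, correct in zip(core_scores, is_correct):
        predicted = score >= threshold
        if predicted and correct:
            tp += 1
        elif predicted and not correct:
            fp += 1
        elif not predicted and correct:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data (hypothetical): per-residue CORE index and whether the residue
# is correctly aligned according to some reference alignment.
TOY_SCORES = [9, 8, 7, 5, 3, 2, 1, 0]
TOY_CORRECT = [True, True, True, False, True, False, False, False]

RESULT = sensitivity_specificity(TOY_SCORES, TOY_CORRECT, threshold=5)
print(RESULT)
```

Sweeping `threshold` over the whole CORE range and plotting both values gives a curve of the kind shown on this slide.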
Features mapping on multiple alignment: T-Coffee vs ClustalW [Figure labels: ATP binding site, Glycine loop, Non-synonymous SNP]
Structure-Based Evaluation: APDB • Include Sequences with Known Structures • Do Not Use Structural Information: Score 1 • Use Structural Information: Score 2 • If Score 1 ~ Score 2 • Structural Information does not help much • The alignment is of reasonable quality
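The decision rule on this slide can be written down as a small Python sketch. The function name, the score values and the tolerance used to decide that "Score 1 ~ Score 2" are all hypothetical; the slide does not say how close the two APDB scores must be, so the tolerance has to be chosen by the user.

```python
def interpret_apdb(score_without_structure, score_with_structure, tolerance=2.0):
    """Compare two APDB scores for the same sequence set.

    score_without_structure: Score 1 (alignment built from sequences only)
    score_with_structure:    Score 2 (alignment built with structures)
    tolerance: assumed cutoff, in the same unit as the scores, below which
               the two scores are treated as equivalent.
    """
    difference = score_with_structure - score_without_structure
    if abs(difference) <= tolerance:
        # Score 1 ~ Score 2: structure adds little, and the sequence-only
        # alignment is of reasonable quality.
        return "similar"
    return "structure helps" if difference > 0 else "structure hurts"


print(interpret_apdb(72.0, 73.5))   # scores within the tolerance
print(interpret_apdb(55.0, 73.5))   # Score 2 clearly higher
```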
Evaluating a Multiple Sequence Alignment • T-Coffee CORE index • Feature Based Library • APDB
Manipulating and Comparing Alignments • Reformatting/Processing • seq_reformat • extract_from_pdb • Coloring • seq_reformat • ESPript • Comparing • aln_compare
5 Thinking Large ????
T-Coffee_dpa • T-Coffee is limited to a small number of sequences • T-Coffee_dpa: Double Progressive Algorithm • Able to handle large datasets • 1000 sequences and more • Able to use structural information
1 Exploring The Alignment
Exploring The Alignment • CDK T-loop (orange) and Aurora activation loop • CDK signature • Substrate recognition motif
2 Using The Alignment: Does my Sequence Make Sense?
Identifying Abnormalities within an MSA • Insertion within the nucleotide-binding site…
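One simple way to flag this kind of abnormality is to look for runs of columns where a single sequence has residues and all the others have gaps. The Python sketch below illustrates the idea; the function name, the `min_length` cutoff and the three toy sequences are hypothetical and are not taken from the CDK dataset.

```python
def private_insertions(alignment, min_length=3):
    """Return (name, start, end) for each run of at least min_length
    consecutive columns in which only one sequence has a residue.

    Coordinates are 0-based alignment columns, end exclusive.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    hits = []
    run_owner, run_start = None, None
    # Iterate one step past the end so a run touching the last column closes.
    for col in range(length + 1):
        owner = None
        if col < length:
            holders = [n for n in names if alignment[n][col] != "-"]
            if len(holders) == 1:
                owner = holders[0]
        if owner != run_owner:
            if run_owner is not None and col - run_start >= min_length:
                hits.append((run_owner, run_start, col))
            run_owner, run_start = owner, col
    return hits


# Toy alignment (hypothetical): seqB carries a four-residue insertion.
TOY_ALIGNMENT = {
    "seqA": "MKV----LGE",
    "seqB": "MKVAPQRLGE",
    "seqC": "MRV----LGD",
}

INSERTIONS = private_insertions(TOY_ALIGNMENT)
print(INSERTIONS)
```

A hit that falls inside an annotated functional region, such as a nucleotide-binding site, is a good candidate for manual inspection of the underlying sequence.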
Identifying Abnormalities within an MSA • Activation loop (orange)
Identifying Abnormalities within an MSA • Retinoblastoma
2 Using The Alignment: Analysing the Structure with The Alignment
3 Using The Alignment: Spotting differences
4 Clustering and Correlating
Function Trees Vs Lead Trees • 1-Select Functionally Important Positions • 2-Make a tree based on these positions • 3-Compare the tree with the lead tree • PROBLEMS: • Choosing the right positions • Describing the Leads with the right determinants
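Steps 1 and 2 above can be sketched in Python as follows: project the alignment onto the selected positions, then compute the pairwise distances that a tree-building method would take as input. This is only an illustration under stated assumptions. The sequences, the chosen positions and the function names are hypothetical, the distance is a plain fraction of mismatches (p-distance) rather than whatever measure was actually used, and tree construction and the comparison with the lead tree (step 3) are not covered.

```python
from itertools import combinations


def project(alignment, positions):
    """Keep only the selected alignment columns (0-based) for each sequence."""
    return {
        name: "".join(seq[i] for i in positions)
        for name, seq in alignment.items()
    }


def p_distance(a, b):
    """Fraction of mismatching positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def pairwise_distances(sequences):
    """Distance for every unordered pair of sequence names."""
    return {
        (m, n): p_distance(sequences[m], sequences[n])
        for m, n in combinations(sorted(sequences), 2)
    }


# Toy aligned fragments and functional positions (all hypothetical).
TOY_ALIGNMENT = {
    "kinA": "GEGTYGVVYK",
    "kinB": "GEGAYGKVYK",
    "kinC": "GSGSFGTVFR",
}
FUNCTIONAL_POSITIONS = [1, 3, 6]

PROJECTED = project(TOY_ALIGNMENT, FUNCTIONAL_POSITIONS)
FUNCTION_DISTANCES = pairwise_distances(PROJECTED)
print(PROJECTED)
print(FUNCTION_DISTANCES)
```

The resulting distance dictionary can be handed to any clustering or tree-building routine. Building the same kind of matrix from lead descriptors gives the second tree to compare against.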