Identifying property based sequence motifs in protein families and superfamilies: application to DNase-1 related endonucleases
Venkatarajan S. Mathura et al.
Presented by Mr. Hat
Motivation
• “Statistically derived matrices based on allowed substitution of amino acids are not designed to detect conservation of physical–chemical properties”
• This applies to HMMER, PSI-BLAST, PHI-BLAST and RPS-BLAST, to name a few
• MASIA could complement these existing gene mining tools
Methods
• Created quantitative descriptors E1 – E5 that describe amino acid properties and have a physical interpretation
• Derived from a comprehensive list of 237 physical-chemical properties (PCPs)
• Venkatarajan and Braun 2001
• Measured conservation by the standard deviation and relative entropy of the values E1 – E5
• Defined a minimum length cutoff and a maximum gap threshold
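The two conservation measures named above can be illustrated on a single alignment column. This is only a minimal sketch under stated assumptions: `TOY_E1` holds made-up placeholder values rather than the published E1 descriptor, and the binning, pseudocount and choice of background in `relative_entropy` are illustrative guesses. The authors' exact definition and sign convention may differ.

```python
import math
from statistics import pstdev

# Placeholder values for ONE descriptor (hypothetical, NOT the published E1).
TOY_E1 = {"L": -1.0, "I": -0.9, "V": -0.8, "A": 0.0,
          "G": 0.1, "K": 0.8, "E": 0.9, "D": 1.0}


def column_std(column, descriptor):
    """Population standard deviation of descriptor values in one column."""
    return pstdev([descriptor[aa] for aa in column])


def histogram(values, n_bins, lo, hi, pseudocount=0.0):
    """Normalized histogram of values over [lo, hi] with n_bins equal bins."""
    counts = [pseudocount] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, n_bins - 1))] += 1  # clamp to valid bins
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(column, descriptor, n_bins=4, lo=-1.0, hi=1.0):
    """Kullback-Leibler divergence (bits) of the column's value distribution
    from a background built from all residues in the descriptor table
    (assumed background; one pseudocount per bin keeps it non-zero)."""
    observed = histogram([descriptor[aa] for aa in column], n_bins, lo, hi)
    background = histogram(list(descriptor.values()), n_bins, lo, hi,
                           pseudocount=1.0)
    return sum(p * math.log2(p / q)
               for p, q in zip(observed, background) if p > 0)


for col in ("LLLL", "LIVL", "LDKA"):
    print(col, round(column_std(col, TOY_E1), 3),
          round(relative_entropy(col, TOY_E1), 3))
```

In this toy setup the column `LIVL` mixes three different residues yet scores like the identical column `LLLL`, because all of its values fall in the same property bin. That is the kind of property conservation a substitution matrix is not designed to capture.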
Experiment
• Used APE (apurinic/apyrimidinic endonuclease) family sequences from 42 organisms
• Both prokaryotes and eukaryotes
• Used taxonomic classification to remove much of the redundant data
• Each motif is represented as a “profile”
• Consisting of average values, standard deviations and relative entropies for each vector E1 – E5
• MASIA MOTIF MAKER
Experiment (cont.)
• Used these profiles to search the ASTRAL40 database
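A profile search of this kind can be pictured as sliding the motif profile along each database sequence and scoring every window. The sketch below is an assumption-laden illustration, not the MASIA scoring scheme: `TOY` uses two made-up descriptor dimensions instead of E1 – E5, the profile keeps only means and standard deviations, and `score_window` (a negative sum of squared, standard-deviation-scaled deviations) and its `floor` parameter are hypothetical choices.

```python
from statistics import mean, pstdev

# Two placeholder descriptor dimensions per residue (hypothetical values,
# standing in for the five published vectors E1-E5).
TOY = {
    "L": (-1.0, 0.2), "I": (-0.9, 0.3), "V": (-0.8, 0.1),
    "D": (1.0, -0.5), "E": (0.9, -0.3), "K": (0.8, 0.9),
    "A": (0.0, 0.0), "G": (0.1, -0.8),
}
N_DIMS = 2


def build_profile(block):
    """Per position and per dimension, store (mean, std) over an
    ungapped aligned motif block given as equal-length strings."""
    profile = []
    for column in zip(*block):
        per_dim = []
        for dim in range(N_DIMS):
            values = [TOY[aa][dim] for aa in column]
            per_dim.append((mean(values), pstdev(values)))
        profile.append(per_dim)
    return profile


def score_window(window, profile, floor=0.1):
    """Assumed score: negative sum of squared deviations from the profile
    mean, scaled by (std + floor) so conserved positions weigh more."""
    total = 0.0
    for aa, per_dim in zip(window, profile):
        for dim, (mu, sigma) in enumerate(per_dim):
            z = (TOY[aa][dim] - mu) / (sigma + floor)
            total -= z * z
    return total


def best_hit(sequence, profile):
    """Return (score, start) of the best-scoring window, or None if the
    sequence is shorter than the profile."""
    width = len(profile)
    scored = [(score_window(sequence[i:i + width], profile), i)
              for i in range(len(sequence) - width + 1)]
    return max(scored) if scored else None


profile = build_profile(["LDK", "IEK", "VDK"])
print(best_hit("GAGAVEKGA", profile))
```

Here the window `VEK` does not occur in the motif block, but it is property-wise close to every block row, so it is the window returned as the best hit.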
Example score matrix for motif 2 and its corresponding E1 – E5 values
• * means low relative entropy
• + means a significant component
• - means not a significant component
Results
• The MASIA tool found all DNase-like superfamily members in ASTRAL40
• But this alone doesn’t show specificity?
• PSI-BLAST, run with default parameters
• Used all 42 sequences to seed PSI-BLAST
• Performed both local and NCBI PSI-BLAST searches
• Searched the “non-redundant sequence database” (NR or NT?)
• Found no DNase-I or IPP sequences after several iterations
Results (cont.)
• PSI-BLAST (cont.)
• The E-value threshold was raised to 0.1
• DNase-I was found after four iterations, but the search also pulled in 500 other unrelated sequences
• Failed to find DNase-I in the ASTRAL40 database
How ’bout them PCP motifs and MASIA! This could possibly improve my gene hunting capabilities! Now if I just had fingers to type! By the way, where are those fat BBS mice? I’m getting hungry!