1. Protein Clustering to Assemble Families of Homeomorphic Proteins
Chris Elsik
2. Outline
Using evolution to infer protein function
The problem of automatically assembling families of homeomorphic proteins (proteins with identical domain organization)
Delineating protein domain boundaries
3. Evolution Allows Us to Infer Function
The most powerful method for inferring the function of a gene or protein is a similarity search against a sequence database.
Our ability to characterize biological properties of a protein using sequence data alone stems from properties conserved through evolutionary time.
Homologous (evolutionarily related) proteins always share a common 3-dimensional folding structure.
They often contain common active sites or binding domains.
They frequently share common functions.
Predictions based on proteins that are similar but not homologous are much less reliable.
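To make the idea of a similarity search concrete, here is a minimal sketch that ranks database sequences by Smith-Waterman local alignment score. It is a toy under stated assumptions: the function names, the match/mismatch/gap values, and the example sequences are all hypothetical. Real tools such as BLAST and FASTA use amino acid substitution matrices and report statistical significance, which a raw score like this one does not provide.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty).

    The scoring values are illustrative assumptions, not the parameters
    of any real search tool.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_database(query, database):
    """Return (name, score) pairs sorted from best to worst score."""
    scores = [(name, local_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scores, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    database = {"candidate_1": "GGMKTAYIAKGG", "candidate_2": "PPPPPPPP"}
    for name, score in rank_database("MKTAYIAK", database):
        print(name, score)
```

A high score only indicates similarity. Whether it supports homology depends on its statistical significance, which this sketch does not estimate.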
4. Orthologs
Homologs = genes that are evolutionarily related
There are two kinds of homologs:
Orthologs = genes in different species that have diverged from a common gene in an ancestral species.
Paralogs = genes that have diverged due to gene duplication.
Orthologs are more likely than paralogs to have conserved function.
Orthologs cannot be identified using BLAST or FASTA sequence comparison alone.
Reliable ortholog identification requires phylogenetic methods.
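One simple tree-based way to separate orthologs from paralogs is the species-overlap heuristic, sketched below. It assumes a rooted, binary gene tree has already been inferred, and it is only one of several phylogenetic approaches, not necessarily the one used in this work. The nested-tuple tree encoding, the "species|gene" leaf labels, and the function names are hypothetical choices made for illustration. An internal node whose two subtrees share a species is treated as a duplication; otherwise it is treated as a speciation.

```python
from itertools import product


def leaves(node):
    """Return all leaf labels under a node (a leaf is a 'species|gene' string)."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def species_of(leaf):
    """Extract the species part of a hypothetical 'species|gene' label."""
    return leaf.split("|")[0]


def classify_pairs(node, out=None):
    """Label every pair of genes by the event at their last common ancestor.

    Species-overlap heuristic: if the two child subtrees share a species,
    the node is called a duplication (pairs across it are paralogs);
    otherwise it is called a speciation (pairs across it are orthologs).
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = ({species_of(x) for x in left_leaves}
              & {species_of(x) for x in right_leaves})
    relation = "paralogs" if shared else "orthologs"
    for a, b in product(left_leaves, right_leaves):
        out[frozenset((a, b))] = relation
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


if __name__ == "__main__":
    # Hypothetical gene tree: a duplication (A vs. B) that predates the
    # human/mouse split.
    tree = (("human|A", "mouse|A"), ("human|B", "mouse|B"))
    for pair, relation in classify_pairs(tree).items():
        print(sorted(pair), relation)
```

In this example human|A and mouse|A are called orthologs, while human|A and mouse|B are called paralogs even though they come from different species. That is the kind of case where relying on the best cross-species hit alone can mislead, for instance if one copy has been lost or is missing from the database.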