Bioinformatics

29th Feb, 2012. Bioinformatics. Ayesha Masrur Khan. Protein Family and Domains. Once a protein sequence is obtained, there are many questions that can be asked, such as: What is the protein’s overall identity? What putative functions does it have? What biological motifs are present?


Presentation Transcript


  1. 29th Feb, 2012 Bioinformatics Ayesha Masrur Khan

  2. Protein Family and Domains
     Once a protein sequence is obtained, there are many questions that can be asked, such as:
     - What is the protein’s overall identity?
     - What putative functions does it have?
     - What biological motifs are present?
     Different computational tools are needed to determine possible functional domains based on primary sequence data. Lec-5

  3. Protein Family and Domains (contd.)
     • Family and domain databases are therefore used to address questions such as ‘What domains are contained within this sequence?’ or ‘What family does this protein belong to?’
     But first: what are families and domains?

  4. Protein Family and Domains (contd.)
     Family: A family of proteins was originally defined by Dayhoff et al. (1978) as a group of sequences with similar functions that share more than 50% identity when aligned. Families are often also characterized by the presence of one or more domains with high sequence similarity.
     Domains: Traditionally known as structurally independent folding units, domains are conserved functional units that may contain one or more motifs.
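The 50% identity criterion can be illustrated with a minimal sketch. It assumes the two sequences are already aligned (equal length, gaps written as `-`) and that columns containing a gap are ignored when counting; that is one of several conventions in use, and the slide does not say which one Dayhoff's definition relies on. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def above_family_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the pair exceeds the identity threshold (50% in the definition above)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Identity alone is only part of the definition: the sequences are also expected to have similar functions, which this sketch does not check.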

  5. Protein Family and Domains (contd.)
     Motifs: These include both short stretches of fixed residue length that act as sites for post-translational modifications and longer sequences that form secondary structures for protein-DNA, protein-ion or protein-lipid interactions.

  6. Domain Example: Pyruvate kinase
     Quaternary structure: 4 subunits
     3 domains

  7. Zinc finger motif: A sequence motif
     Sequence motif: A particular amino-acid sequence that is characteristic of a specific biochemical function.
     Figures: three zinc fingers bound spirally in the major groove of a DNA molecule; the coordination of a zinc atom by characteristically spaced cysteine and histidine residues in a single zinc finger motif.
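Because a sequence motif of this kind is defined by the spacing of a few key residues, it can be searched for with a regular expression. The sketch below assumes a simplified C2H2-style spacing (Cys, 2 to 4 residues, Cys, 12 residues, His, 3 to 5 residues, His); the slide does not give the spacing, so both the pattern and the name `find_c2h2_like` are illustrative assumptions rather than a curated database pattern.

```python
import re

# Assumed, simplified spacing: C x(2,4) C x(12) H x(3,5) H
C2H2_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_like(sequence: str):
    """Return (start, end, matched_text) for each non-overlapping match."""
    return [
        (m.start(), m.end(), m.group())
        for m in C2H2_LIKE.finditer(sequence.upper())
    ]


# Made-up toy sequence containing one region with the assumed spacing.
toy = "MK" + "CPEC" + "GKSFSQKSNLQK" + "HQRTH" + "TG"
hits = find_c2h2_like(toy)
```

A match only says that the spacing is present; whether the region really folds into a zinc finger needs further evidence.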

  8. Other examples: structural motifs
     Another type is the functional motif, which is a sequence or structural motif that is always associated with a particular biochemical function.

  9. Protein families
     • Members of a protein family are related to one another by sequence similarity, domain composition, or structure.
     • These include proteins found across species (orthologs) or within the same species (paralogs).
     • Family descriptors are derived from MSAs (multiple sequence alignments), which enable us to define traits that encompass all member sequences.
     • Family descriptors have been based on sequence identity (>50% identical), common domains (e.g. catalytic binding domains, calcium-binding motifs, etc.), structure, or a combination of these characteristics.
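One simple way to see how a descriptor can be derived from an MSA is to summarize each alignment column by its most common residue and how often it occurs. This is only a sketch under stated assumptions: the MSA is given as equal-length strings, gaps are written as `-`, and the function names are hypothetical. Real family databases use richer descriptors than a per-column majority.

```python
from collections import Counter


def column_profile(msa):
    """For each alignment column, return (most common symbol, its frequency)."""
    if not msa:
        return []
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all rows of an MSA must have the same length")
    profile = []
    for column in zip(*msa):  # iterate over columns, not rows
        symbol, count = Counter(column).most_common(1)[0]
        profile.append((symbol, count / len(msa)))
    return profile


def conserved_positions(msa, min_fraction=1.0, gap="-"):
    """Indices of columns whose top residue (not a gap) reaches min_fraction."""
    return [
        i
        for i, (symbol, fraction) in enumerate(column_profile(msa))
        if symbol != gap and fraction >= min_fraction
    ]


# Toy alignment of three made-up sequences.
toy_msa = ["ACDE", "ACDF", "ACEE"]
fully_conserved = conserved_positions(toy_msa)
```

Columns that stay conserved across all members are candidates for the traits that "encompass all member sequences".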

  10. Protein Domains
      • Domains represent discrete stretches within the protein, unlike protein families, which are commonly defined over the full length of the sequence.
      • These units are conserved at the level of sequence and structure.
      • They can be described by:
          • combinations of short regions of highly conserved amino acids within a domain
          • all amino acids in the domain
          • structural features
      • Domain descriptions are developed in the same way as family descriptors.

  11. Family-Domain Databases
      • Because of the reuse of motifs and domains, similarities can be found within sequences that are otherwise unrelated evolutionarily.
      • Therefore, methods are needed to distinguish between similarities due to random variation and those of common origin or function.
      Family-domain databases provide the following benefits:
      • Increased sensitivity, i.e. true matches are detected through MSAs
      • Increased specificity, i.e. only related proteins are detected
      • Classification of protein sequences into appropriate families
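Sensitivity and specificity can be made concrete by comparing the hits a search reports against a known set of family members. The sketch below assumes the standard statistical definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)), which the slide does not spell out; the function name and the identifiers `p1` to `p10` are made up for illustration.

```python
def search_quality(reported, true_members, database):
    """Return (sensitivity, specificity) of a search over a labeled database."""
    reported = set(reported)
    true_members = set(true_members)
    database = set(database)

    true_pos = len(reported & true_members)    # family members that were found
    false_neg = len(true_members - reported)   # family members that were missed
    false_pos = len(reported - true_members)   # unrelated sequences reported
    true_neg = len(database - reported - true_members)  # unrelated, not reported

    sens = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    spec = true_neg / (true_neg + false_pos) if true_neg + false_pos else 0.0
    return sens, spec


# Toy example: 10 sequences, 4 true family members, 4 reported hits.
database = {f"p{i}" for i in range(1, 11)}
true_members = {"p1", "p2", "p3", "p4"}
reported = {"p1", "p2", "p3", "p9"}
sens, spec = search_quality(reported, true_members, database)
```

Here three of the four true members are found and one unrelated sequence is reported, so both measures fall short of 1.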

  12. Family-Domain Databases
      Some database references

  13. Searching sequence databases
      • Search methods perform a series of sequence alignments to determine degrees of similarity between sequences and then return a list of matched sequences to the user.
      • Alignment Algorithms
          • Manually, we examine two or more sequences for similar residue patterns, match up identical residues, decide qualitatively whether they are aligned well, and determine statistically how identical or similar the sequences are.
          • Automating this process requires a computer-based method to line sequences up against one another and a scoring method for evaluating the success of the alignment in terms of similarity or identity.
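The two ingredients named above, a method to line sequences up and a scoring method, can be sketched with a global alignment by dynamic programming (the Needleman-Wunsch approach). The slide does not name a specific algorithm, so this choice is an assumption, as are the scoring values (match +1, mismatch -1, gap -2) and the function names. Practical database search tools typically use faster, local-alignment heuristics and substitution matrices instead of a flat match/mismatch score.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of two sequences under a simple scoring scheme."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    return score[-1][-1]


def rank_hits(query, database):
    """Return database sequences ordered from best to worst alignment score."""
    return sorted(
        database,
        key=lambda seq: global_alignment_score(query, seq),
        reverse=True,
    )
```

For example, `rank_hits("ACDE", ["GGGG", "ACDE", "ACDF"])` places the identical sequence first and the unrelated one last, which mirrors the "list of matched sequences" a search method returns.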
