Basic terms:

Basic terms: • Similarity: a measurable quantity. • Applied to proteins using the concept of conservative substitutions. • Identity: expressed as a percentage. • Homology: a specific term indicating a relationship by evolution.
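Identity as a measurable percentage can be illustrated with a short Python sketch. It assumes two sequences that are already aligned, with "-" as the gap character, and it counts only columns without gaps in the denominator. That denominator is one common convention, not something the slides specify, and the function name is made up for this example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns containing a gap in either sequence are ignored.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)
```

Identity is a number that can be computed from any two aligned sequences; homology is a conclusion about ancestry and cannot be read off the percentage alone.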

Basic terms: • Orthologs: homologous sequences found in two or more species that have the same function (e.g. alpha-hemoglobin). • Paralogs: homologous sequences found in the same species that arose by gene duplication (e.g. alpha- and beta-hemoglobin).

Pairwise comparison • Dotplot • All against all comparison. • Every position is compared with every other position. • Nucleic acids and proteins have polarity. • Typically only one direction makes biological sense. • 5’ to 3’ or amino terminus to carboxyl terminus.

Simple plot • Window: size of sequence block used for comparison. In previous example: • window = 1 • Stringency = Number of matches required to score positive. In previous example: • stringency = 1 (required exact match)

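DotPlot • WINDOW = 4; STRINGENCY = 2 • Sequence: GATCGTACCATGGAATCGTCCAGATCA • The window GATC is compared at different registers; a register scores positive (+) when at least 2 of 4 positions match: • GATC + (4/4) • GATC - (0/4) • GATC - (0/4) • GATC + (2/4)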
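The window and stringency test on this slide can be sketched in Python. The function name and the choice of which offsets to print are assumptions made for illustration; the slide does not say which registers its four rows correspond to.

```python
def window_matches(probe: str, target: str, offset: int) -> int:
    """Count positions where `probe` matches `target` starting at `offset`."""
    segment = target[offset:offset + len(probe)]
    return sum(1 for p, t in zip(probe, segment) if p == t)


SEQ = "GATCGTACCATGGAATCGTCCAGATCA"
PROBE = "GATC"        # window = 4
STRINGENCY = 2        # matches required to score positive

# Slide the window along the first few registers and report each result.
for offset in range(5):
    n = window_matches(PROBE, SEQ, offset)
    mark = "+" if n >= STRINGENCY else "-"
    print(offset, SEQ[offset:offset + len(PROBE)], f"{mark} ({n}/{len(PROBE)})")
```

Raising the stringency to 4 would leave only exact copies of the window scoring positive.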

Dot Plot • Compare two sequences in every register. • Vary the window size and stringency depending on the sequences being compared. • For nucleotide sequences, typically start with window = 21; stringency = 14. • For protein sequences, start with a smaller window: window = 3; stringency = 1 or 2. • It is important to test different stringencies.
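A minimal sketch of an all-against-all dot plot with adjustable window and stringency follows. The function names and the text rendering are assumptions for illustration, not the output of any particular dot plot program.

```python
def dotplot(seq_a: str, seq_b: str, window: int = 1, stringency: int = 1):
    """Return (i, j) pairs where the windows starting at seq_a[i] and seq_b[j]
    share at least `stringency` matching positions."""
    if not 1 <= stringency <= window:
        raise ValueError("stringency must be between 1 and window")
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(1 for k in range(window) if seq_a[i + k] == seq_b[j + k])
            if matches >= stringency:
                dots.append((i, j))
    return dots


def render(dots, rows: int, cols: int) -> str:
    """Draw the dots as text: '*' for a positive register, '.' otherwise."""
    grid = [["." for _ in range(cols)] for _ in range(rows)]
    for i, j in dots:
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)
```

Running the same pair of sequences with several window and stringency settings and comparing the rendered plots is a simple way to test different stringencies.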

Intergenic comparison • Nucleotide sequence contains three domains. • 50-350: strong conservation • Indel places comparison out of register • 450-1300: slightly weaker conservation • 1300-2400: strong conservation

Scoring Alignments • Quality Score: • Score x for match, -y for mismatch; • Penalty for: • Creating Gap • Extending a gap

Scoring Alignments • Quality Score: • Quality = [10 × (matches)] + [-1 × (mismatches)] - [(Gap Creation Penalty × number of gaps) + (Gap Extension Penalty × total length of gaps)] • The scoring scheme incorporates an evolutionary model: • Matches are conserved. • Mismatches are divergences. • Gaps are more likely to disrupt function, hence a greater penalty than for a mismatch. • Introduction of a gap (indel) is penalized more than extension of a gap.
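The quality formula can be applied to an existing gapped alignment as in the sketch below. The match and mismatch values (10 and -1) come from the slide; the gap creation and gap extension defaults (50 and 3) are illustrative assumptions chosen only so that opening a gap costs more than extending one, and the function name is hypothetical.

```python
def quality_score(a: str, b: str, match: int = 10, mismatch: int = -1,
                  gap_creation: int = 50, gap_extension: int = 3) -> int:
    """Score two aligned sequences ('-' marks a gap) with the slide's formula."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = n_gaps = gap_length = 0
    open_side = None  # which sequence currently has an open gap: "a", "b" or None
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            raise ValueError("column with gaps in both sequences")
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if side != open_side:
                n_gaps += 1          # a new gap is created
            open_side = side
            gap_length += 1          # every gap column adds to total gap length
        else:
            open_side = None
            if x == y:
                matches += 1
            else:
                mismatches += 1
    return (match * matches + mismatch * mismatches
            - (gap_creation * n_gaps + gap_extension * gap_length))
```

With these assumed penalties, one gap of length 2 costs 56, while two separate gaps of length 1 cost 106, which reflects the model in which creating a gap is penalized more than extending one.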

Z Score (standardized score) • Z = (Score_alignment - Average Score_random) / Standard Deviation_random

Quality Score: Randomization • The program takes a sequence and randomizes it X times (X is selected by the user). • It determines the average quality score and standard deviation for the randomized sequences. • Compare the randomized scores with the quality score to help determine whether the alignment is potentially significant.
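The randomization test and the Z score can be sketched together. This is a simplified illustration: it shuffles single letters, and it uses an ungapped position-by-position score (10 per match, -1 per mismatch) as a stand-in for a full alignment program. The function names, the default of 100 shuffles and the seed parameter are all assumptions.

```python
import random
import statistics


def simple_score(a: str, b: str, match: int = 10, mismatch: int = -1) -> int:
    """Ungapped score: compare the sequences position by position."""
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def z_score(a: str, b: str, score=simple_score,
            n_shuffles: int = 100, seed: int = 0) -> float:
    """Z = (observed score - mean random score) / SD of random scores."""
    rng = random.Random(seed)          # seeded so the result is reproducible
    observed = score(a, b)
    letters = list(b)
    random_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)           # randomize the second sequence
        random_scores.append(score(a, "".join(letters)))
    mean = statistics.mean(random_scores)
    sd = statistics.stdev(random_scores)
    if sd == 0:
        raise ValueError("randomized scores do not vary; Z is undefined")
    return (observed - mean) / sd
```

A large positive Z means the real score sits far above what shuffled sequences of the same composition produce; it is a guide to potential significance, not proof of homology.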

Randomization • It has become clear that sequences appear to evolve in a “word”-like fashion. • Analogy: the 26 letters of the alphabet are combined to make words, and it is the words that actually communicate information. • Randomization should therefore occur at the level of short strings of nucleotides (2-4) rather than single nucleotides.
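Word-level randomization can be sketched by cutting the sequence into fixed-size blocks and shuffling the blocks instead of the individual letters. The fixed, non-overlapping blocks and the function name are assumptions; the slide only gives the word length range (2-4).

```python
import random


def shuffle_words(seq: str, word_size: int = 3, seed=None) -> str:
    """Shuffle a sequence in blocks of `word_size` letters."""
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    rng = random.Random(seed)
    # Non-overlapping words; the last one may be shorter than word_size.
    words = [seq[i:i + word_size] for i in range(0, len(seq), word_size)]
    rng.shuffle(words)
    return "".join(words)
```

Each word survives intact, so short-range composition is preserved while the long-range order is destroyed.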

Global Alignment • Global: compares all possible alignments of two sequences and presents the one with the greatest number of matches and the fewest gaps. • The alignment will “run” from one end of the longer sequence to the other end. • Best for closely related sequences. • Can miss short regions of strongly conserved sequence.
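Global alignment is usually computed by dynamic programming (the Needleman-Wunsch approach) instead of by listing every alignment. The slides do not name an algorithm, so the sketch below is a stand-in, and it uses a single linear gap penalty (assumed value -5) in place of the separate creation and extension penalties to stay short. It returns only the best score, not the alignment itself.

```python
def global_alignment_score(a: str, b: str, match: int = 10,
                           mismatch: int = -1, gap: int = -5) -> int:
    """Best end-to-end alignment score by dynamic programming (linear gaps)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are charged: the alignment must span both sequences.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal,
                              score[i - 1][j] + gap,   # gap in b
                              score[i][j - 1] + gap)   # gap in a
    return score[-1][-1]
```

Because every position of both sequences must be accounted for, a short conserved region embedded in unrelated sequence can be swamped by the mismatches and gaps around it.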

Local Alignment • Identifies segments of alignment with the highest possible score. • Aligns the sequences and extends aligned regions in both directions until the score falls to zero. • Best for comparing sequences whose relationship is unknown.
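A common way to find the highest-scoring segment is the Smith-Waterman dynamic programming scheme, in which a running score is never allowed to drop below zero. The slides do not name the algorithm, so this is a sketch of one standard technique, again with an assumed linear gap penalty of -5 and returning only the score.

```python
def local_alignment_score(a: str, b: str, match: int = 10,
                          mismatch: int = -1, gap: int = -5) -> int:
    """Score of the best-scoring local segment pair (linear gaps)."""
    best = 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Resetting to 0 ends a segment once its score would go negative.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best
```

For example, the shared GATC in TTTTGATCTTTT and CCCCGATCCCCC is found with its full score even though the flanking sequence is unrelated.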

[Figure: Global Alignment compared with Local Alignment]

Blast 2 • Basic Local Alignment Search Tool • E (expect) value: the number of hits expected by random chance in a database of the same size. • Larger numerical value = lower significance. • Example: HIV sequence

Both Global and Local alignment programs will (almost) always give a match. • It is important to determine whether the match is biologically relevant. • Not necessarily relevant: low-complexity regions. • Sequence repeats (glutamine runs) • Transmembrane regions (high in hydrophobic residues) • If working with coding regions, you are typically better off comparing protein sequences: they have greater information content.
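One simple way to flag low-complexity regions such as glutamine runs is to measure the Shannon entropy of the residues in a sliding window. This is a rough sketch and not the filter used by any particular alignment program; the window size (12) and entropy threshold (1.5 bits) are arbitrary assumptions, as are the function names.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Entropy in bits of the residue composition of `segment`."""
    total = len(segment)
    counts = Counter(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(seq: str, window: int = 12, threshold: float = 1.5):
    """Start positions of windows whose entropy falls below `threshold`."""
    return [i for i in range(len(seq) - window + 1)
            if shannon_entropy(seq[i:i + window]) < threshold]
```

A match that falls mostly inside flagged windows deserves extra scrutiny before it is treated as biologically relevant.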
