
MSc IAMZ - UAB - UPV 2007 - 2008

MSc IAMZ - UAB - UPV 2007 - 2008. Essential Bioinformatics for Animal Breeders Miguel Pérez-Enciso miguel.perez @ uab.es www.icrea.es. Outline. Why bioinformatics Complex and simple traits: Why statistics Quantitative Trait Locus (QTL) detection Microarrays. Ultimate goal.


Presentation Transcript


  1. MSc IAMZ - UAB - UPV 2007 - 2008 Essential Bioinformatics for Animal Breeders Miguel Pérez-Enciso miguel.perez @ uab.es www.icrea.es

  2. Outline • Why bioinformatics • Complex and simple traits: Why statistics • Quantitative Trait Locus (QTL) detection • Microarrays

  3. Ultimate goal Determine the genetic basis of 'complex' traits (there are many other applications of DNA markers)

  4. Ultimate goal Determine the genetic basis of 'complex' traits conservation traceability marker assisted selection ...

  5. Genetics and animal breeding have become a data-rich science, where the limiting step is already the analysis of the data rather than obtaining the data themselves. Three main streams of data: DNA sequence; DNA polymorphism (markers); expression data (functional genomics)

  6. Trait classification • 'Simple' or Mendelian traits • 'Complex' or quantitative traits This classification is rather artificial: there is a continuum rather than a clear-cut division

  7. 'Simple' traits in Animal Genetics • Double muscling in cattle • Halothane gene in pigs • RN mutation in pigs • Feather pecking in chicken • Most color mutations

  8. Double muscling in cattle Belgian blue Asturiana

  9. Performance Asturiana Valles

  10. Grobet et al. (1997) Nat Genet 17:71 • Double muscling results in an increase in the number of muscle fibres (hyperplasia) and fibre enlargement (hypertrophy), with lower fat and collagen. • Caused by a stop codon in myostatin. • Myostatin is a member of the transforming growth factor (TGF)-β superfamily that actively represses skeletal muscle growth. • ~ 10 mutations disrupt the gene; different breeds carry different mutations.

  11. Halothane gene in pigs • Pigs sensitive to halothane and to stress. • Higher percentage of lean. • Higher growth. • Higher mortality. • Lower meat quality (loss of water).

  12. Fujii et al. (1991) Science 253:448 • Non-synonymous mutation in Ryanodine Receptor 1 (Ryr1). • It changes a key amino acid, Arg -> Cys. • Ca release gate membrane protein.

  13. What are 'complex' (quantitative) traits? Sensitive to the environment Affected by several genes Traits showing a continuous distribution

  14. Most traits of interest are 'complex' • Milk and meat production • Litter size • Disease resistance • ...

  15. What are the consequences of complexity? In a 'simple' trait the phenotype is predominantly determined by the genotype. In a complex trait ... Uncertainty!

  16. Uncertainty ≠ Ignorance! Uncertainty ≠ Error. WE REQUIRE STATISTICS

  17. EXAMPLE Suppose • A trait is normally distributed. • There are two alleles at a given locus. • An additive mutation that increases growth.

  18. A 'simple' trait p(y|G=qq) p(y|G=Qq) p(y|G=QQ)

  19. A complex trait p(y|G=QQ) p(y|G=Qq) p(y|G=qq)

  20. Mixture visualization p(y|G=QQ) p(y) p(y|G=Qq) p(y|G=qq)

  21. EXAMPLE • What is the expected genotype of an individual whose phenotype is the mean? • What is the expected genotype of an individual whose phenotype is 1 SD above the mean?
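
The two questions above can be explored numerically with Bayes' rule on the mixture of genotype distributions. The following is a minimal sketch, not part of the original slides. It assumes Hardy-Weinberg genotype frequencies, genotype means of a, 0 and -a, and a common residual standard deviation; the function names and the numeric values in the usage note are hypothetical.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def genotype_posterior(y, a, sd, p):
    """P(G | y) for an additive biallelic locus.

    Assumptions (not from the slides): Hardy-Weinberg frequencies with
    allele frequency p for Q, genotype means a (QQ), 0 (Qq), -a (qq),
    and the same residual sd within every genotype.
    """
    priors = {"QQ": p * p, "Qq": 2 * p * (1 - p), "qq": (1 - p) ** 2}
    means = {"QQ": a, "Qq": 0.0, "qq": -a}
    joint = {g: priors[g] * normal_pdf(y, means[g], sd) for g in priors}
    total = sum(joint.values())  # this is the mixture density p(y)
    return {g: joint[g] / total for g in joint}
```

For instance, with a = 1, sd = 1 and p = 0.5 the phenotypic variance is sd^2 + 2p(1-p)a^2 = 1.5, so `genotype_posterior(math.sqrt(1.5), 1.0, 1.0, 0.5)` gives the genotype probabilities for an individual 1 SD above the mean. When the genotype distributions overlap, as in a complex trait, no genotype receives probability close to 1.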

  22. Statistics • Science to deal with uncertainty. • In Animal Breeding, it provides the link between molecular genetics and the applied world, between genotype and phenotype. • A very important area of research now: the limit lies more in analyzing the data than in obtaining them (Bioinformatics).

  23. Two key aspects • Description • Inference or Prediction

  24. Inference: the concept of model • Simplification of reality • Links data to some abstract useful concept • A model is not supposed to be 'TRUE' • A model is meant to be USEFUL

  25. Desirable characteristics of models • Adjust to data • Parsimonious (austere) • Interpretable

  26. Examples of models in Genetics Milk_Production = Herd + Genetic_Effect + Error Meat_Quality = Age + Sex + Halothane_Genotype + Error Growth = Sex + Breed + Myostatin_Genotype + Error

  27. Usual Steps • Define a model • Estimate parameters • Carry out significance test

  28. Usual Approach in Genetics • Define a model • Carry out a scan across marker or genome positions • Estimate parameters • Carry out significance test. This procedure is called a Quantitative Trait Locus (QTL) analysis

  29. Quantitative trait locus experiments • Principles • Crosses between inbred lines • Outbred line

  30. GENE ≈ LOCUS: a stretch of DNA whose variants (alleles) produce a change in a trait. QTL: in principle, a gene whose polymorphism affects a quantitative trait; in practice, a large genome region statistically associated with the trait. MARKER (M): a ‘neutral’ polymorphism. Microsatellites, SNPs and AFLPs are markers. GENOTYPE (G): the paternal and maternal alleles define the genotype at a locus. PHENOTYPE (y): the observed characteristic (trait) of each individual. HAPLOTYPE (H): the paternal (or maternal) set of alleles of each individual. PHASE: two alleles are in the same phase (in cis) if they were inherited from the same parent; otherwise, they are in trans.

  31. Sources of information y: phenotypes M: markers P: pedigree

  32. The model: phenotype = fixed effects + locus effects (QTL) + infinitesimal genetic effect + residual

  33. Usual approaches • Simple experimental design: F2, BC, isolated families • Simplified genetic model: one single locus, with alternative alleles fixed in each parental line

  34. Statistical Techniques • Regression (Least squares) • Maximum likelihood • Bayesian methods • Non parametric methods

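  35. Usual genetic decomposition (biallelic gene). Genotype: QQ, Qq, qq. Genetic value: a, d, -a. Equivalently, a = [ E(y | G=QQ) - E(y | G=qq) ] / 2 and d = E(y | G=Qq), with genetic values expressed as deviations from the midpoint of the two homozygotes.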
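
As a small worked illustration of this decomposition, the sketch below computes a and d from three genotype means. It is an assumption-laden example, not part of the original slides: the function name and the numbers in the usage note are hypothetical, and d is written as a deviation from the homozygote midpoint so that it also works when the means are not centred at zero.

```python
def additive_dominance(mean_QQ, mean_Qq, mean_qq):
    """Return (a, d) from the three genotype means of a biallelic locus."""
    midpoint = (mean_QQ + mean_qq) / 2.0
    a = (mean_QQ - mean_qq) / 2.0   # additive effect
    d = mean_Qq - midpoint          # dominance deviation
    return a, d
```

With hypothetical means of 110, 104 and 90 for QQ, Qq and qq, `additive_dominance(110, 104, 90)` returns a = 10 and d = 4. A purely additive locus has d = 0.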

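  36. Crosses between inbred lines [Diagram: parental lines (P) carrying marker alleles M / m and QTL alleles Q / q are crossed to produce the F1, which is crossed again to produce the backcross (BC)]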

  37. F2 cross scheme. Parental lines: AA,BB x aa,bb. F1: Aa,Bb x Aa,Bb. F2: AA,BB Aa,BB aa,BB AA,Bb Aa,Bb aa,bb AA,bb Aa,bb

  38. r and QTL effects are confounded in a single marker analysis
      Genotype   Freq.      E(y|G)
      MQ         (1-r)/2     a
      Mq         r/2        -a
      mQ         r/2         a
      mq         (1-r)/2    -a
      E(y|M) = a (1-r) - a r
      E(y|m) = a r - a (1-r)
      D = E(y|M) - E(y|m) = 2 a (1-2r)
      Var(D) = 2 [σ² + 4 a² r (1-r)] / n
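
The confounding can be checked numerically from the genotype table. The sketch below is an illustration rather than part of the original slides, and the function name is hypothetical. It computes the expected marker contrast D from the frequencies and genotypic values listed above.

```python
def marker_contrast(a, r):
    """Expected difference E(y|M) - E(y|m) in a backcross.

    a: QTL effect, r: recombination fraction between marker and QTL.
    """
    freq = {"MQ": (1 - r) / 2, "Mq": r / 2, "mQ": r / 2, "mq": (1 - r) / 2}
    value = {"MQ": a, "Mq": -a, "mQ": a, "mq": -a}
    mean_M = (freq["MQ"] * value["MQ"] + freq["Mq"] * value["Mq"]) / (
        freq["MQ"] + freq["Mq"])
    mean_m = (freq["mQ"] * value["mQ"] + freq["mq"] * value["mq"]) / (
        freq["mQ"] + freq["mq"])
    return mean_M - mean_m
```

A QTL of effect a = 1 located at the marker (r = 0) and a QTL of effect a = 2 at r = 0.25 give the same contrast, 2a(1-2r) = 2, so a single marker cannot separate a small nearby QTL from a large distant one.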

  39. Interval mapping. By using intervals delimited by two markers: - we can distinguish between r and a (and d) - we use more information, and we reduce error. Regression approach: Haley and Knott (1992). Maximum likelihood: Lander and Botstein (1989)

  40. Interval mapping [Diagram: backcross of an F1 carrying haplotypes M-Q-N and m-q-n to the parental line m-q-n / m-q-n; marker genotypes of the offspring are observed, their QTL genotypes are unknown (?)] a = E(y|Qq), -a = E(y|qq). r = recombination fraction between markers M and N (known). r1 (r2) = recombination fraction between marker 1 (2) and the QTL. ρ = r1 / r. r1, a: unknown

  41. Interval mapping [Diagram: backcross of an F1 carrying haplotypes M-Q-N and m-q-n to the parental line m-q-n / m-q-n] a = E(y|Qq), -a = E(y|qq)
      Genotype   Freq.      P(G=Q|M)   P(G=q|M)   E(y|M)
      MN         (1-r)/2    1          0          a
      Mn         r/2        r2/r       r1/r       a (1-ρ) - a ρ
      mN         r/2        r1/r       r2/r       a ρ - a (1-ρ)
      mn         (1-r)/2    0          1          -a
      r = recombination fraction between markers (known)
      r1 (r2) = recombination fraction between marker 1 (2) and the QTL
      ρ = r1 / r
      r1, a: unknown
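
The conditional probabilities in this table can be written as a short function. The sketch below is an illustration under stated assumptions, not part of the original slides: double recombinants are ignored, so that r2 = r - r1, and the relative QTL position within the interval is written rho = r1 / r. The function names are hypothetical.

```python
def prob_Q_given_markers(marker_class, r1, r):
    """P(G=Q | flanking marker class) in a backcross.

    marker_class: one of "MN", "Mn", "mN", "mn".
    Assumption: no double recombinants, hence r2 = r - r1.
    """
    r2 = r - r1
    table = {"MN": 1.0, "Mn": r2 / r, "mN": r1 / r, "mn": 0.0}
    return table[marker_class]

def expected_phenotype(marker_class, a, r1, r):
    """E(y | markers) = a * P(Q | markers) - a * P(q | markers)."""
    p_Q = prob_Q_given_markers(marker_class, r1, r)
    return a * p_Q - a * (1.0 - p_Q)
```

For the recombinant class Mn this gives a (1 - 2 rho). Unlike the single-marker contrast, the recombinant classes carry information on the QTL position that is separate from the information on a.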

  42. Interval mapping: BC regression approach, Haley and Knott (1992). The model is: y = b + c_a a + e, where y = phenotypes, b = fixed effects, a = QTL effect, and c_a = P(G=QQ|M) - P(G=Qq|M)

  43. Interval mapping: BC regression approach, Haley and Knott (1992). The model is: y = b + c_a a + e. The reduced model is: y = b + e. The strategy is: 1) compute c_a at predetermined positions; 2) compute the test statistic F (full model vs. reduced model) at each position; 3) choose the estimates (r and a) that correspond to Fmax
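
The three steps of this strategy can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' implementation: a single marker interval, a backcross with QTL values a and -a, an overall mean as the only fixed effect b, double recombinants ignored when computing c_a, and the position expressed as rho = r1 / r. All function names and parameter values are hypothetical.

```python
import random

def simulate_backcross(n, a, r1, r2, sd, seed=1):
    """Simulate (has_M, has_N, phenotype) for n backcross offspring."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        has_M = rng.random() < 0.5                        # F1 gamete carries M
        has_Q = has_M if rng.random() >= r1 else not has_M  # M-QTL recombination
        has_N = has_Q if rng.random() >= r2 else not has_Q  # QTL-N recombination
        y = (a if has_Q else -a) + rng.gauss(0.0, sd)
        data.append((has_M, has_N, y))
    return data

def coefficient(has_M, has_N, rho):
    """c_a = P(Q | markers) - P(q | markers), ignoring double recombinants."""
    if has_M and has_N:
        return 1.0
    if not has_M and not has_N:
        return -1.0
    return 1.0 - 2.0 * rho if has_M else 2.0 * rho - 1.0

def fit(data, rho):
    """Least squares fit of y = b + c_a a + e; returns (F, a_hat)."""
    n = len(data)
    c = [coefficient(m, k, rho) for m, k, _ in data]
    y = [obs for _, _, obs in data]
    c_bar, y_bar = sum(c) / n, sum(y) / n
    sxx = sum((ci - c_bar) ** 2 for ci in c)
    sxy = sum((ci - c_bar) * (yi - y_bar) for ci, yi in zip(c, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)   # RSS of reduced model y = b + e
    a_hat = sxy / sxx
    rss_full = syy - a_hat * sxy
    f_stat = (syy - rss_full) / (rss_full / (n - 2))
    return f_stat, a_hat

def scan(data, steps=20):
    """Evaluate F on a grid of positions and return (Fmax, rho, a_hat)."""
    best = None
    for i in range(steps + 1):
        rho = i / steps
        f_stat, a_hat = fit(data, rho)
        if best is None or f_stat > best[0]:
            best = (f_stat, rho, a_hat)
    return best
```

A call such as `scan(simulate_backcross(500, 1.0, 0.1, 0.1, 1.0))` returns the largest F, the grid position where it occurs and the corresponding estimate of a. In a real analysis the significance threshold for Fmax has to account for the many positions tested, which this sketch does not attempt.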

  44. Example : F2 cross between Iberian and Landrace pigs The IBMAP consortium (Spain) UdL-IRTA, INIA, UAB, CTC-IRTA, UMurcia

  45. The Landrace line

  46. The F1 offspring

  47. Some F1s ...

  48. The variability in the F2

  49. IBMAP experimental protocol [Diagram] F0: 3 Ibérico Guadyerbas x 31 Landrace (Nova Genètica); 1 litter; F1: 6 x 71; 100 markers; F2: 577

  50. Traits measured. Carcass: weight, backfat thickness, carcass length, cutting weights. Histochemistry: % muscle fibers, fiber diameter. Quality: pH at 45' and 24 h, conductivity, pigments, Minolta color, % intramuscular fat, % fatty acids
