
Copy number variation (CNV) What is it?

CZ5225: Modeling and Simulation in Biology; Lecture 10: Copy Number Variations; Prof. Chen Yu Zong; Tel: 6516-6877; Email: phacyz@nus.edu.sg; http://bidd.nus.edu.sg; Room 08-14, level 8, S16, NUS.


Presentation Transcript


  1. CZ5225: Modeling and Simulation in Biology; Lecture 10: Copy Number Variations; Prof. Chen Yu Zong; Tel: 6516-6877; Email: phacyz@nus.edu.sg; http://bidd.nus.edu.sg; Room 08-14, level 8, S16, NUS

  2. Copy number variation (CNV): What is it? • A form of human genetic variation: instead of 2 copies of each region of each chromosome (diploid), some people have amplifications or losses (> 1 kb) in different regions • This doesn’t include translocations or inversions • We all have such regions • The publicly available genome NA15510 has between 5 and 240 by various estimates • They are only rarely harmful (but rare things do happen)

  3. Copy-number probes are used to quantify the amount of DNA at known loci. CN locus: ...CGTAGCCATCGGTAAGTACTCAATGATAG... PM probe: ATCGGTAGCCATTCATGAGTTACTA. The PM signal scales with copy number: CN=1 gives PM = c, CN=2 gives PM = 2c, CN=3 gives PM = 3c.
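The proportionality on this slide (PM = c per copy) suggests a simple way to read a copy number off a probe signal. The following is a minimal sketch, assuming an idealized noise-free probe and a diploid reference sample; the function names and the per-copy signal value are hypothetical and not part of the lecture.

```python
import math


def estimate_copy_number(pm_sample, pm_reference, reference_cn=2):
    """Estimate copy number, assuming PM intensity is proportional to CN.

    pm_reference is the signal of the same probe in a reference sample
    whose copy number (reference_cn) is assumed to be known.
    """
    if pm_reference <= 0:
        raise ValueError("reference intensity must be positive")
    return reference_cn * pm_sample / pm_reference


def log2_ratio(pm_sample, pm_reference):
    """Log2 ratio of sample to reference signal (0 means no change)."""
    return math.log2(pm_sample / pm_reference)


# Hypothetical per-copy signal c; the diploid reference then reads 2c.
c = 500.0
for true_cn in (1, 2, 3):
    print(true_cn, estimate_copy_number(true_cn * c, 2 * c))
```

Real signals are noisy and carry probe-specific affinities, which is why the later slides spend so much effort on low-level analysis before any such ratio is taken.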

  4. Copy number variation: Population genomics. The genomes of two humans differ more in a structural sense than at the nucleotide level; a recent paper estimates that on average two of us differ by ~4-24 Mb of genetic material due to copy number variation, compared with ~2.5 Mb due to single nucleotide polymorphisms.

  5. Abundance of CNVs in the human population? Still an open question, but probably thousands, at low allelic frequency (<20%)

  6. Abundance of deletion CNVs in the human population Comparison of overlapping CNVs identified by Conrad et al. (2006) and McCarroll et al. (2006). Freeman et al. Genome Res 2006

  7. Non-allelic homologous recombination events between low-copy repeats (LCR-NAHR) Lupski & Inoue, TIG 2002

  8. Duplications and deletions of LCRs mediated by NAHR. Figure labels: LCRs in direct orientation; LCRs in inverted orientation; inversions.

  9. Intrachromatid recombination between LCRs. Figure labels: LCRs in direct orientation; LCRs in inverted orientation; inversion; deletion.

  10. Mechanisms generating genomic deletions

  11. Copy number variation: Relations to human disease. Responsible for a number of rare genetic conditions, for example Down syndrome (trisomy 21) and cri du chat syndrome (a partial deletion of 5p). Implicated in complex diseases, for example: CCL3L1 CN is linked to HIV/AIDS susceptibility; some sporadic (non-inherited) CN variants are strongly associated with autism; and tumors typically have a lot of chromosomal abnormalities, including recurrent CN changes.

  12. Evolutionary and medical implications of CNVs: CCL3L1 as an example. When CCL3L1 occupies the CCR5 receptor on CD4 cells, it blocks HIV's entry. Gonzalez et al., Science, 2005

  13. Copy-number variation of CCL3L1 within and among human and chimp populations. Gonzalez et al., Science, 2005

  14. CCL3L1 and HIV infection. Individuals with a high CCL3L1 gene copy number relative to their population average are more resistant to HIV infection than those with a low copy number, presumably because there is more ligand to compete with HIV during binding to CCR5. Gonzalez et al., Science, 2005

  15. Trisomy 21

  16. Partial deletion of chr 5p

  17. A cytogeneticist’s story: “The story is about diagnosis of a 3 month old baby with macrocephaly and some heart problems. The doctors questioned a couple of syndromes which we tested for and found negative. Rather than continue this ‘shot in the dark’ approach, we put the case on an array and found a 2 Mb deletion which notably deletes the gene NSD1 on chr 5, mutations in which are known to cause Sotos syndrome. This is an overgrowth syndrome and fits with the macrocephaly. The bottom line is that we are able to diagnose quicker by this approach and delineate exactly the underlying genetic change.”

  18. A cytogeneticist’s story: chromosome 5, 2 Mb deletion

  19. Many tumors have gross CN changes. A lung cancer cell line vs matched normal lymphoblast, from Nannya et al., Cancer Res 2005;65:6071-6079

  20. Research into gonad dysfunction: human sex reversal • 20% of 46,XY females have mutations in SRY • 80% of 46,XY females are unexplained! • 90% of 46,XX males are due to translocation of SRY • 10% of 46,XX males are unexplained! This suggests that loss-of-function and gain-of-function mutations in other genes may cause sex reversal. We’re looking at shared deletions.

  21. Affymetrix SNP chip terminology. Genomic DNA: TAGCCATCGGTA[A/G]GTACTCAATGAT (SNP with alleles A and G). Perfect Match probe for allele A: ATCGGTAGCCATTCATGAGTTACTA. Perfect Match probe for allele B: ATCGGTAGCCATCCATGAGTTACTA. Genotyping: answering the question about the two copies of the chromosome on which the SNP is located: is a sample AA (AA), AB (AG) or BB (GG) at this SNP?

  22. Affymetrix GeneChip: 1.28 cm × 1.28 cm, 6.4 million features/chip; each feature is 5 µm × 5 µm and holds > 1 million identical 25 bp probes.

  23. GeneChip Mapping Assay overview: 250 ng genomic DNA → RE digestion (Xba) → adaptor ligation → PCR: one-primer amplification (complexity reduction) → fragmentation and labeling → hyb & wash → genotypes AA, BB, AB

  24. Principal low-level analysis steps • Background adjustment and normalization at probe level. These steps are to remove lab/operator/reagent effects • Combining probe-level summaries into a probe-set-level summary: best done robustly, on many chips at once. This is to remove probe affinity effects and discordant observations (gross errors, non-responding probes, etc.) • Possibly further rounds of normalization (probe-set level), as lab/cohort/batch/other effects are frequently still visible • Derive the relevant copy-number quantities. Finally, quality assessment is an important low-level task.

  25. Preprocessing for total CN using SNP probe pairs (250K chip). Modification by H Bengtsson of a method due to A Wirapati developed some years ago for microsatellite genotyping; similar to the approach used by Illumina. Figure labels: genotype clusters TT, AT, AA.

  26. Background adjustment and normalization Outcome similar to that achieved by quantile normalization
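For readers who have not met the quantile normalization that this slide uses as a point of comparison, the following is a minimal sketch of it, assuming a probes-by-arrays intensity matrix and ignoring ties. It is not the calibration method of the previous slide, and the function name is hypothetical.

```python
import numpy as np


def quantile_normalize(x):
    """Force every column (array) to share the same empirical distribution.

    x is a probes-by-arrays matrix. The target distribution is the mean of
    the sorted columns; each value is replaced by the target value of its
    rank within its own column. Ties are broken arbitrarily by argsort.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)             # rank order within each array
    target = np.sort(x, axis=0).mean(axis=1)  # mean of the k-th smallest values
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        out[order[:, j], j] = target
    return out


# Toy intensities for 4 probes on 3 arrays (made-up numbers).
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])
print(quantile_normalize(raw))
```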

  27. Low-level analysis problems remain unsolved; why? • The feature size keeps decreasing, and so the # features/chip keeps increasing • Fewer and fewer features are used for a given measurement, allowing more measurements to be made using a single chip. These considerations all place more and more demands on the low-level analysis: to maintain the quality of existing measurements, and to obtain good new ones.

  28. SNP probes can be used to estimate total copy numbers. The total signal is PM = PMA + PMB: genotypes AA, AB and BB each give PM = 2c, while AAB gives PM = 3c.

  29. SNP probe tiling strategy: central probe quartet (SNP A/G at position 0). Target: TAGCCATCGGTA N GTACTCAATGAT (N = A/G). PM allele A: ATCGGTAGCCAT T CATGAGTTACTA; MM allele A: ATCGGTAGCCAT A CATGAGTTACTA; PM allele B: ATCGGTAGCCAT C CATGAGTTACTA; MM allele B: ATCGGTAGCCAT G CATGAGTTACTA.

  30. SNP probe tiling strategy: +4 offset probe quartet (SNP A/G at position +4). Target: TAGCCATCGGTA N GTACTCAATGATCAGCT (N = A/G). PM allele A: GTAGCCAT T CAT G AGTTACTAGTCG; MM allele A: GTAGCCAT T CAT C AGTTACTAGTCG; PM allele B: GTAGCCAT C CAT G AGTTACTAGTCG; MM allele B: GTAGCCAT C CAT C AGTTACTAGTCG.
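The two tiling slides can be reproduced mechanically: a PM probe is the base-by-base complement of a 25-base window of the target, and its MM partner differs only at the central base. The following is a minimal sketch under those assumptions, using the flanking sequences shown on the slides; the function name and argument layout are hypothetical, and probes are written left to right in the same direction as the slides draw them.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PROBE_LEN = 25
CENTER = PROBE_LEN // 2  # 0-based index 12, the middle base of a 25-mer


def probe_quartet(left_flank, right_flank, alleles=("A", "G"), offset=0):
    """Build the PM/MM probes for both alleles of one SNP.

    offset shifts the 25-base window along the target; with offset=0 the
    SNP sits under the central base of the probe.
    """
    start = len(left_flank) - CENTER + offset
    quartet = {}
    for label, allele in zip(("A", "B"), alleles):
        target = left_flank + allele + right_flank
        if start < 0 or start + PROBE_LEN > len(target):
            raise ValueError("flanking sequence too short for this offset")
        # Perfect match: complement of the window (not reversed, as on the slides).
        pm = target[start:start + PROBE_LEN].translate(COMPLEMENT)
        # Mismatch: same probe with the central base complemented.
        mm = pm[:CENTER] + pm[CENTER].translate(COMPLEMENT) + pm[CENTER + 1:]
        quartet["PM" + label] = pm
        quartet["MM" + label] = mm
    return quartet


# Flanks taken from the slides.
LEFT, RIGHT = "TAGCCATCGGTA", "GTACTCAATGATCAGCT"
for shift in (0, 4):
    print(shift, probe_quartet(LEFT, RIGHT, offset=shift))
```

With `offset=0` the mismatch falls on the SNP base itself, while with `offset=4` the SNP and the mismatch are four bases apart, which is the difference between the two quartets on the slides.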

  31. SNP for Identifying Copy Number Variations • Using SNP chips to identify change in total copy number (i.e. CN ≠ 2) • Outline a new method (CRMA) • Evaluate and compare it with other methods • Make some closing remarks on further issues

  32. Copy-number estimation using Robust Multichip Analysis (CRMA) A few details are passed over. Ask me later if you care about them.

  33. Crosstalk between alleles adds significant artifacts to signals. Expected signals: AA gives PMA >> PMB, BB gives PMA << PMB, AB gives PMA ≈ PMB. Cross-hybridization: allele A: TCGGTAAGTACTC; allele B: TCGGTATGTACTC

  34. There are six possible allele pairs • Nucleotides: {A, C, G, T} • Unordered pairs, written alphabetically: (A,C), (A,G), (A,T), (C,G), (C,T), (G,T) • Because different nucleotides bind differently, the crosstalk from A to C might be very different from that from A to T.
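The count of six is simply the number of ways to choose 2 nucleotides out of 4. A short check, with a hypothetical variable name:

```python
from itertools import combinations

# All unordered pairs of distinct nucleotides, in alphabetical order.
allele_pairs = list(combinations("ACGT", 2))
print(len(allele_pairs), allele_pairs)
```

Grouping SNPs by these pairs is what allows a separate crosstalk estimate for each pair.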

  35. Crosstalk between alleles is easy to spot. Example: data from one array, probe pairs (PMA, PMB) for nucleotide pair (A,T). Figure labels: axes PMA and PMB; genotype clusters AA, AB, BB; offset marked with +.

  36. Crosstalk between alleles can be estimated and corrected for. What is done: offset is removed from SNPs and CN units; crosstalk is removed from SNPs. Figure labels: axes PMA and PMB; genotype clusters AA, AB, BB; no offset (+).
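One common way to formalize "remove the offset, then remove the crosstalk" is a linear model in which the observed pair (PMA, PMB) equals an offset plus a 2×2 mixing matrix applied to the true allele signals. The following is a minimal sketch of the correction step only, assuming the offset and the matrix have already been estimated for the nucleotide pair in question; how they are estimated is not described on the slides, and all names and numbers are hypothetical.

```python
import numpy as np


def correct_crosstalk(pm_pairs, offset, crosstalk):
    """Undo offset and allelic crosstalk for (PMA, PMB) probe pairs.

    Assumed model, per probe pair: observed = offset + crosstalk @ true.
    pm_pairs is an n-by-2 array, offset has length 2, crosstalk is 2-by-2.
    """
    y = np.asarray(pm_pairs, dtype=float)
    a = np.asarray(offset, dtype=float)
    w = np.asarray(crosstalk, dtype=float)
    # Solve w @ true = (observed - offset) for every pair at once.
    return np.linalg.solve(w, (y - a).T).T


# Made-up example: an AA-like, an AB-like and a BB-like probe pair.
true_signal = np.array([[1000.0, 50.0], [500.0, 500.0], [40.0, 900.0]])
w = np.array([[1.0, 0.2],   # 20% of the B signal leaks into PMA
              [0.3, 1.0]])  # 30% of the A signal leaks into PMB
a = np.array([100.0, 120.0])
observed = true_signal @ w.T + a
print(correct_crosstalk(observed, a, w))
```

After the correction the homozygous clusters lie along the axes and the cloud passes through the origin, which is the change between the two scatter plots.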

  37. Copy-number estimation using Robust Multichip Analysis (CRMA) Already briefly described.

  38. Copy-number estimation using Robust Multichip Analysis (CRMA)  That’s it!

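  39. Copy-number estimation using Robust Multichip Analysis (CRMA). Model: $\log_2(PM_{ijk}) = \log_2\theta_{ij} + \log_2\phi_{jk} + \varepsilon_{ijk}$, where i indexes samples, j SNPs and k probes; $\theta_{ij}$ is the chip effect, $\phi_{jk}$ the probe affinity and $\varepsilon_{ijk}$ the error. Fit using rlm.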
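The model on this slide is log-additive: for one SNP, the log2 signal is a per-sample chip effect plus a per-probe affinity plus noise. The slide fits it with rlm (robust linear regression in R). The following sketch deliberately swaps in Tukey's median polish, a different and simpler robust fit of the same additive structure, purely to illustrate what is being estimated; it is not the lecture's procedure, and the function name is hypothetical.

```python
import numpy as np


def median_polish(y, n_iter=10, tol=1e-6):
    """Fit y[i, k] = overall + row[i] + col[k] + resid[i, k] robustly.

    y holds log2(PM) for one SNP: rows are samples, columns are probes.
    overall + row[i] plays the role of the log2 chip effect, and col[k]
    the role of the log2 probe affinity.
    """
    resid = np.asarray(y, dtype=float).copy()
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        # Sweep row medians out of the residuals.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        delta = np.median(col)
        col -= delta
        overall += delta
        # Sweep column medians out of the residuals.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        delta = np.median(row)
        row -= delta
        overall += delta
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row, col, resid


# Made-up log2 signals: 3 samples by 5 probes, with one gross outlier.
sample_effect = np.array([0.0, 1.0, -1.0])
probe_effect = np.array([0.5, -0.5, 0.0, 0.2, -0.2])
log_pm = 10.0 + sample_effect[:, None] + probe_effect[None, :]
log_pm[0, 0] += 5.0
overall, row, col, resid = median_polish(log_pm)
print(2.0 ** (overall + row))  # chip effects on the intensity scale
```

Because medians are used, the single outlying probe ends up in the residuals rather than distorting the chip effect of its sample.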

  40. Copy-number estimation using Robust Multichip Analysis (CRMA). Longer fragments get less well amplified by PCR and so give weaker SNP signals (100K)

  41. Copy-number estimation using Robust Multichip Analysis (CRMA). Longer fragments get less well amplified by PCR and so give weaker SNP signals (500K)

  42. Copy-number estimation using Robust Multichip Analysis (CRMA). Longer fragments get less well amplified by PCR and so give weaker SNP signals (500K)
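If the signal depends systematically on fragment length, one plausible correction is to estimate that trend and subtract it. The slides state the effect but not the correction, so the following is only a sketch under the assumption that a low-degree polynomial in fragment length describes the trend for one array; the function name and numbers are hypothetical.

```python
import numpy as np


def normalize_fragment_length(log2_signal, fragment_length_bp, degree=2):
    """Remove a smooth dependence of log2 signal on PCR fragment length.

    A polynomial trend is fitted by least squares and subtracted; the
    median of the original signal is added back to keep the overall level.
    """
    y = np.asarray(log2_signal, dtype=float)
    length_kb = np.asarray(fragment_length_bp, dtype=float) / 1000.0
    coef = np.polyfit(length_kb, y, degree)
    trend = np.polyval(coef, length_kb)
    return y - trend + np.median(y)


# Made-up example: signal drops linearly with fragment length.
lengths = np.linspace(200.0, 2000.0, 50)
signal = 10.0 - 0.001 * lengths
corrected = normalize_fragment_length(signal, lengths)
print(corrected.min(), corrected.max())
```

A least-squares polynomial is itself sensitive to outliers and to true copy-number changes, so a robust or spline-based fit would be a natural refinement.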

  43. Copy-number estimation using Robust Multichip Analysis (CRMA) Care required with the number and nature of Reference samples used

  44. Comparison of 4 methods

  45. Further bioinformatic issues • Estimating copy number: needs calibration data • Segmentation (of chromosomes into constant copy number regions): an HMM-like algorithm • Analyzing family CN data: a different HMM • Incorporating non-polymorphic probes: independent HMM observations to be weighted and combined • Dealing with mixed normal-abnormal samples • Utilizing poor quality DNA samples • Estimating allele-specific copy number
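To make the segmentation item concrete, the following is a minimal sketch of a hidden Markov model decoder on log2 ratios. The slide says only "an HMM-like algorithm", so everything here is an assumption for illustration: three states (loss, neutral, gain), Gaussian emissions with a shared standard deviation, a fixed probability of staying in the same state, and Viterbi decoding. All parameter values and names are hypothetical.

```python
import numpy as np


def viterbi_segment(log_ratios, means=(-1.0, 0.0, 0.585), sd=0.25, stay=0.99):
    """Return the most likely state path (0=loss, 1=neutral, 2=gain).

    means are the expected log2 ratios for CN=1, 2, 3 against a diploid
    reference: log2(1/2), log2(2/2), log2(3/2).
    """
    x = np.asarray(log_ratios, dtype=float)
    if x.size == 0:
        raise ValueError("need at least one observation")
    mu = np.asarray(means, dtype=float)
    k = len(mu)
    # Gaussian log-likelihoods; the constant shared by all states is dropped.
    log_emit = -0.5 * ((x[:, None] - mu[None, :]) / sd) ** 2
    log_trans = np.full((k, k), np.log((1.0 - stay) / (k - 1)))
    np.fill_diagonal(log_trans, np.log(stay))
    score = -np.log(k) + log_emit[0]       # uniform initial distribution
    back = np.zeros((len(x), k), dtype=int)
    for t in range(1, len(x)):
        cand = score[:, None] + log_trans  # cand[i, j]: move from state i to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards.
    path = np.empty(len(x), dtype=int)
    path[-1] = score.argmax()
    for t in range(len(x) - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


# Made-up profile: neutral, a 10-probe loss, neutral, plus one stray low probe.
profile = [0.0] * 20 + [-1.0] * 10 + [0.0] * 20
profile[5] = -1.0
states = viterbi_segment(profile)
print(states + 1)  # copy-number calls along the chromosome
```

The high `stay` probability is what makes the decoder ignore an isolated low probe while still calling a run of ten low probes as a deletion.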
