Finding the Fault in Nick's Genome At 4 years of age, Nick loved Batman. He was also dying of a mysterious and painful disease. See how geneticists saved his life by sequencing all of his protein-coding DNA to find the 1-in-billions fault. Learn about our genome, human genetic variation, and how mutations can have very different effects. Click “Nick” to view the video at the Journal Sentinel site.
The case history Nick was born in October 2004, the third child in the family. Before his 2nd birthday, an abscess formed near his rectum. Over the next 3 years, holes appeared in his colon (large intestine), and stool leaked into his abdomen. The symptoms resembled inflammatory bowel disease (IBD), such as Crohn's disease, but medical, surgical, and dietary treatments all failed to stop his illness. By the fall of 2009, Nick had spent more than 300 days (250 of them consecutive) in the hospital and had suffered through 100 operations. His doctors were baffled, out of clues, and desperate.
Can sequencing Nick's DNA solve this mystery? Nick's doctor and other specialists had already tested Nick for a number of genetic mutations that could cause the observed symptoms, and found nothing. Now they wondered: could they sequence Nick's entire genome quickly enough to save him, and at a reasonable cost?
The Human Genome Project took 13 years and about $3 billion to complete the first human genome sequence in 2003. However, recent advances in DNA sequencing technology have dramatically lowered the cost. In this chart from the NHGRI, Moore's Law is the observation that computing power doubles roughly every two years. What was the cost of sequencing a human genome in 2009? a) $100 million b) $10 million c) $1 million d) $100,000 e) $10,000
Nick's doctors decided to sequence Nick's exome, which consists of all the exons that are expressed as mRNA. The diagram below shows the structure of eukaryotic genes and how introns are removed during nuclear RNA processing. Illustration by Jung Choi, April 2015, CC-BY
Recall from the human genome video or your readings: what percentage of Nick's genome would be sequenced by exome sequencing? A) 80% B) 50% C) 30% D) 10% E) 2%
Although exome sequencing would save time and money, Nick's doctors knew they would miss any mutations in non-protein-coding DNA. Mutations in which non-exomic regions could cause severely reduced amounts of a normal protein to be made? a) a mutation in an intron b) a mutation close to the transcription start site c) a mutation in an exon d) a mutation in the DNA after the stop codon In groups with your neighbors, discuss how each of these mutations could affect gene expression or cause disease.
Ethics of genome sequencing: small group discussion What questions and concerns would Nick's parents have? Nick has two older sisters. What stake do they have in Nick's DNA sequence information? Consider that exome sequencing will reveal information about all of his protein-coding genes, not just the genetic basis of his disease. Findings in other genes, unrelated to the disease under investigation, are sometimes called “incidental findings.”
In your opinion, which “incidental findings” in Nick's genome should be reported to the family? For each mutation listed, indicate “yes” or “no” to informing the family. 1) a mutation in the BRCA1 gene, known to be associated with early breast cancer a) yes b) no 2) a novel mutation in the BRCA1 gene, of unknown significance 3) a variant of the ApoE gene, associated with an increased risk of Alzheimer's disease 4) a mutation associated with high blood pressure and early coronary disease, treatable with medication and monitoring.
Nick's exome sequence was compared to the human reference genome sequence, to identify differences, called "variants". If variants are randomly distributed throughout the human genome, how many variants would be expected in Nick's exome sequence? Assume that the exome is 2% of the genome, and that humans are 99.9% identical in DNA sequence. Recall from the video or the readings the size of the human genome. A) 15,000 B) 60,000 C) 120,000 D) 600,000 E) 3,000,000
Liz Worthey identified 15,272 variants in Nick's exome sequence (Worthey et al. 2011, Genetics in Medicine 13, 255–262), a typical number of variants for exome sequences from healthy people. Compare this number with the number expected based on human DNA being 99.9% identical. Which of the statements below are consistent with this information? A) Mutations in protein-coding sequences are more likely to be eliminated by natural selection than mutations in the rest of the genome. B) Mutations occur at a higher rate in non-coding sequences than in protein-coding sequences. C) Mutations in exons are more likely to be corrected by DNA repair enzymes than mutations in introns or intergenic regions. D) Mutations arise and persist in the human population at equal rates in non-coding and protein-coding DNA.
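The comparison asked for above can be sketched as a short calculation. This is only a back-of-the-envelope estimate: the genome size of roughly 3 billion base pairs is a rounded assumption, the 2% and 99.9% figures are the ones stated on the earlier slide, and the function and variable names are made up for illustration.

```python
# Rough estimate only; all inputs are rounded, assumed values.
GENOME_SIZE_BP = 3_000_000_000    # assumed approximate size of the human genome
FRACTION_DIFFERENT = 0.001        # humans are ~99.9% identical in DNA sequence
EXOME_FRACTION = 0.02             # the exome is ~2% of the genome
OBSERVED_EXOME_VARIANTS = 15_272  # count quoted from Worthey et al. 2011


def expected_variants(genome_size_bp, fraction_different, region_fraction):
    """Expected variant count if variants were spread evenly over the genome."""
    return round(genome_size_bp * fraction_different * region_fraction)


genome_wide = expected_variants(GENOME_SIZE_BP, FRACTION_DIFFERENT, 1.0)
exome_expected = expected_variants(GENOME_SIZE_BP, FRACTION_DIFFERENT, EXOME_FRACTION)
ratio = OBSERVED_EXOME_VARIANTS / exome_expected

print(f"Expected variants genome-wide: {genome_wide:,}")
print(f"Expected variants in the exome: {exome_expected:,}")
print(f"Observed / expected in the exome: {ratio:.2f}")
```

A ratio well below 1 means the exome carries fewer variants than a uniform distribution would predict, which is the observation the answer choices try to explain.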
Given over 15,000 variants in Nick's exome sequence, how can we determine which variant is causing his disease, if any? Liz Worthey categorized each variant and devised a scheme to filter them according to their likely impact on the protein function.
Single base changes can have very different consequences With your neighbors, discuss the consequences of the following mutations in a protein-coding sequence: 1) TCA codon TCG 2) TCA codon TGA 3) an insertion or deletion of a single nucleotide 4) an amino acid change in the active site of an enzyme 5) a change of one nonpolar amino acid to another, in a transmembrane domain
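As an aid to the discussion of the first two cases, here is a minimal sketch that looks up codons in the standard genetic code and labels a single-codon substitution. The function name and the category labels are illustrative choices, not part of the original case study; codons are written as coding-strand DNA (T instead of U).

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Label a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"     # new stop codon, truncated protein
    if ref_aa == "*":
        return "stop-loss"    # stop codon lost, protein extended
    return "missense"         # different amino acid


for ref, alt in [("TCA", "TCG"), ("TCA", "TGA")]:
    print(ref, "->", alt, ":", CODON_TABLE[ref], "->", CODON_TABLE[alt],
          classify_substitution(ref, alt))
```

Insertions and deletions (case 3) are not handled here, because a single-nucleotide indel shifts the reading frame and changes every codon downstream rather than just one.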
Which of these criteria would be the least useful in identifying the mutation (variant) responsible for Nick's rare disease?
• variants that are rare in the human population
• variants that create in-frame stop codons
• variants that create frameshifts
• variants that affect both copies of autosomal genes
• variants in genes that are known to cause common human diseases
Summary of protein-coding variants Adapted from Worthey et al., Table 1A Liz Worthey looked for novel variants that crippled both copies of a gene. She found 2 genes in which both copies had early stop codons, but stop codons in those genes are known to occur in healthy people.
Hypothesize that the variant is recessive; focus on damaging mutations *hemizygous = having only one copy of the gene, as in X-linked genes in males Adapted from Worthey et al., Table 1C Worthey et al. narrowed the candidate mutations to just one, a single nucleotide substitution in the XIAP gene, located on the X chromosome.
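The filtering logic described on the last few slides can be sketched in code. This is a simplified illustration, not the actual pipeline used by Worthey et al.: the record fields, the category names, and the toy records (including the placeholder genes GENE_A to GENE_D) are all hypothetical and invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    effect: str                 # "synonymous", "missense", "nonsense", "frameshift"
    novel: bool                 # absent from catalogs of known human variation
    zygosity: str               # "heterozygous", "homozygous", or "hemizygous"
    tolerated_in_healthy: bool  # similar damage to this gene seen in healthy people


DAMAGING_EFFECTS = {"missense", "nonsense", "frameshift"}
RECESSIVE_ZYGOSITIES = {"homozygous", "hemizygous"}  # no normal copy remains


def candidate_variants(variants):
    """Keep novel, damaging variants that leave no normal copy of the gene."""
    return [
        v for v in variants
        if v.novel
        and v.effect in DAMAGING_EFFECTS
        and v.zygosity in RECESSIVE_ZYGOSITIES
        and not v.tolerated_in_healthy
    ]


# Toy records for illustration only; these are not data from the paper.
toy_variants = [
    Variant("GENE_A", "synonymous", True, "homozygous", False),
    Variant("GENE_B", "missense", False, "homozygous", False),
    Variant("GENE_C", "nonsense", True, "homozygous", True),
    Variant("GENE_D", "missense", True, "heterozygous", False),
    Variant("XIAP", "missense", True, "hemizygous", False),
]

print([v.gene for v in candidate_variants(toy_variants)])
```

Each condition removes one class of unlikely candidates: common variants, variants that do not change the protein, variants with a normal second copy, and damage that healthy people are known to tolerate.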
Mutation in the XIAP gene The XIAP mutation identified from exome sequencing was verified using traditional targeted gene sequencing. The top is a sequencing trace from a healthy control. The second is from Nick, and the bottom is from Nick's mother. Nick's mother is heterozygous for the mutation; one copy of her X chromosome has a normal G, but the other has an A. Nick inherited the A allele from his mother.
In Nick's XIAP gene, a TGT codon is changed to a TAT codon. What is the amino acid change in the XIAP protein? A) T (Thr) to I (Ile) B) C (Cys) to Y (Tyr) C) W (Trp) to Y (Tyr) D) T (Thr) to A (Ala) E) No change Notes: TGT is on the coding strand of DNA, which has the same sequence as the RNA; just substitute U for T. The answer choices show both the single-letter code and the three-letter abbreviation for amino acids.
What does the XIAP gene do? It regulates programmed cell death (apoptosis) and the gut immune system. Just as Nick's doctors discovered his XIAP mutation, a new paper reported another mutation in this gene that causes an extremely rare disease called XLP (X-linked lymphoproliferative disease), marked by an inability to fight Epstein-Barr virus and death by age 10. The only cure is a bone marrow transplant. The mutation in Nick's copy of the gene results in a single amino acid change. Why is this particular change so harmful?
Nick’s mutation changes a highly conserved amino acid Alignment of XIAP amino acid sequences from different species, from Worthey et al. 2011. Nick's XIAP sequence is the second row from the top (Var_XIAP); the purple arrow denotes the location of the amino acid change. The fact that species from fruit flies to humans all have a cysteine (“C”) at this position indicates that this amino acid is critical. Nick has a tyrosine (“Y”) instead of cysteine, making his protein non-functional. As a result, his intestinal immune system overreacts and causes cell death in his intestinal epithelial cells.
Nick's XIAP mutation is recessive. Nick inherited it from his mother, who is a carrier with no symptoms. Nick's older sisters now know they may also have inherited the same mutation. What is the probability that one of Nick’s sisters is a carrier of the XIAP mutation? A) 0 B) 1/4 C) ½ D) ¾ E) 1
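One way to check your reasoning is to enumerate the possible crosses, as in a Punnett square. The sketch below assumes that Nick's father carries a normal XIAP allele on his X chromosome, which the case does not state explicitly; the allele labels and the function name are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Allele labels are illustrative: X_A = normal XIAP allele, X_a = mutant allele.
mother = ["X_A", "X_a"]  # heterozygous carrier, as shown by her sequencing trace
father = ["X_A", "Y"]    # assumption: the father's X carries the normal allele


def daughter_carrier_probability(mother_gametes, father_gametes, mutant="X_a"):
    """Fraction of daughters (children who receive the father's X) with one mutant allele."""
    daughters = [
        (m, f) for m, f in product(mother_gametes, father_gametes) if f != "Y"
    ]
    carriers = [d for d in daughters if d.count(mutant) == 1]
    return Fraction(len(carriers), len(daughters))


print(daughter_carrier_probability(mother, father))
```

Restricting the enumeration to children who receive the father's X chromosome is what makes this a probability for daughters only, since the question asks about Nick's sisters.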
With the diagnosis of Nick's XIAP mutation, Nick received a bone marrow transplant. After surviving a few harrowing months of recovery, Nick is now free of his symptoms and enjoys eating steak, pizza, and everything else healthy kids like. Nick was the first to have a mystery disease diagnosed by genome sequencing. Since 2009, other sequencing centers have begun using genome sequencing to diagnose mystery illnesses. They are able to identify causal mutations in about 50% of their patients.
Human genetic variation: what to expect if you have your own genome sequenced Results from whole genome and whole exome sequencing of healthy people from all over the world (MacArthur et al. 2012) tell us:
• Healthy people have millions of differences in their DNA sequence.
• The vast majority of these variants have no phenotypic effect.
• Most (but not all) variants with phenotypic effects will be in the exome (protein-coding DNA).
• Each person has about 100 variants of unknown significance that may damage or alter the function of the gene.
• Further study is required to determine the effects of these rare variants of unknown significance.
Bibliography and sources:
Johnson, M. and Gallagher, K. 2010. One in a Billion: A boy's life, a medical mystery. Milwaukee-Wisconsin Journal Sentinel, series published starting Dec 18, 2010. Intro to series and video introduction: http://www.jsonline.com/news/health/111224104.html
MacArthur, D.G. et al. 2012. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335: 823-828. DOI: 10.1126/science.1215040 http://www.sciencemag.org/content/335/6070/823.full
Worthey et al. 2011. Making a definitive diagnosis: Successful clinical application of whole exome sequencing in a child with intractable inflammatory bowel disease. Genetics in Medicine 13: 255–262. http://www.nature.com/gim/journal/v13/n3/full/gim9201146a.html