
Hepatitis C

Hepatitis C. Analysis of Sequence Data from ARUP and NCBI databases. By Ian Odell. What information can we get from ARUP sequencing data? Data is from January 2002 – July 2004. 5’ untranslated region of types 1 – 6: number of unique sequences by type.


Presentation Transcript


  1. Hepatitis C: Analysis of Sequence Data from ARUP and NCBI Databases. By Ian Odell

  2. What information can we get from ARUP sequencing data? Data is from January 2002 – July 2004, covering the 5’ untranslated region (UTR) of types 1 – 6:
     • Number of unique sequences by type.
     • Frequency of unique sequences for each type.
     • Frequency of each base at each position within a type, shown as a position weight matrix.
     • Regions of high and low variation, seen in graphs of the position weight matrix.

  3. Unique sequences by type:

     HCV type   Total sequences   Unique sequences   Unambiguous unique sequences
     1                    16151               1320                            750
     2                     2862                585                            373
     3                     2430                404                            232
     4                      284                 99                             68
     5                        7                  5                              4
     6                       44                 20                             17
     total                21778               2434                           1444

     Note: ambiguous bases cause unique sequences to be overrepresented.
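
The three columns of the table on slide 3 can be reproduced with a simple tally. The sketch below is illustrative only: it assumes each sequence is a plain string and that "unambiguous" means the sequence contains only A, C, G, and T (the slides do not state the exact rule used). The function name `summarize` and the toy data are hypothetical. It also shows why ambiguity codes inflate the unique count: a read such as `ACGN` is counted as distinct even though it may be the same sequence as `ACGT`.

```python
from collections import Counter

# Assumption: only these four letters count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")

def summarize(sequences):
    """Return (total, unique, unambiguous_unique) counts for one HCV type."""
    # Tally identical sequences, ignoring case.
    counts = Counter(seq.upper() for seq in sequences)
    # Keep only unique sequences made entirely of A, C, G, T.
    unambiguous = [s for s in counts if set(s) <= UNAMBIGUOUS]
    return len(sequences), len(counts), len(unambiguous)

# Toy data (hypothetical, not ARUP sequences); "ACGN" contains the ambiguity code N.
toy = ["ACGT", "ACGT", "ACGA", "ACGN", "acgt"]
total, unique, unambiguous_unique = summarize(toy)
print(total, unique, unambiguous_unique)
```

The `Counter` built inside `summarize` also holds the number of duplicates of each unique sequence, which is the quantity plotted on the frequency slides.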

  4. Frequency of unique sequences for type 1:

  5. Frequency of unique sequences for type 1:

  6. Frequency of unique sequences for type 2:

  7. Frequency of unique sequences for type 2:

  8. Frequency of unique sequences for type 3:

  9. Frequency of unique sequences for type 3:

  10. Frequency of unique sequences for type 4:

  11. Frequency of unique sequences for type 4:

  12. Frequency of unique sequences for type 5:

  13. Frequency of unique sequences for type 6:

  14. Frequency of unique sequences for type 6:

  15. Conclusions
      1. Each type has a ‘profile’ sequence.
      2. Do the log-log graphs give us insight into the distribution of mutations within the Hepatitis C population?
      NEXT: Look for variation between and within types among the unique sequences that are highly represented in the population (i.e. those that have many duplicates).
      Open Profiles

  16. Stuyver et al. 1996. “Second-generation line probe assay for hepatitis C virus genotyping.” J. Clin. Microbiol. 34:2259-2266. In R5, the six selected probes were used for types 1 (line 4), 3 (line 15), 4 and 10 (line 18), and 5 (line 20), as well as for subtypes 2a/2c (line 11), 2b (line 12), and 3b (line 18).

  17. Weight Matrices
      • From Profiles, we can see areas of variation between types and their conservation within each type.
      • Next, we want to see what these look like for all sequences in each type.

  18. Example Weight Matrix. This allows us to see the variation within a type at each nucleotide position. [Figure: first 10 base positions of Type 2 HCV]
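
A weight matrix of this kind can be sketched as a per-column base-frequency table. The code below is a minimal illustration, not the method used for the slides: it assumes the sequences are already aligned to equal length and that any character other than A, C, G, or T (gaps, ambiguity codes) is skipped. The function name `weight_matrix` and the toy sequences are hypothetical.

```python
from collections import Counter

BASES = "ACGT"

def weight_matrix(aligned):
    """Per-position base frequencies for equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    matrix = []
    # zip(*aligned) yields one alignment column at a time.
    for column in zip(*aligned):
        # Assumption: gaps and ambiguity codes are ignored.
        counts = Counter(b for b in column if b in BASES)
        n = sum(counts.values())
        matrix.append({b: counts[b] / n if n else 0.0 for b in BASES})
    return matrix

# Toy alignment (hypothetical, not HCV data).
toy = ["ACGT", "ACGA", "ACTT", "GCGT"]
for pos, freqs in enumerate(weight_matrix(toy), start=1):
    print(pos, freqs)
```

Each row of the result is one position; the four frequencies in a row sum to 1 whenever at least one unambiguous base is present in that column.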

  19. Graphical Type 1 Weight Matrix. The points at each x-value sum to 1; each y-value is the fraction of sequences with that base at that position. We are looking for a region of conservation in all types; later we can look for variation between types. [Figure: weight matrix plot with region R5 marked]
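
One way to look for conserved regions in such a matrix is to scan for runs of positions where a single base dominates. The sketch below assumes the matrix is a list of per-position frequency dictionaries; the function name `conserved_regions`, the 0.95 threshold, and the minimum run length are hypothetical choices for illustration, not values taken from the slides.

```python
def conserved_regions(matrix, threshold=0.95, min_length=3):
    """Return 1-based inclusive (start, end) runs where the most common
    base at every position has frequency >= threshold."""
    regions, start = [], None
    for i, freqs in enumerate(matrix, start=1):
        if max(freqs.values()) >= threshold:
            if start is None:
                start = i  # a conserved run begins here
        else:
            # The run (if any) ended at position i - 1.
            if start is not None and i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last position.
    if start is not None and len(matrix) - start + 1 >= min_length:
        regions.append((start, len(matrix)))
    return regions

# Toy matrix (hypothetical): positions 1-3 are conserved, position 4 is variable.
toy_matrix = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 0.98, "T": 0.02},
    {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
]
print(conserved_regions(toy_matrix))
```

Running the same scan on each type's matrix and intersecting the resulting intervals would be one way to find a region conserved in all types.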

  20. Graphical Type 2 Weight Matrix

  21. Graphical Type 3 Weight Matrix

  22. Graphical Type 4 Weight Matrix

  23. Graphical Type 5 Weight Matrix

  24. Graphical Type 6 Weight Matrix

  25. What information can we get from NCBI data?
      • Look at complete HCV genome publications, because BLASTing with 5’ UTR primers biases results towards what those primers amplify (i.e. BLAST returns the most similar hits, and we want to look for variation).
      • Are there mismatches under the ARUP primers? → Do ARUP primers bias the sequence data by not amplifying a certain group?
      • Regions of low and high variation in the complete genome. Compare to 5’ UTR. → Alignment not good enough for an accurate analysis.

  26. Graphical Weight Matrix of ARUP (5’ UTR) Amplicon. Data is from 239 aligned complete HCV genomes downloaded from GenBank. [Figure: weight matrix plot with the forward and reverse primer regions marked]

  27. SNPs and insertions under ARUP forward primer. Graphical weight matrix: ARUP forward primer region in BLAST complete genome alignment. [Figure annotations: 2 Ins, 7, 1, 5, 3, 1 SNPs / 239 sequences]

  28. SNPs and insertions under ARUP reverse primer. Graphical weight matrix: ARUP reverse primer in BLAST complete genome alignment. [Figure annotation: 3 SNPs / 239 sequences]
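
Counting mismatches under a primer, as on the last two slides, can be sketched as a per-position comparison between the primer and the matching window of each aligned genome. This is an illustration under stated assumptions: the primer is given in the same orientation as the alignment (a reverse primer would first need to be reverse-complemented), `start` is the 0-based alignment column where the primer begins, and only substitutions are handled (an insertion would shift the columns and needs a gapped primer). The function name, the toy alignment, and the toy primer are hypothetical, not ARUP primers.

```python
def mismatches_under_primer(aligned, primer, start):
    """Count, for each primer position, how many aligned sequences
    carry a base different from the primer at that position."""
    counts = [0] * len(primer)
    for seq in aligned:
        # Slice out the alignment columns covered by the primer.
        window = seq[start:start + len(primer)]
        for i, (base, expected) in enumerate(zip(window, primer)):
            if base != expected:
                counts[i] += 1
    return counts

# Toy alignment (hypothetical); the primer "ACGT" sits at columns 2-5.
toy_alignment = ["TTACGTAA", "TTACGAAA", "TTGCGTAA", "TTACGTAA"]
print(mismatches_under_primer(toy_alignment, "ACGT", start=2))
```

Dividing each count by the number of sequences gives the per-position mismatch rate, comparable in form to the "SNPs / 239 sequences" annotations.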
