Comparative Genomics The Finale
Presentation Transcript

  1. Comparative Genomics The Finale Angela Pena, Ambily Sivadas, Amit Rupani, Shimantika Sharma, Juliette Zerick, Keerti Surapaneni, Artika Nath, Hema Nagrajan

  2. Outline Results • Goal 1 – PCR Assay • Goal 2 – Comparative genome analysis • Goal 3 – Haemolysis study • Goal 4 – Virulence factors • Discussion

  3. Goal 1 Identification and characterization of target genes for PCR Assay

  4. Identification of target genes • Schematic: forward (Fw) and reverse (Rv) primers flank a region that carries genes A, B and C in Hhae and genes A and C in NTHi, giving PCR products of different size • One copy of B or multiple copies? • Is A-C organized the same way in both organisms? • Identify candidate clusters/genes for assay development and conserved regions for primer/probe design

  5. Cluster Analysis – Genome Set

  6. Cluster statistics • Total clusters: 8402 • Total common to all genomes : 361 • Total unique to Hhae: 82 • Total unique to Hinf: 38 • Total unique to Pathogenic strains: 0 • Protein sequences were clustered using Blastclust
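
Counts like the ones on this slide can be derived from cluster membership with plain set operations. The following is a minimal Python sketch, not the project's pipeline. It assumes each cluster has already been reduced to the (non-empty) set of genome names that contribute at least one protein to it; the function name, genome names and toy clusters are made up for illustration.

```python
def classify_clusters(clusters, hhae, hinf):
    """Count clusters common to all genomes and clusters unique to one species.

    clusters: dict mapping cluster id -> non-empty set of genome names
    hhae, hinf: sets of genome names per species (hypothetical inputs)
    """
    all_genomes = hhae | hinf
    counts = {"total": len(clusters), "common": 0, "hhae_only": 0, "hinf_only": 0}
    for members in clusters.values():
        if members == all_genomes:
            counts["common"] += 1          # present in every genome
        elif members <= hhae:
            counts["hhae_only"] += 1       # only Hhae genomes contribute
        elif members <= hinf:
            counts["hinf_only"] += 1       # only Hinf genomes contribute
    return counts


# Toy example with two Hhae and two Hinf genomes (illustrative names only).
hhae = {"Hh1", "Hh2"}
hinf = {"Hi1", "Hi2"}
clusters = {
    "c1": {"Hh1", "Hh2", "Hi1", "Hi2"},  # common to all genomes
    "c2": {"Hh1", "Hh2"},                # unique to Hhae
    "c3": {"Hi2"},                       # unique to Hinf
    "c4": {"Hh1", "Hi1"},                # mixed, neither common nor unique
}
result = classify_clusters(clusters, hhae, hinf)
print(result)
```

Here "unique to a species" is taken to mean that no genome of the other species appears in the cluster; a stricter definition (present in every genome of the species) would need an equality test instead of a subset test.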

  7. Common clusters – Functional breakdown

  8. Target identification - Protocol 1 1. Take all proteins common to all 25 genomes • Common = Most conserved proteins

  9. Target identification - Protocol 1 1. Cluster Analysis: Take all proteins common to all 25 genomes • Common = Most conserved proteins 2. Compute and compare inter-cluster distances for Hhae vs. Hinf • Look for species specific patterns • Look for including unique genes

  10. Protocol II: BLAST Everything If a pistol just isn't working for you . . . • Our method: for every unique Hhae gene, we locate its corresponding contig. • We check the flanking regions (on the contig) for conserved genes. • We then locate the conserved genes in the Hinf genome and see if they are adjacent. • Since BLASTn searches cast a wide net, this includes homologs.

  11. Flowchart: 1. Start with a set of Hhae genes. 2. Select a unique Hhae gene from the set. 3. Search the set of common Hhae/Hinf (conserved) genes for genes in the flanking regions. 4. Is there at least one conserved gene in each flanking region? NO: reject gene and start over. YES: continue. 5. Get the locations of the conserved flanking genes in the Hinf genome. 6. Are the conserved genes adjacent or “close enough” in the Hinf genome? NO: reject gene and start over. YES: we found a (more) unique gene!
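
The decision steps in this flowchart can be sketched in Python as follows. This is an illustration under stated assumptions, not the group's script: gene order on a contig is given as a simple list, positions in the Hinf genome as a dict of coordinates, and the `window` and `max_gap` parameters are hypothetical stand-ins for "flanking region" and "close enough", which the slides do not quantify.

```python
def flanking_conserved(contig_genes, target, conserved, window=3):
    """Return conserved genes found within `window` genes left and right of target."""
    i = contig_genes.index(target)
    left = [g for g in contig_genes[max(0, i - window):i] if g in conserved]
    right = [g for g in contig_genes[i + 1:i + 1 + window] if g in conserved]
    return left, right


def is_candidate(contig_genes, target, conserved, hinf_pos, window=3, max_gap=5000):
    """Apply the two flowchart tests to one unique Hhae gene."""
    left, right = flanking_conserved(contig_genes, target, conserved, window)
    if not left or not right:
        return False  # need at least one conserved gene in each flanking region
    a, b = left[-1], right[0]  # conserved genes nearest to the target
    if a not in hinf_pos or b not in hinf_pos:
        return False  # flanking gene could not be located in Hinf
    # "Close enough" is assumed to mean within max_gap bases (hypothetical cutoff).
    return abs(hinf_pos[a] - hinf_pos[b]) <= max_gap


# Toy data: u1 is a unique Hhae gene flanked by conserved genes g2 and g3.
contig = ["g1", "g2", "u1", "g3", "g4"]
conserved = {"g2", "g3"}
near = is_candidate(contig, "u1", conserved, {"g2": 1000, "g3": 2500})
far = is_candidate(contig, "u1", conserved, {"g2": 1000, "g3": 90000})
print(near, far)
```

Only the nearest conserved gene on each side is compared here; a fuller version might test every left/right pair.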

  12. PCR Assay: Results • Target 1 • No duplication was found for these genes • Genes: fatty acid/phospholipid synthesis protein, 50S ribosomal protein, nucleic acid binding protein (hypothetical), 3-oxoacyl-(acyl carrier protein) synthase III • Gene order: Hh A B C D; NTHi A B D • PCR product sizes labelled for Hh and NTHi: 1020 bp, 170 bp, 1250 bp, 380 bp

  13. PCR Assay: Results • Target 2 • Genes: purine nucleoside phosphorylase, predicted membrane protein, fructose-bisphosphate aldolase • Gene labels in the schematic: A to E for Hh and NTHi, with positions numbered 1 to 6 • PCR product sizes labelled for Hh and NTHi: 1451 bp, 1934 bp

  14. Target validation by in silico PCR Step 1: Multiple Sequence Alignment by ClustalW2 - Overview • Non-typeable H. influenzae: 19 strains + 1 typeable • H. haemolyticus: 5 strains • Alignment positions marked: 1, 870, 1775, 2749 • Target 1: 905 nt

  15. Step 2: Phylogenetic analysis • Neighbor-joining tree by percentage identity, using Jalview

  16. Step 3: Finding primers • Forward: 5’-CTCACTTACGCCACCACGTA-3’ • Reverse: 3’-TGCAACAATAATCAGTTCAATATCT-5’ • Non-typeable H. influenzae: 20 strains • H. haemolyticus: 5 strains • Alignment positions marked: 1, 870, 1775, 2749

  17. In silico PCR Analysis • Non-typeable H. influenzae AAZD00000000: product length 487 • H. haemolyticus M21621: product length 1354
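
The idea behind an in silico PCR product length can be shown with a short Python sketch. It is a simplification and not the tool used for the slides: primers must match the template exactly, only the forward strand of the template is searched, and both primers are passed in 5'->3' orientation. The template and primers below are toy sequences, not the project's primers.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, fwd, rev):
    """Return the product length for exact primer matches, or None if no product.

    fwd and rev are both written 5'->3'. The reverse primer anneals to the
    forward strand at the reverse complement of its own sequence.
    """
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Toy template: 2 nt + forward site (6) + spacer (10) + reverse site (7) + 2 nt.
template = "GG" + "ACGTAC" + "TTTTTTTTTT" + "GATTACA" + "CC"
fwd_primer = "ACGTAC"
rev_primer = reverse_complement("GATTACA")  # primer as it would be ordered, 5'->3'
product = in_silico_pcr(template, fwd_primer, rev_primer)
print(product)  # 23
```

Running the same primer pair against each genome and comparing the returned lengths mirrors the different product sizes reported for the two species.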

  18. In silico PCR Analysis Sequence (5'->3') FORWARD

  19. MSA – Target 2 • Alignment positions 1 to 5372 • Non-typeable H. influenzae: 20 strains • H. haemolyticus: 5 strains

  20. Goal 2 Comparative genomic analysis

  21. Horizontal Gene Transfer • Horizontal gene transfer (HGT), also called lateral gene transfer (LGT), refers to the transfer of genetic material between organisms. Alien Hunter • Predicts putative horizontally transferred regions • Standalone software • Available at • Usage: ./alien-hunter <input_file> <output_file> • INPUT: raw genomic sequence • PREDICTION: HGT regions based on Interpolated Variable Order Motifs (IVOMs)

  22. .sco file

  23. Last time, we got many hits with varied scores that covered almost 90% of the genes in each genome, so we decided to place a threshold on the scores. • We studied the score distribution for each genome by plotting a histogram of its scores. • After studying all the histograms, we set a threshold of >70. Screenshot of M21621
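
The thresholding step can be sketched in Python as below. The sketch assumes the predicted regions have already been parsed into `(start, end, score)` tuples; it does not read the .sco file itself, and the regions, function names and bin width are illustrative only.

```python
from collections import Counter


def score_histogram(scores, bin_width=10):
    """Count scores per bin; keys are the lower edge of each bin."""
    counts = Counter(int(s // bin_width) * bin_width for s in scores)
    return dict(sorted(counts.items()))


def filter_regions(regions, threshold=70.0):
    """Keep regions whose score is strictly greater than the threshold."""
    return [r for r in regions if r[2] > threshold]


# Toy regions as (start, end, score); not real predictions.
regions = [
    (100, 5100, 35.2),
    (9000, 14000, 70.0),
    (20000, 27500, 88.9),
    (40000, 45000, 71.3),
]
hist = score_histogram([r[2] for r in regions])
kept = filter_regions(regions)
print(hist)
print(kept)
```

A region scoring exactly 70 is dropped, matching the ">70" wording on the slide.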

  24. HGT gene count

  25. Insertion elements • An Insertion element is a short DNA sequence that acts as a simple transposable element. • A transposable element (TE) is a DNA sequence that can change its relative position (self-transpose) within the genome of a single cell. The mechanism of transposition can be either "copy and paste" or "cut and paste".

  26. IS Finder

  27. FASTA sequences We retrieved the FASTA sequences by submitting the accession IDs to NCBI.

  28. BLAST We ran BLAST with these insertion sequences against each of the strains to get the location of the insertion sequences in each strain. A Perl script was written to extract the insertion sequences from their respective contigs in each strain.
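
The extraction step can be illustrated with a short sketch. It is written in Python rather than Perl and is not the script the group wrote. It assumes the BLAST results are in the standard 12-column tabular format (`-outfmt 6`), where columns 9 and 10 are the subject start and end, and that the contigs are already loaded into a dict; the contig and hit below are toy data.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T in either case)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def parse_blast_tab(lines):
    """Parse BLAST tabular (outfmt 6) lines, keeping only the fields used here."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append({"query": f[0], "subject": f[1], "pident": float(f[2]),
                     "sstart": int(f[8]), "send": int(f[9])})
    return hits


def extract_hit(contigs, hit):
    """Cut the matched region out of its contig, in the orientation of the query."""
    seq = contigs[hit["subject"]]
    lo, hi = sorted((hit["sstart"], hit["send"]))
    region = seq[lo - 1:hi]  # BLAST coordinates are 1-based and inclusive
    # sstart > send means the hit lies on the reverse strand of the contig.
    return reverse_complement(region) if hit["sstart"] > hit["send"] else region


# Toy contig and two hits on it (plus strand, then minus strand).
contigs = {"contig1": "AAACCGTGGTTT"}
blast_lines = [
    "IS_toy\tcontig1\t100.0\t6\t0\t0\t1\t6\t4\t9\t1e-5\t12.0",
    "IS_toy\tcontig1\t100.0\t6\t0\t0\t1\t6\t9\t4\t1e-5\t12.0",
]
hits = parse_blast_tab(blast_lines)
extracted = [extract_hit(contigs, h) for h in hits]
print(extracted)
```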

  29. Comparative Analysis Table

  30. M19107 – Circular alignment using BRIG

  31. M19501 – Circular alignment using BRIG

  32. M21127 – Circular alignment using BRIG

  33. M21621 – Circular alignment using BRIG

  34. M21639 – Circular alignment using BRIG

  35. M21709 – Circular alignment using BRIG

  36. Goal 3 Identification and Characterization of Haemolysin in Hhae

  37. AIM #1 Look for the hemolysin BA operon in the H. haemolyticus strains and characterize it as present/absent in the hemolytic and non-hemolytic strains

  38. HEMOLYSIN • Haemophilus ducreyi requires two adjacent genes, hhdB and hhdA, for hemolysis. • hhdB encodes an outer membrane protein, which is required for secretion and activation of the hemolysin structural protein, hhdA. • Once secreted, hhdA interacts with target cell membranes, oligomerizes, and forms pores 2.5 to 3.0 nm in diameter, which lyse the target cell. TWO PROTEIN SECRETION SYSTEM

  39. OUR STRATEGY • Downloaded the FASTA files of all hemolysin protein sequences of the Pasteurellaceae family from the NCBI protein database. • Ran BLAST with the predicted protein sequences of the six strains against these. Cut-off threshold: Identity 70%, Coverage 80%
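
The cut-off can be expressed as a small filter. This is a sketch under an assumption the slides do not spell out: coverage is taken here as the percentage of the query length spanned by the alignment. The function name and the example values are hypothetical.

```python
def passes_cutoff(pident, qstart, qend, query_len,
                  min_identity=70.0, min_coverage=80.0):
    """Apply the identity and coverage cut-offs to one BLAST hit.

    Coverage is assumed to be the aligned share of the query, in percent.
    """
    coverage = 100.0 * (qend - qstart + 1) / query_len
    return pident >= min_identity and coverage >= min_coverage


# Toy hits on a 100-residue query.
good = passes_cutoff(85.0, 1, 95, 100)          # 85% identity, 95% coverage
low_identity = passes_cutoff(65.0, 1, 100, 100)  # identity below 70%
low_coverage = passes_cutoff(90.0, 10, 60, 100)  # only 51% of the query aligned
print(good, low_identity, low_coverage)
```

If coverage were instead defined relative to the database hemolysin sequence, the same function applies with subject coordinates and subject length.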

  40. RESULTS All hits had 70% or more identity and 95-100% coverage

  41. AIM #2 • Characterize the domains/motifs/residues in hemolysin. • Depict the secondary structures in hemolysin. • Predict the 3D structure of hemolysin.

  42. SIGNAL PEPTIDE & HAEMAGGLUTINATION ACTIVITY DOMAIN • Figure labels: N-terminus, signal peptide, haemagglutination activity domain • A signal peptide (25 aa) transports the hemolysin to the outer membrane or periplasm. LipoP predicts a signal peptidase I (SPase I) cleavage site at 25-26. NOT A LIPOPROTEIN • Haemagglutination activity domain: suggested to be a carbohydrate-dependent haemagglutination activity site, which is found in a range of haemagglutinins and haemolysins

  43. HAEMAGGLUTININ REPEAT • The haemagglutinin repeat is a highly divergent repeat that occurs in a number of proteins implicated in cell aggregation

  44. TPS DOMAIN All TPS-secreted proteins contain a distinctive N-proximal module essential for secretion, the TPS domain. TpsA proteins display two conserved regions, C1 and C2, and two less-conserved regions, LC region. The motifs ANPNL and NPNGIS are found in this region. Examples: the hemolysins/cytolysins ShlA of Serratia marcescens, HpmA of Proteus mirabilis, EthA of Edwardsiella tarda, HhdA of Haemophilus ducreyi, the large supernatant proteins LspA1 and LspA2 of H. ducreyi, and the HecA adhesin of E. chrysanthemi. Clantin et al., 2004. The crystal structure of filamentous hemagglutinin secretion domain and its implications for the two-partner secretion pathway. PNAS.

  45. Does the TPS domain exist in H. haemolyticus strains? • Figure labels: H. haemolyticus strains 21127, 21621 and 19107, compared with Fha30, HhdA, EthA, ShlA, HpmA, LspA1 and LspA2

  46. CONSERVED RESIDUES IN TPS DOMAIN • Motifs labelled: ANPNL, NPNLGI • NPNL & NPNGI: these motifs form type I β-turns, which might play important stabilizing roles. • The conserved residues of the TPS domain serve to drive the folding of the TPS domain into a β-helix and to stabilize the helix. • Domain boundaries: TPS HAD 39-159 (Pfam) or TPS 39-270
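
Checking a predicted protein for these motifs is a simple string search. The Python sketch below reports 1-based positions of NPNL and NPNGI using exact matching; the protein sequence is an invented toy string, not one of the H. haemolyticus hemolysins, and the function name is hypothetical.

```python
import re

MOTIFS = ("NPNL", "NPNGI")


def find_motifs(protein, motifs=MOTIFS):
    """Return 1-based start positions of each motif, allowing overlapping hits."""
    protein = protein.upper()
    found = {}
    for motif in motifs:
        # A lookahead pattern reports overlapping occurrences as well.
        pattern = f"(?={re.escape(motif)})"
        found[motif] = [m.start() + 1 for m in re.finditer(pattern, protein)]
    return found


# Toy sequence containing one copy of each motif.
toy_protein = "MKAANPNLGIVTTNPNGISQ"
positions = find_motifs(toy_protein)
print(positions)
```

Positions returned this way can be compared against the domain boundaries quoted on the slide to see whether a motif falls inside the TPS domain.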


  48. AIM #3 • Identify the domains in the hemolysin activator gene • Determine the secondary and 3D structure of hemolysin activator gene

  49. HEMOLYSIN ACTIVATOR PROTEIN: TRANSMEMBRANE PROTEIN • Membrane proteins: α-helical or β-barrel • Proteins of the β-barrel membrane protein class are located in the outer membrane of Gram-negative bacteria. These proteins have membrane-spanning segments formed by antiparallel β-strands, creating a channel in the form of a barrel that spans the outer membrane.

  50. DOMAINS IN HEMOLYSIN ACTIVATOR • Figure labels: SP, POTRA_2, activator domain • SP (LipoP): SPase I cleavage site between pos. 19 and 20. NOT A LIPOPROTEIN • POTRA_2: polypeptide-transport-associated domain. In ShlB this domain has a chaperone-like function over ShlA. • The activator domain in ShlB has been shown to interact with ShlA during secretion and to impose a conformational change in ShlA to form the active hemolysin. • ShlA/B: Serratia marcescens