
Bioinformatics Workshop 2 Identifying Unknown Genes …


cindij


Presentation Transcript


  1. Bioinformatics Workshop 2 Identifying Unknown Genes …
     • Open a web browser and type in the URL: informatics.gurdon.cam.ac.uk/online/workshops
     • Bookmark this page
     • Click on the link to the file: useful-websites.html
     • Bookmark this page too
     • It also contains links to the example sequence files used in the workshop, and the presentations themselves

  2. Genome Browsers Now that most model organisms have had their genomes sequenced, we can learn a lot more about how a gene works than we could from a BLAST search against the protein databases alone. Even if ‘your’ favourite genome is still just in ‘scaffolds’ and not yet assembled into chromosomes, we can still add a lot of value. The main task carried out on a genome before it is released to the user community is annotation. In practice this means adding gene models based on known expressed sequences, both from the same organism and from fairly closely related ones, and possibly also purely predicted models based on sequence composition analysis and ‘features’ such as start and stop codons and splice sites. Known mapping markers, SNPs and so on are then added as well. With ~3,000,000,000 nucleotides in the genome sequence (human), this presents a considerable challenge to display on a web browser page, which is of course the preferred option. Most genome browsers (software designed to display genome-based data in a web browser) have taken roughly the same approach, which we’ll take a quick look at…
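To make the idea of predicting gene models from ‘features’ a little more concrete, here is a minimal sketch that scans a DNA sequence for open reading frames (a start codon followed, in the same reading frame, by a stop codon). It is a toy under stated assumptions: it looks at the forward strand only, ignores introns and splice sites, and the function name `find_orfs`, its `min_codons` parameter and the example sequence are all invented for illustration. Real annotation pipelines use far richer models.

```python
# Toy ORF scan: forward strand only, no introns, no splice sites.
# All names and the example sequence are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs for forward-strand ORFs.

    Coordinates are 0-based with an exclusive end, and the stop codon
    is included in the reported interval.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it is long enough (codons incl. stop).
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


example = "CCATGAAATTTTAGGG"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

On a real genome a scan like this would report many spurious ORFs, which is one reason gene models are preferably based on aligned expressed sequences.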

  3. Gene model [Figure: gene model shown against the genome, with aligned cDNA and aligned ESTs]

  4. Schematic Genome Browser [Figure: Mus musculus, chromosome 12; genome coordinate axis marked 24000, 25000, 26000, 27000; navigate and zoom (+/-) controls; tracks: your sequence, genes, ESTs]
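The schematic can be thought of as a simple query: given the visible coordinate window, show every feature on each track that overlaps it. The sketch below illustrates that idea only; it is not how any particular genome browser is implemented. The `Feature` class, the `features_in_window` function and all feature names and coordinates are hypothetical, with 1-based inclusive coordinates assumed.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    track: str   # e.g. "Genes", "ESTs", "Your sequence"
    name: str
    start: int   # 1-based, inclusive (an assumption of this sketch)
    end: int     # 1-based, inclusive


def features_in_window(features, win_start, win_end):
    """Group the names of features overlapping the window by track."""
    by_track = {}
    for f in features:
        # Two closed intervals overlap if each starts before the other ends.
        if f.start <= win_end and f.end >= win_start:
            by_track.setdefault(f.track, []).append(f.name)
    return by_track


# Hypothetical features, loosely following the window drawn on the slide.
features = [
    Feature("Genes", "geneA", 24100, 25900),
    Feature("ESTs", "est1", 23000, 23900),
    Feature("ESTs", "est2", 26500, 27500),
    Feature("Your sequence", "query", 24500, 26200),
]

for track, names in features_in_window(features, 24000, 27000).items():
    print(track, names)
```

Zooming or navigating then amounts to changing `win_start` and `win_end` and repeating the query.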

  5. How to Use UCSC Browser

  6. Exercises 1. Find the web site for the Santa Cruz Genome Browser (sometimes called the Golden Path), and investigate the three genes for which you have the full-length cDNA sequence, or the protein sequence, in the file example-sequences.html
     >TNeu084i05
     • How many exons does the gene appear to have?
     • Has it been mapped already?
     • Are there any likely upstream regulatory elements (look for conservation across species)?
     • Are there other genes nearby?
     >TGas122d03
     • Is this a unique gene, or a member of a gene family?
     • What can we learn from the comparison with human genes?
     • Are there any differences between the gene model predicted from your cDNA, and the existing predictions?
     >hsp70-5
     • Starts with the protein sequence. How might this be better?
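The exon question is meant to be answered by eye in the browser, but the same count can be read from an exported alignment. The sketch below assumes the aligned cDNA is available as a single line in 12-column BED format, where column 10 is the number of blocks and columns 11 and 12 give block sizes and block starts relative to the feature start. The function names and the example line (including `chrDemo` and `my_cDNA`) are made up for illustration and are not workshop data.

```python
def exons_from_bed12(line):
    """Return exon (start, end) pairs from one BED12 line.

    Coordinates follow the BED convention: 0-based start, exclusive end.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected 12 tab-separated BED columns")
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # Block lists are comma-separated and may carry a trailing comma.
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    if not (len(sizes) == len(starts) == block_count):
        raise ValueError("block count does not match block lists")
    return [(chrom_start + s, chrom_start + s + size)
            for s, size in zip(starts, sizes)]


def introns_between(exons):
    """Return the gaps between consecutive exons as (start, end) pairs."""
    return [(a_end, b_start)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


# Hypothetical BED12 line: three aligned blocks, i.e. three apparent exons.
bed_line = "\t".join([
    "chrDemo", "1000", "5000", "my_cDNA", "0", "+",
    "1000", "5000", "0", "3", "200,300,500,", "0,1500,3500,",
])

exons = exons_from_bed12(bed_line)
print(len(exons), "exons:", exons)
print("introns:", introns_between(exons))
```

Each aligned block is only an apparent exon: small gaps in an alignment can also come from sequencing errors or gaps in the assembly, so the count is worth checking against the browser display.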

  7. Exercise 1. Results >TNeu084i05

  8. Exercises 2. Now go to the two other main genome browsers, Ensembl and NCBI. Find the Xenopus genome, and see whether you get the same sort of functionality from them. Use the same two sequences. Are there different features? Are they easier or harder to use?
