RDP – Capturing the Unclassified

diane

Presentation Transcript


  1. RDP – Capturing the Unclassified • Use only on data that can be publicly shared. These are not secure tools.

  2. Genboree RDP Output • Tutorial 2 Dataset • QIIME • chimeras removed • RDP • Sample Period

  3. Download files • Raw.results.tar.gz

  4. Unarchive and Decompress • Use 7zip • Seq.fna
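
The slide uses 7zip for this step. As a scripted alternative, a minimal sketch with Python's standard `tarfile` module might look like the following. The function name `extract_results` and the destination folder are illustrative assumptions, not part of the tutorial.

```python
import tarfile
from pathlib import Path


def extract_results(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" covers both steps at once: decompress (gzip) and unarchive (tar).
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names
```

For example, `extract_results("Raw.results.tar.gz", "raw_results")` would unpack the downloaded archive into a folder named `raw_results` (a hypothetical name). Only unpack archives from a source you trust.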

  5. Open in Bioedit

  6. In Bioedit: • Ctrl+A – to select all sequences • Shift+Ctrl+C – to copy all sequence titles • In Excel: • Paste into Excel. In Column B (or another column) • =LEFT(A1,number_of_characters_in_titles) • Ctrl+Shift+Down Arrow – to select the column down to the last row • Ctrl+D – to copy the formula to all cells below • Check your work. Select only your samples. Do not select blank cells. Copy the correct titles.
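
The title-trimming step above can also be done without the Bioedit/Excel round trip. The sketch below is one possible way, assuming (as the `LEFT` formula does) that the sample name occupies the first `n_chars` characters of every title. The function name and the sample titles are made up for illustration.

```python
def truncate_fasta_titles(fasta_text, n_chars):
    """Keep only the first n_chars characters of every FASTA title line."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Same idea as =LEFT(A1, n): keep the leading characters of the title.
            out.append(">" + line[1:1 + n_chars])
        else:
            # Sequence lines are passed through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


example = ">S1_001 extra info\nACGT\n>S2_042 more\nGGCC\n"
print(truncate_fasta_titles(example, 2))
```

With `n_chars=2` the two titles become `>S1` and `>S2`. As on the slide, check the result before using it: titles of unequal length will be cut at the wrong place.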

  7. In Bioedit: • Paste over the titles • Save as: your_filename.fas • In the pull-down menu, choose FASTA

  8. rdp.cme.msu.edu

  9. Make an Account

  10. For very tiny datasets

  11. very tiny datasets

  12. very tiny datasets • Do not navigate away

  13. For pyrosequenced datasets

  14. You can navigate away and pick up the results later.

  15. Check in while running?

  16. Done: Download

  17. What do you get back? • Confidence file • Classifications • Failed classifications: check this file. • Problems have happened if it is not empty. • Hierarchy

  18. Open classifications in Excel • Focus on Phylum for the tutorial. Any level can be used.

  19. For ease in the tutorial, condense sample periods

  20. Keep it Tidy • Cut out what isn’t needed or being used.

  21. Confidence in the Classification • Sort on the confidence level • Odd groups • Leave in or take out? • Replace those below your confidence level with an "Unclassified_" label • =CONCATENATE($column$row,cell) • $ keeps the column or row static in your formula as you drag to multiple cells
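
The relabeling rule on this slide can be sketched in a few lines. This is an interpretation, not the presenter's exact method: it assumes the prefix is attached to the assigned name itself (which would give up to two categories per phylum), and the default cutoff of 0.8 is a hypothetical value to replace with your own confidence level.

```python
def label_by_confidence(name, confidence, threshold=0.8, prefix="Unclassified_"):
    """Return name unchanged if confidence meets the threshold, else prefix + name."""
    if confidence >= threshold:
        return name
    # Mirrors =CONCATENATE(prefix_cell, name_cell) in the spreadsheet.
    return prefix + name


print(label_by_confidence("Firmicutes", 0.95))
print(label_by_confidence("Firmicutes", 0.42))
```

The first call prints `Firmicutes` and the second prints `Unclassified_Firmicutes`.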

  22. Copy to a new column • Remove Duplicates

  23. Even at the Phylum Level • 60 categorical levels • (could be 2 for every known phylum)

  24. To count by sample and phylum classification • =COUNTIFS($K:$K,$O2,$A:$A,P$1) • Learn how to stop recalculation and restart it manually so you don't crash your machine! You can easily cause hours of computation on large matrices!
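
For large datasets, the same sample-by-phylum count can be made in a single pass outside the spreadsheet, which avoids the recalculation cost mentioned above. The sketch below assumes the data is available as `(sample, phylum)` tuples, one per sequence. The function name and the example records are illustrative only.

```python
from collections import Counter


def count_by_sample_and_phylum(records):
    """records: iterable of (sample, phylum) tuples, one per sequence."""
    counts = Counter(records)
    samples = sorted({sample for sample, _ in counts})
    phyla = sorted({phylum for _, phylum in counts})
    # One row per phylum, one column per sample, like the COUNTIFS grid.
    table = {p: {s: counts[(s, p)] for s in samples} for p in phyla}
    return samples, table


records = [
    ("A", "Firmicutes"),
    ("A", "Firmicutes"),
    ("B", "Firmicutes"),
    ("B", "Bacteroidetes"),
]
samples, table = count_by_sample_and_phylum(records)
print(samples)
print(table)
```

Combinations that never occur get a count of 0, so every row has a value for every sample.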

  25. Stop Automatic Recalculation • In the Options Menu • Under Formulas • Press F9 to recalculate manually

  26. Fill Formulas and Check Cells

  27. Copy Whole and Paste As Values

  28. Sum Rows and Sort On (Your Favorite) • Total is Customary • Can rearrange as needed
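
As a rough equivalent of summing rows and sorting on the total, the sketch below takes a count table shaped as `{phylum: {sample: count}}` (an assumed layout, with made-up numbers) and orders the phyla by their total, largest first.

```python
def sort_by_total(table):
    """table: {phylum: {sample: count}}. Return (phylum, total) pairs, largest first."""
    totals = {phylum: sum(row.values()) for phylum, row in table.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


table = {
    "Bacteroidetes": {"A": 0, "B": 1},
    "Firmicutes": {"A": 2, "B": 1},
}
print(sort_by_total(table))
```

Here Firmicutes (total 3) sorts ahead of Bacteroidetes (total 1). Swap the `key` function to sort on something other than the total.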

  29. Select Data and Titles Only

  30. Make a 100% Stacked Chart • Not very pretty
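
A 100% stacked chart plots each phylum's share of its sample rather than the raw count. The sketch below shows that normalization on its own, assuming a count table shaped as `{phylum: {sample: count}}`. The function name and numbers are illustrative.

```python
def to_percentages(table):
    """table: {phylum: {sample: count}}. Return the same shape with per-sample percentages."""
    sample_totals = {}
    for row in table.values():
        for sample, count in row.items():
            sample_totals[sample] = sample_totals.get(sample, 0) + count
    return {
        phylum: {
            # Guard against empty samples to avoid dividing by zero.
            sample: (100.0 * count / sample_totals[sample]) if sample_totals[sample] else 0.0
            for sample, count in row.items()
        }
        for phylum, row in table.items()
    }


table = {
    "Bacteroidetes": {"A": 0, "B": 1},
    "Firmicutes": {"A": 2, "B": 1},
}
print(to_percentages(table))
```

In this example sample A is 100% Firmicutes, while sample B is split 50/50, so every sample's column adds up to 100.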

  31. Switch Perspectives

  32. Size Correctly

  33. To Compare to Genboree • RDP must be run • png.result.tar.gz

  34. What did we learn?

  35. What did we learn?

  36. Some Problems Commonly Encountered • Column formatting is not always followed with RDP output. • To get a clean graph with each taxonomic level in one column, you may need to sort and remove sections of data. • Some have additional levels • Some have fewer levels of classification

  37. Additional Levels of Classification • Move over • Delete

  38. Fewer Levels of Classification • Common Trouble Makers • Bacteroidetes • Verrucomicrobia • Acidobacteria • Dehalococcoidetes • Cyanobacteria • Chloroplast • Deltaproteobacteria • OD1_genera_incertae_sedis • TM7_genera_incertae_sedis • Armatimonadetes • WS3_genera_incertae_sedis • Move over
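
The manual "move over" and "delete" fixes can be avoided if each assignment is placed by its rank label instead of its column position. The sketch below assumes each row is available as a flat list of (name, rank, confidence) triplets. Check that against your own classification file, since that layout is an assumption here. The rank list, function name, and confidence values are illustrative.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def align_by_rank(fields, ranks=RANKS):
    """fields: flat list [name, rank, confidence, name, rank, confidence, ...]."""
    by_rank = {}
    for i in range(0, len(fields) - 2, 3):
        name, rank, conf = fields[i], fields[i + 1], fields[i + 2]
        by_rank[rank] = (name, float(conf))
    # Ranks missing from a row become None; ranks not listed in `ranks`
    # (for example subclass) are dropped, so the columns always line up.
    return [by_rank.get(rank) for rank in ranks]


# A row with fewer levels: nothing between phylum and genus.
short_row = ["Bacteria", "domain", "1.0", "TM7", "phylum", "0.7",
             "TM7_genera_incertae_sedis", "genus", "0.7"]
print(align_by_rank(short_row))
```

For `short_row`, the class, order, and family positions come back as `None`, and the genus lands in the last column instead of sliding left.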
