
SEQUENCE RETRIEVAL SYSTEM SRS




  1. SEQUENCE RETRIEVAL SYSTEM SRS Tuomas Hätinen

  2. Motivation: sequencing information; genetics; structural biology; molecular biology; medicine; physiology; toxicology; gene expression

  3. Motivation
     • There are 3 main sequence retrieval systems:
       • SRS (highly recommended)
       • Entrez (easier to use but more limited)
       • DBGet (less recommended)
     • This is a workshop on using SRS
     • Start one of the servers below:
       • http://srs.ebi.ac.uk
       • http://csc-fserve.hh.med.ic.ac.uk/srs71
       • http://walnut.bioc.columbia.edu/srs7/
       • http://emb2.bcc.univie.ac.at:8080/srs/
       • http://oryx.ulb.ac.be:8080/srs
     • Full list of SRS servers available from: http://downloads.lionbio.co.uk/publicsrs.html

  4. What is SRS?: Introduction
     • Central resource for molecular biology data
     • Data retrieval system
       - More than 250 databanks have been indexed
       - More than 35 SRS servers over the WWW
     • Data analysis applications server
       - 11 protein applications
       - 6 nucleic acid applications
     • Uniform query interface on the web

  5. What is SRS?: History
     • 1990
       • Main author Dr. Thure Etzold
       • Development started at EMBL, Heidelberg
     • 1997
       • Moved to EBI in Cambridge. Development work was supported by various grants, among others from EMBnet.
     • 1998
       • Etzold and his group join LION Bioscience

  6. Why SRS?
     • Information retrieval
       • Easy way to retrieve information from sequence and sequence-related databases
       • Possibility to search for multiple words/other criteria
     • Linkage between different databases
       • E.g. find all primary structures with known three-dimensional structure
     • ... and much more

  7. Why SRS?

  8. SRS construction

  9. Comments
     • SRS is both a simple and a complicated tool with a number of features.
     • It can take a few days to get accustomed to.
     • We will run through some important features during the lecture.
     • We will apply these features as well as other new ones in the practical session.

  10. What can you do in SRS that you can’t do in UniProt?
      • Sophisticated searches, e.g. wildcard searches and regexp searches
      • SRS consolidates multiple databases
      • Many tools are available in SRS
      • Saving of projects
      • Why bother with UniProt? Speed.

  11. Temporary Projects
      • Queries and views are stored temporarily by the project manager
      • Temporary sessions last 24 hours
      • Useful when you:
        • do not need to keep your results
        • want to look something up quickly
        • run an occasional application
      • Click on the ‘Start’ paw on the SRS start page

  12. Some examples
      • /^glu/ will find terms beginning with ‘glu’
      • /ase$/ will find terms ending with ‘ase’
      • /c.t/ will find the words cat, cot, cut, ...
      • /c.*t/ will find terms beginning with ‘c’, followed by any number of characters, and ending with ‘t’
      • /sm[iy]th/ will find the words ‘smith’ or ‘smyth’
      • /rho[1-9]/ will find the word ‘rho’ followed by a digit from 1 to 9
      • /mue?ller/ will find ‘muller’ or ‘mueller’
      • NB. The ‘*’ symbol has two meanings:
        - within forward slashes ‘/’ it means the preceding group may be repeated zero or more times
        - outside forward slashes it is a wildcard standing for one or more characters of any value
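The patterns on this slide can be tried out with Python's `re` module. This is only a sketch: Python's regex engine is a stand-in here, and SRS's own engine may differ in details such as whether a pattern must match a whole index term. The term list is invented for illustration.

```python
import re

# Hypothetical index terms, invented for this illustration.
terms = ["glucose", "glutamate", "kinase", "cat", "cot", "cut", "coat",
         "smith", "smyth", "rho", "rho1", "muller", "mueller"]

def matching(pattern):
    """Return the terms in which `pattern` matches somewhere (re.search)."""
    return [t for t in terms if re.search(pattern, t)]

begins_glu = matching(r"^glu")      # terms beginning with 'glu'
ends_ase = matching(r"ase$")        # terms ending with 'ase'
c_any_t = matching(r"c.t")          # 'c', exactly one character, 't'
c_run_t = matching(r"c.*t")         # 'c', any run of characters, 't'
smith_smyth = matching(r"sm[iy]th") # 'i' or 'y' in the third position
rho_digit = matching(r"rho[1-9]")   # 'rho' followed by a digit 1-9
mueller = matching(r"mue?ller")     # optional 'e'

for name, hits in [("^glu", begins_glu), ("ase$", ends_ase), ("c.t", c_any_t),
                   ("c.*t", c_run_t), ("sm[iy]th", smith_smyth),
                   ("rho[1-9]", rho_digit), ("mue?ller", mueller)]:
    print(f"/{name}/ -> {hits}")
```

Note that `coat` is found by `/c.*t/` but not by `/c.t/`, and plain `rho` is not found by `/rho[1-9]/` because the digit is required.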

  13. SRS Query syntax
      • SRS indexes database records using a ‘word by word’ approach.
        - DE Human glutathione transferase
      • The SRS description index will contain the terms ‘human’, ‘glutathione’ and ‘transferase’.
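The ‘word by word’ idea can be pictured as a small inverted index. The sketch below is not SRS's implementation: the entry IDs, the second record, and the tokenisation rule (lower-case, split on anything that is not a letter or digit) are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical entry IDs mapped to their DE (description) text.
records = {
    "ENTRY1": "Human glutathione transferase",
    "ENTRY2": "Mouse glutathione peroxidase",
}

def build_index(records):
    """Map each lower-cased word to the set of entries that contain it."""
    index = defaultdict(set)
    for entry_id, description in records.items():
        # Assumed tokenisation: runs of letters/digits count as words.
        for word in re.findall(r"[A-Za-z0-9]+", description.lower()):
            index[word].add(entry_id)
    return index

index = build_index(records)
print(sorted(index))                 # every indexed term
print(sorted(index["glutathione"]))  # entries whose description has this word
```

Each word of the description becomes its own searchable term, which is why a query can name the words in any order.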

  14. Boolean operators
      • (&) AND: ‘human & glutathione & transferase’
      • (|) OR: ‘human | glutathione | transferase’
      • (!) BUTNOT: ‘human ! glutathione ! transferase’
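One way to read the three operators is as set operations on the entries found for each word. The sketch below assumes a toy index with invented entry IDs, and assumes that a chain of `!` is evaluated left to right; neither is taken from SRS itself.

```python
# Hypothetical word index: term -> set of entry IDs containing that term.
index = {
    "human": {"E1", "E2", "E4"},
    "glutathione": {"E1", "E3"},
    "transferase": {"E1", "E2", "E3"},
}

# human & glutathione & transferase: entries containing all three words.
and_hits = index["human"] & index["glutathione"] & index["transferase"]

# human | glutathione | transferase: entries containing at least one word.
or_hits = index["human"] | index["glutathione"] | index["transferase"]

# human ! glutathione ! transferase: entries with 'human' but neither
# of the other two words (assumed left-to-right evaluation).
butnot_hits = index["human"] - index["glutathione"] - index["transferase"]

print(sorted(and_hits), sorted(or_hits), sorted(butnot_hits))
```

Python's own `&` and `|` set operators happen to mirror the SRS symbols; BUTNOT corresponds to set difference.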

  15. Wildcards
      • These are useful when:
        • Searching for a group of words (e.g. words starting with ‘cell’ and ending with ‘ase’: cell*ase)
        • You are unsure how a word is spelt in a database
      • Two types:
        • * one or more characters of any value
        • ? a single character of any value
      • Any number of wildcards can be placed anywhere in a search word
      • Placing a wildcard at the start of a word or string may increase response time, because every word in the index has to be checked against the string
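The two wildcards can be emulated by translating them into a regular expression. This sketch follows the slide's definitions (`*` as one or more characters, `?` as exactly one); the helper names and the word list are hypothetical, and the actual SRS matcher may behave differently at the edges.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")           # one or more characters, per the slide
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts))

def wildcard_search(pattern, words):
    """Return the words that match the wildcard pattern in full."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

# Hypothetical index words.
words = ["cellulase", "cellobiase", "cellase", "cell", "kinase", "smith", "smyth"]

print(wildcard_search("cell*ase", words))
print(wildcard_search("sm?th", words))
print(wildcard_search("*ase", words))  # leading wildcard: every word is checked
```

With `*` read as "one or more", `cellase` is not found by `cell*ase` because nothing stands between `cell` and `ase`. The list comprehension scans every word, which is the cost the slide warns about for leading wildcards.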

  16. Regular expressions

  17. SRS Regular expressions
      • NB: Must appear within forward slashes (/)
      • Some operators:
        • ^ marks the start of a string: /^glu/ begins with ‘glu’
        • $ marks the end of a string: /ase$/ ends with ‘ase’
        • . (dot) is any single character
        • […] characters in square brackets are regarded as a set, any of which can be matched
        • [0-9] specifies a range of 0 to 9
        • * the preceding group may be repeated zero or more times
        • + the preceding group may be repeated one or more times
        • ? the preceding character/group occurs one or zero times
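The difference between the three repetition operators is easiest to see side by side. The sketch uses Python's `re` module as a stand-in for the SRS engine, and the `rho` terms are invented for illustration.

```python
import re

def full(pattern, word):
    """True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None

candidates = ["rho", "rho1", "rho12"]

# * : the digit may appear zero or more times.
star = [w for w in candidates if full(r"rho[0-9]*", w)]
# + : the digit must appear at least once.
plus = [w for w in candidates if full(r"rho[0-9]+", w)]
# ? : the digit may appear once or not at all.
optional = [w for w in candidates if full(r"rho[0-9]?", w)]

# The range [0-9] includes zero.
zero_in_range = full(r"rho[0-9]", "rho0")

print(star, plus, optional, zero_in_range)
```

Only `+` rejects the bare `rho`, and only `?` rejects `rho12`.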
