WP4: Current status
Paolo Romano & WP4 group
VII EBRCN GM, Berlin, 26-27/09/2004
WP4 objectives
Improved accessibility and interconnection
• Links to external resources
  • Literature, Sequence, Special interest databases
• Extracted databases
  • Available at interested SRS sites
• Inventory of data and usage
  • Local and remote search, sites’ map
Links to external resources
• Literature: Medline, Taxon
• Special interest: micro-organisms’ images, plasmids’ maps
• Sequences: EMBL Data Library
Links to Medline
• Syntax: add [PMID: <number>] after the bibliographic reference
• Links in place (> 7000):
  • Plasmids: LMBP (375), NCCB (30)
  • Cell lines: ICLC (294), DSMZ (905)
  • Fungi: CBS (454)
  • Yeasts: CBS (1132)
  • Phages: NCCB (30)
  • Literature reference file: DSMZ (3818)
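The [PMID: <number>] convention lends itself to mechanical extraction. A minimal sketch in Python, assuming the tag appears exactly in that form; the function names, the sample reference and the PubMed URL pattern are illustrative assumptions, not part of the CABRI setup:

```python
import re
from typing import List

# Matches the tag described on the slide: [PMID: <number>]
PMID_TAG = re.compile(r"\[PMID:\s*(\d+)\]")


def extract_pmids(reference: str) -> List[str]:
    """Return every PMID tagged in a bibliographic reference string."""
    return PMID_TAG.findall(reference)


def medline_url(pmid: str) -> str:
    """Build a link for a PMID (assumption: the public PubMed URL pattern)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


# Hypothetical reference, invented purely for illustration
sample = "Author A, Author B. Some Journal 1 (2000) 1-2 [PMID: 12345]"
links = [medline_url(p) for p in extract_pmids(sample)]
```

A reference without the tag simply yields an empty list, so untagged catalogue entries are left unlinked rather than causing an error.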
Special interest databases
• Plasmids’ maps
  • Syntax: new FDS field ‘External_links map <name>’
  • Links in place: Plasmids: LMBP (777)
• Images of micro-organisms
  • Syntax: new FDS field ‘External_links image <name>’
  • Links in place: none (waiting for the next catalogue update)
Linking to EMBL (i)
• Linking “on-the-fly” to the EMBL Data Library through SRS, without IDs, gave negative results:
  • Links differ from material to material and can use various EMBL fields: Organism (micro-organisms), Division (viruses and plasmids), Feature Table (definition of the source through Key, Qualifier, Description)
  • Annotation problems (e.g., missing spaces)
  • Indexing problems (e.g., use of dots)
Linking to EMBL (ii) (well known)
Example of an “on-the-fly” search:
• Searching for filamentous fungi strain CBS 100.20
• Involves: fungi & source & cbs 100.20
( ( ( [emblrelease-FtKey:source] & [emblrelease-FtQualifier:strain]
      & ( ( [emblrelease-FtDescription:cbs] & [emblrelease-FtDescription:100] )
          | [emblrelease-FtDescription:cbs100] )
      & [emblrelease-FtDescription:20] ) )
  < [emblrelease-Organism:fungi*] )
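To show how such an on-the-fly query has to be assembled for each strain, here is a sketch that builds a query of the same shape from a strain designation. The helper name and its parameters are hypothetical; only the field names and the overall query structure come from the slide, and the spacing of the result is not meant to match it character for character.

```python
def build_strain_query(acronym: str, number: str, organism: str,
                       lib: str = "emblrelease") -> str:
    """Assemble an SRS-style query string for a strain such as CBS 100.20."""

    def term(field: str, value: str) -> str:
        return f"[{lib}-{field}:{value}]"

    acr = acronym.lower()
    # EMBL indexes single words, so "100.20" is split at the dots
    head, *tail = number.split(".")

    # The acronym may be separated from the number ("CBS 100") or fused ("CBS100")
    spaced = f"( {term('FtDescription', acr)} & {term('FtDescription', head)} )"
    fused = term("FtDescription", acr + head)

    parts = [
        term("FtKey", "source"),
        term("FtQualifier", "strain"),
        f"( {spaced} | {fused} )",
    ]
    parts += [term("FtDescription", t) for t in tail]

    return f"( ( {' & '.join(parts)} ) < {term('Organism', organism + '*')} )"


query = build_strain_query("CBS", "100.20", "fungi")
```

Even in this compact form, the query depends on how each record happens to be annotated and tokenised, which is the weakness the slides point out.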
Linking to EMBL (iii)
• Agreement with EBI
• Identification of cross-references from CABRI catalogues to EMBL (and vice versa) by unique IDs
• Submission of the list to EBI
• ID-based links to CABRI included in the EMBL Data Library and distributed with it
• Use these links when linking from CABRI
• Links from LMBP to EMBL managed differently
Linking to EMBL (iv)
• Work started against EMBL release 79
• Common (new) SRS site for CABRI and EMBL
• Modified indexing -> common keys format
• SRS links established
• Preliminary list of references sent to collections
• Comments returned
Common site established
Common keys format
• CABRI indexing: by whole ID
  CBS 100.20 -> ‘CBS 100.20’
• EMBL indexing: by single words
  CBS 100.20 -> ‘CBS’ + ‘100’ + ‘20’
  CBS100.20 -> ‘CBS100’ + ‘20’
• Common indexing: name (letters only), optionally followed by a space, followed by a string (letters, numbers, dot, dash); punctuation removed
  CBS 100.20 -> ‘CBS10020’
  CBS100.20 -> ‘CBS10020’
• Special case (not currently managed): NCCB LMD and Phabagen bacteria catalogues
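The common indexing rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide, not the actual SRS parser; the function name is hypothetical, and strings that do not fit the pattern (such as the unmanaged special cases) are simply rejected.

```python
import re
from typing import Optional

# name (letters only), optional single space, then letters/digits/dot/dash
KEY_PATTERN = re.compile(r"^([A-Za-z]+) ?([A-Za-z0-9.\-]+)$")


def common_key(strain_id: str) -> Optional[str]:
    """Normalise a strain ID to the common key, or return None if it does not fit."""
    match = KEY_PATTERN.match(strain_id.strip())
    if match is None:
        return None
    name, rest = match.groups()
    # Remove the punctuation (dots and dashes) from the second part
    return name + re.sub(r"[.\-]", "", rest)


keys = [common_key(s) for s in ("CBS 100.20", "CBS100.20")]
```

Both spellings collapse to the same key, which is what allows a CABRI strain number and an EMBL feature description to be matched by a plain SRS link.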
SRS links EMBL - CABRI
#links Embl to Cabri Bact & Fun & Yeasts
$Link: [from:$EMBLRELEASE_DB to:$BCCM_LMG_DB fromField:$DF_FtDescription toField:$DF_CABRI_Strain_number]
$Link: [from:$EMBLRELEASE_DB to:$CBS_BACT_DB fromField:$DF_FtDescription toField:$DF_CABRI_Strain_number]
Automatic identification of links
Custom views (i)
Custom views (ii)
Links to EMBL: current status
• Almost ready for submission of the list of cross-references
• EBI objection: many databases, some of them small, instead of a single big one
• New proposal from EBI
  • Links added in the SRS site at EBI only
  • Links not searchable
  • Links not distributed with the EMBL Data Library
• Alternative proposals from us
  • Making CABRI virtual catalogues by resource type (bacteria, cell lines, …)
  • Making an intermediate database
SRS virtual libraries
• SRS virtual libraries
  • Include many member libraries
  • Appear and can be searched as a single database
  • Use the indexes of the member libraries
  • Member libraries must have a common data structure
• CABRI virtual libraries
  • Can be created for each resource type
    • Interconnected Bacteria DB
    • Interconnected Cell Lines DB
  • May be created for similar resource types
    • Interconnected Micro-organisms DB
Intermediate database
• An intermediate CABRI database would
  • Include very limited information: identification and name
  • Be linked by EMBL and link to the related CABRI catalogue
  • EMBL -> Intermediate db -> CABRI
• Example:
  Identification  CIP 70.34
  Name            Acinetobacter baumannii
  Identification  ECACC 88020401
  Name            Vero
  Identification  LMG 3589
  Name            Bacillus subtilis (Ehrenberg 1835) Cohn 1872 AL
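As a rough picture of the proposed intermediate database, the sketch below models its two-field records and the hop from an identification to the owning collection. The class, the index and the rule that the collection acronym is the first word of the identification are assumptions made for illustration; only the three sample records come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IntermediateRecord:
    identification: str  # e.g. "CIP 70.34"
    name: str            # e.g. "Acinetobacter baumannii"

    @property
    def collection(self) -> str:
        # Assumption: the collection acronym is the first word of the ID
        return self.identification.split()[0]


RECORDS = [
    IntermediateRecord("CIP 70.34", "Acinetobacter baumannii"),
    IntermediateRecord("ECACC 88020401", "Vero"),
    IntermediateRecord("LMG 3589", "Bacillus subtilis (Ehrenberg 1835) Cohn 1872 AL"),
]

# EMBL would point at this index; each hit then points on to a CABRI catalogue
INDEX: Dict[str, IntermediateRecord] = {r.identification: r for r in RECORDS}


def resolve(identification: str) -> Optional[IntermediateRecord]:
    """Look up an identification as referenced from an EMBL entry."""
    return INDEX.get(identification)
```

With a single database of this kind, EMBL needs only one link target, while the routing to the individual catalogues stays on the CABRI side.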
Extracted databases
• Intended to improve accessibility of CABRI catalogues by distributing them in a controlled frame
• Include a subset of information: CABRI MDS + link to the CABRI site (new field Full_details)
• Agreement established with EBI
• Preparation of extracted databases:
  • Setting up of a dedicated web site: http://export.cabri.org/
  • Setting up of an FTP site for distributing data and SRS configuration files: ftp.cabri.org (not anonymous)
  • Upload of catalogues to EBI: March 2004
  • Automatic updating by FTP through SRS Prisma
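The extraction step can be pictured as a simple projection of each catalogue record onto the MDS fields plus a computed Full_details link. The sketch below is illustrative only: the function name, the field names other than Full_details, the sample record and the example.org URL are all assumptions, since the slides do not list the MDS fields or the link format.

```python
from typing import Dict, Iterable
from urllib.parse import quote


def extract_record(record: Dict[str, str],
                   mds_fields: Iterable[str],
                   details_base_url: str) -> Dict[str, str]:
    """Keep only the MDS fields and add a Full_details link back to the full entry."""
    extracted = {f: record[f] for f in mds_fields if f in record}
    # Assumption: the full entry can be addressed by its strain number
    extracted["Full_details"] = details_base_url + quote(record["Strain_number"])
    return extracted


# Hypothetical record and field list, for illustration only
full = {"Strain_number": "LMG 3589", "Name": "Bacillus subtilis",
        "Other_field": "not exported"}
subset = extract_record(full, ["Strain_number", "Name"],
                        "http://example.org/details?id=")
```

The extracted copy hosted elsewhere therefore stays small, and anyone who needs the full entry or wants to place an order follows Full_details back to the CABRI site.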
Catalogues at EBI
CABRI views in place
Link to CABRI for details & orders
Quick searches at EBI (i)
Quick searches at EBI (ii)
Inventory of data usage and sets
• GlobalSearch on CABRI site available
• GlobalSearch on partners’ sites
  • Not stable
  • Partial (give me URLs!)
• Virtual BRCs’ Library
  • Map of sites’ maps
  • Includes links to archives/databases
  • PLEASE SUBMIT YOUR DATA!
That’s all, folks!
• MEDLINE
  • Links to Medline already in place for many catalogues
  • New links added with periodical updates
• EMBL
  • Common site and index keys in place
  • Implementation of links under study with EBI staff
• Other external links
  • Plasmids’ maps in place
  • Micro-organisms’ images ongoing
• Extracted databases
  • Procedure implemented
  • Dedicated web and FTP sites available
  • Uploaded to EBI in March 2004
• Inventory of data usage and data sets
  • Search on partners’ site contents (ht://dig) soon available
  • List of partners’ site contents (sort of “Map of sites’ maps”) under construction
Thoughts about the future (i)
• CABRI as it is
  • Many links to external databases are being set up and are already in place for some of the catalogues
  • Extracted databases have been uploaded to EBI
  • Integration made possible (mainly) by the adoption of SRS
  • CABRI sites are now well-known, appreciated and used network services
• GBIF perspective
  • GBIF has designed a nice and innovative architecture
  • A distributed architecture can help management by avoiding conversions and updates
  • It requires sound expertise and good computer skills, not always available at collections/BRCs
  • The ABCD Schema is not adequate for catalogues’ contents
Thoughts about the future (ii)
• We need to keep current links and set up new ones
  • Current links with the molecular biology world should be kept
  • SRS is an essential key for this connection
  • The Web Services based GBIF architecture must be taken into account for future links with the (quickly) evolving biodiversity information environment
• SRS is evolving
  • Since SRS 6, XML has been incorporated
  • With SRS 7, XML is essential (alternative to flat files)
  • With SRS 8, Web Services have been added, and SRS itself is able to provide Web Services and to access them remotely
Thoughts about the future (iii)
• Proposal
  • Start by extending the ABCD Schema to meet our needs
  • Continue with SRS and follow its evolution
  • Adopt the new SRS Web Services features as early as possible and start offering information to GBIF
  • Individual collections/BRCs willing to go autonomous can stop submitting data, provided they offer an agreed interface for remote access by the central SRS-based system
  • Finally, reach a mixed distributed/centralized architecture, based on SRS and offering both standard SRS services and Web Services