
Controller

The EuPathDB / GUS-WDK Search Strategy System


  1. The EuPathDB / GUS-WDK Search Strategy System

Cristina Aurrecoechea¹, Brian P. Brunk², Steve Fischer², Xin Gao², Omar S. Harb², Mark Heiges¹, Jessica C. Kissinger¹, Eileen T. Kraemer¹, Cary Pennington¹, David S. Roos², Chris Ross¹, Christian J. Stoeckert² & Charles Treatman²
¹Univ. Georgia, Athens GA, & ²Univ. Pennsylvania, Philadelphia PA

The EuPathDB suite of genome database web sites recently introduced a graphical search interface that motivates users to undertake dynamic computational experiments, exploring relationships across datasets to identify biologically meaningful genes and other entities. For example, users seeking novel therapeutic targets may wish to prioritize putative enzymes that distinguish pathogens from their hosts, and are expressed during appropriate developmental stages.

Strategies are initiated by running one of 80+ queries, and extended by adding additional searches, linked via Boolean operators represented graphically as Venn diagrams. Sub-strategies allow modular construction and tree structures, and searches may be extended using filters (e.g. by strain or species) and transforms (e.g. orthologs). A graphical display makes the overall logic obvious, and facilitates revision of individual steps, with changes propagated forward through the strategy. Users may name and save their strategies, creating protocols that can be shared with colleagues. (See, e.g., http://plasmodb.org/plasmo/im.do?s=2aa0454db6a6cca0.)

The strategy system has been subjected to extensive usability studies, and deployed on all EuPathDB databases (CryptoDB, GiardiaDB, PlasmoDB, ToxoDB, TrichDB and TriTrypDB). Although these sites have offered text-based Boolean operations for many years, usability analysis indicated that most users were not taking full advantage of that feature. Following release of the graphical Search Strategy system, the number of searches per visit dramatically increased.
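
The idea that revising one step propagates forward through a strategy can be pictured with a small sketch. It assumes a linear strategy in which each step's search returns a set of IDs and is combined with the running result by a Boolean operator. The `Step` class, the `run_strategy` function, the operator names and the toy gene IDs are all hypothetical and are not taken from the WDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set

# Hypothetical names throughout; this is an illustration, not the WDK API.

@dataclass
class Step:
    name: str
    search: Callable[[], Set[str]]   # returns the IDs matched by this step's search
    operator: Optional[str] = None   # None for the first step, else a key of OPERATORS

OPERATORS = {
    "union": set.union,
    "intersect": set.intersection,
    "minus": set.difference,
}

def run_strategy(steps: List[Step]) -> List[Set[str]]:
    """Return the running result after each step, evaluated left to right."""
    results: List[Set[str]] = []
    current: Set[str] = set()
    for index, step in enumerate(steps):
        hits = step.search()
        if index == 0:
            current = set(hits)
        else:
            current = OPERATORS[step.operator](current, hits)
        results.append(current)
    return results

# Toy strategy with invented gene IDs.
strategy = [
    Step("putative enzymes", lambda: {"g1", "g2", "g3", "g4"}),
    Step("expressed in stage", lambda: {"g2", "g3", "g5"}, "intersect"),
    Step("similar to host", lambda: {"g3"}, "minus"),
]
before = run_strategy(strategy)

# Revise step 2 only; re-running recomputes every later step from the new result.
strategy[1] = Step("expressed (wider window)", lambda: {"g1", "g2", "g3", "g5"}, "intersect")
after = run_strategy(strategy)

print(sorted(before[-1]))  # ['g2']
print(sorted(after[-1]))   # ['g1', 'g2']
```

Re-running the whole chain is the simplest way to get forward propagation; a real system would more plausibly cache step results and recompute only from the revised step onward.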
Response from our user community has been extremely positive, as investigators have discovered the power of combining datasets and making dynamic adjustments to define optimal parameters and highlight biologically-relevant relationships. With the accelerating growth in diversity and scale of available datasets, the potential for exploiting interrelationships increases dramatically, and we expect this interface to have a significant impact in bringing “genomic thinking” to a broad audience.

This system was developed using the GUS Web Development Kit (WDK), a schema-independent middleware system for generating genomics websites.

Challenge: exploit the power of integrated genome annotation, expression data, proteomics data, SNPs, etc.
Solution: Strategies… A Graphical Query Interface for Genomics Databases

Use Case

Use data in PlasmoDB to find parasite (Plasmodium) drug target genes.

This panel shows a schematic of a strategy, using queries and Booleans. The actual strategy is built below.

Transferases (E.C.)
[union] Kinase activity (GO)
[intersect] # (nested strategy)
[intersect] present in Haemosporida, not Mammals
[intersect] not under diversifying selection (SNPs)
[transform] orthology to any Plasmodium genes

# Nested Strategy
P.f. transcript expr. at 24 hours +/- 8
[union] P.f. transcript expr. in Trophozoites
[union] P.f. protein expr. in Trophozoites

The EuPathDB suite of databases covers genomic and functional genomics datasets for a variety of eukaryotic pathogens. Shown here is PlasmoDB, which contains the genus Plasmodium, including P. falciparum, the malaria parasite.

Build a Strategy

It’s Easy to Build a Strategy…
1. Run a query (choose from menu)
2. Add a step (another query)
3. Add more steps…

…Strategies are Powerful
4. Dynamically revise, add or delete steps.

A strategy can integrate data from genome annotation, expression, SNPs, proteomics, etc.
Save and browse strategies.
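
The use-case schematic above can be read as plain set operations. The sketch below assumes the steps apply left to right, that the nested strategy is evaluated first and then intersected as a single step, and that the orthology transform returns only the orthologs listed in a lookup table. Every gene ID, every set's contents and the `transform_by_orthology` helper are invented for illustration; none of it is PlasmoDB data.

```python
# Toy result sets, one per search in the schematic (invented IDs).
transferases = {"pf1", "pf2", "pf3"}                 # Transferases (E.C.)
kinase_activity = {"pf3", "pf4", "pf5"}              # Kinase activity (GO)
expr_24h = {"pf1", "pf4"}                            # transcript expr. at 24 hours +/- 8
transcript_troph = {"pf3", "pf4"}                    # transcript expr. in Trophozoites
protein_troph = {"pf5"}                              # protein expr. in Trophozoites
haemosporida_not_mammals = {"pf1", "pf3", "pf4", "pf5"}
not_diversifying = {"pf1", "pf3", "pf4"}

# Hypothetical ortholog table: P. falciparum gene -> genes in other Plasmodium species.
orthologs = {"pf3": {"pv3", "pb3"}, "pf4": {"pv4"}}

def transform_by_orthology(genes, table):
    """Replace each gene by its listed orthologs; genes without an entry drop out."""
    transformed = set()
    for gene in genes:
        transformed |= table.get(gene, set())
    return transformed

# Nested strategy: union of the three expression searches.
nested = expr_24h | transcript_troph | protein_troph

# Main strategy, applied left to right as drawn in the schematic.
candidates = transferases | kinase_activity
candidates &= nested
candidates &= haemosporida_not_mammals
candidates &= not_diversifying

across_plasmodium = transform_by_orthology(candidates, orthologs)

print(sorted(candidates))         # ['pf1', 'pf3', 'pf4']
print(sorted(across_plasmodium))  # ['pb3', 'pv3', 'pv4']
```

Treating the nested strategy as one pre-computed set is what lets it slot into the main chain as an ordinary step.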
Different types of strategies: Genes, Isolates, SNPs, Transcript assemblies, Chromosomes, Array Elements, ORFs, etc.
• Email a strategy link to colleagues.
• Use orthology to transform results to other species.
• Revise steps at any time… Changes propagate forward.
• Download customized reports of results.
• Nest strategies to add complexity.
• Choose from many available columns.
• Sort and move columns.
• View results from all or any species.

Web Dev Kit (WDK)
www.gusdb.org/wdk

[Figure: WDK architecture diagram. Labels: View (web); JSP and CSS; JSP Tag Library; JavaBeans (JSP compatible); Controller; Struts controller; Model; WDK Model (XML); WDK Model (Java Objects); WDK Engine; WDK Query Engine (Java); Query Cache; Sanity Test; Web Services Framework; Processes (e.g., BLAST); Genomics Database; Genomics Data; Genomics Data Denormalized For Query Speed; Strategies; User Login and Search History. Legend: You provide; WDK provides; Optional.]

WDK Implementation
• Runs on any relational database schema
• Model: configured by you in XML
• Abstracts DB to high-level Records (Genes, ORFs, etc.)
• Also specifies queries and returned columns
• Automated sanity testing
• Can talk to processes (BLAST) via a web services (WS) framework
• View: Tomcat, JSP, tag library, JavaScript, Ajax, CSS
• You embed JSP tags in your site and style them with CSS
• Controller: Struts

User perspectives on Strategies
• Computer-human interaction (CHI) studies during prototyping drove the design, and showed high user enthusiasm.
• Usage statistics show a 3-fold increase in the use of Booleans in the two months since release.
• User feedback has been very positive.

WDK Upcoming features
• Add genes to a “basket” to generate a report, add to a strategy as a step or send to a tool (e.g., multiple sequence alignment)
• Web services access to queries
• Assign weights to results from individual steps for improved filtering
• Transform a set of one type into another type based on genome span relations

EuPathDB is an NIAID Bioinformatics Resource Center. Supported by NIAID Contract No. HHSN266200400037C and The Bill & Melinda Gates Foundation.
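
The WDK Implementation points above (a model configured in XML that abstracts the database into records and specifies queries and their returned columns) can be pictured with a minimal sketch. The XML element and attribute names, the `load_queries` and `run_query` helpers, and the toy `genes` table are all invented here; they are not the real WDK model schema or API. SQLite stands in for whatever relational database a site actually uses.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical model fragment: element and attribute names are invented
# for this sketch and are NOT the real WDK model format.
MODEL_XML = """
<model>
  <recordClass name="Gene">
    <query name="GenesByProduct">
      <param name="keyword"/>
      <column name="gene_id"/>
      <column name="product"/>
      <sql>SELECT gene_id, product FROM genes
           WHERE product LIKE :keyword ORDER BY gene_id</sql>
    </query>
  </recordClass>
</model>
"""

def load_queries(xml_text):
    """Map 'RecordClass.queryName' to its parameters, columns and SQL."""
    queries = {}
    root = ET.fromstring(xml_text)
    for record in root.findall("recordClass"):
        for query in record.findall("query"):
            key = f"{record.get('name')}.{query.get('name')}"
            queries[key] = {
                "params": [p.get("name") for p in query.findall("param")],
                "columns": [c.get("name") for c in query.findall("column")],
                "sql": query.findtext("sql").strip(),
            }
    return queries

def run_query(conn, query, **params):
    """Run one configured query and return rows as column-name dictionaries."""
    missing = set(query["params"]) - set(params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    rows = conn.execute(query["sql"], params).fetchall()
    return [dict(zip(query["columns"], row)) for row in rows]

# Toy database standing in for a site's own schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genes (gene_id TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO genes VALUES (?, ?)",
    [("g1", "protein kinase"), ("g2", "hypothetical protein"), ("g3", "histidine kinase")],
)

queries = load_queries(MODEL_XML)
records = run_query(conn, queries["Gene.GenesByProduct"], keyword="%kinase%")
print(records)
```

Because the SQL lives in the configuration rather than in the code, the same two helpers would work against a different schema by editing only the XML, which is the sense of "schema-independent" this sketch tries to convey.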
