Requirements of a Taxonomy Database: Tcl-DB, a Prototype


Outline
• Requirements
• Hierarchy
• Alternative Search Terms: Synonyms and Vernaculars
• Alternative Spellings
• Alternative Classifications
• Tcl-DB Prototype System
• Tcl-DB Structure
• 2NF
• Extensible: Adding a new data source, e.g. NCBI
• Tcl-DB: UID Tracking
• Tcl-DB: Stats
• Utility and Further Work

1. Hierarchy

2. Alternative Search Terms: Synonyms and Vernaculars

3. Alternative Spellings: Caenorabditis elegans, C elegans and Caenorhabditis elegans

4. Alternative Classifications:

Tcl-DB Prototype System. Proposed Architecture

Tcl-DB: Logical Structure

Tcl-DB Physical Database Structure

Assertion: Resolving the M:M with an association entity

Node: Hierarchical Queries: Nested Set, Path and CONNECT BY
> select count(name_id) from node start with name_id = '100891' connect by prior name_id = parent_name_id;
> select count(name_id) from node where path like '/%';
> select count(name_id) from node where left_id between 1 and 9290;
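The three query styles on this slide can be compared side by side on a toy tree. The following is a minimal sketch, not the Tcl-DB implementation: it uses SQLite from Python, and because SQLite has no CONNECT BY, a recursive common table expression stands in for the Oracle syntax. The column names name_id, parent_name_id, path and left_id come from the slide; right_id, the path format and all rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical four-node hierarchy (invented rows):
#   1 -> 2 -> 3   and   1 -> 4
ROWS = [
    # name_id, parent_name_id, path, left_id, right_id
    (1, None, "/1", 1, 8),
    (2, 1, "/1/2", 2, 5),
    (3, 2, "/1/2/3", 3, 4),
    (4, 1, "/1/4", 6, 7),
]


def subtree_counts(root_id):
    """Count the nodes in the subtree rooted at root_id in three ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (name_id INTEGER PRIMARY KEY, parent_name_id INTEGER,"
        " path TEXT, left_id INTEGER, right_id INTEGER)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", ROWS)

    # 1. Parent pointers, walked with a recursive CTE (stand-in for CONNECT BY).
    by_parent = conn.execute(
        """
        WITH RECURSIVE subtree(name_id) AS (
            SELECT name_id FROM node WHERE name_id = ?
            UNION ALL
            SELECT n.name_id FROM node n
            JOIN subtree s ON n.parent_name_id = s.name_id
        )
        SELECT count(name_id) FROM subtree
        """,
        (root_id,),
    ).fetchone()[0]

    root_path, left_id, right_id = conn.execute(
        "SELECT path, left_id, right_id FROM node WHERE name_id = ?", (root_id,)
    ).fetchone()

    # 2. Materialized path: the root itself plus every path that extends it.
    by_path = conn.execute(
        "SELECT count(name_id) FROM node WHERE path = ? OR path LIKE ?",
        (root_path, root_path + "/%"),
    ).fetchone()[0]

    # 3. Nested set: descendants have left_id inside the root's interval.
    by_nested_set = conn.execute(
        "SELECT count(name_id) FROM node WHERE left_id BETWEEN ? AND ?",
        (left_id, right_id),
    ).fetchone()[0]

    conn.close()
    return by_parent, by_path, by_nested_set
```

All three counts should agree for any root. The path and nested-set forms answer the question with a single index range scan, at the cost of maintaining the extra columns whenever the tree changes.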

synonym_name and vernacular: subtypes, multi-valued attributes or weak entities?

Tcl-DB: 2NF

Tcl-DB: Procedures, Packages and Functions: Adding a new data source e.g. NCBI

Step 1: Build views: which names are already in the database?

Step 2: Move names from view to Tcl schema

Step 3: Fill the nodes table in tcl schema

Step 4: Fill synonym_name table in tcl schema

Step 5: Fill vernacular table in tcl schema
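Steps 1 and 2 can be illustrated with a small sketch. The slides do not show the actual views or procedures, so everything here is an assumption for illustration: the staging table ncbi_staging, the view ncbi_known_names, and the shape of the name table are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3


def load_new_names(conn):
    """Copy staged names that are not yet in `name`; return how many were added."""
    # Step 1 (assumed form): a view of staged names already in the database.
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS ncbi_known_names AS
        SELECT s.name_text
        FROM ncbi_staging s
        JOIN name n ON n.name_text = s.name_text
        """
    )
    before = conn.execute("SELECT count(*) FROM name").fetchone()[0]
    # Step 2 (assumed form): move only the names the view does not list.
    conn.execute(
        """
        INSERT INTO name (name_text)
        SELECT DISTINCT s.name_text
        FROM ncbi_staging s
        WHERE s.name_text NOT IN (SELECT name_text FROM ncbi_known_names)
        """
    )
    conn.commit()
    after = conn.execute("SELECT count(*) FROM name").fetchone()[0]
    return after - before


def demo_load():
    """Build a toy schema with invented rows and run the load once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE name (name_id INTEGER PRIMARY KEY, name_text TEXT UNIQUE)"
    )
    conn.execute("CREATE TABLE ncbi_staging (name_text TEXT)")
    conn.execute("INSERT INTO name (name_text) VALUES ('Homo sapiens')")
    conn.executemany(
        "INSERT INTO ncbi_staging VALUES (?)",
        [("Homo sapiens",), ("Mus musculus",), ("Mus musculus",)],
    )
    added = load_new_names(conn)
    names = [r[0] for r in conn.execute("SELECT name_text FROM name ORDER BY name_text")]
    conn.close()
    return added, names
```

Steps 3 to 5 would follow the same pattern for the node, synonym_name and vernacular tables, resolving each staged row to a name_id once the names are in place.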

Tcl-DB: UID Tracking
• After the name data load:
• Run two joins on name and nids_mv
• NIDs – name_id when the name_text exists
• Null – name_id when the name_text does not exist
• Update name and give all new names a NID
• Update name and give all names their original NID
• Refresh the NID_view
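One possible reading of the UID tracking steps is sketched below. It is an interpretation, not the Tcl-DB code: the nid column on name, the layout of nids_mv, and the rule that new NIDs continue after the highest existing one are all assumptions, and a plain SQLite table stands in for the materialized view.

```python
import sqlite3


def assign_nids(conn):
    """Restore original NIDs for known names and issue fresh NIDs for new ones."""
    # Names whose name_text is already tracked get their original NID back.
    conn.execute(
        """
        UPDATE name
        SET nid = (SELECT m.nid FROM nids_mv m WHERE m.name_text = name.name_text)
        WHERE nid IS NULL
        """
    )
    # Names still without a NID are new; number them after the current maximum.
    next_nid = conn.execute(
        "SELECT COALESCE(MAX(nid), 0) + 1 FROM nids_mv"
    ).fetchone()[0]
    new_rows = conn.execute(
        "SELECT name_id FROM name WHERE nid IS NULL ORDER BY name_id"
    ).fetchall()
    for offset, (name_id,) in enumerate(new_rows):
        conn.execute(
            "UPDATE name SET nid = ? WHERE name_id = ?", (next_nid + offset, name_id)
        )
    # "Refresh" the tracking table so the next load sees the new names too.
    conn.execute("DELETE FROM nids_mv")
    conn.execute("INSERT INTO nids_mv (name_text, nid) SELECT name_text, nid FROM name")
    conn.commit()


def demo_nids():
    """Simulate a reload in which name_id values changed but names did not."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE name (name_id INTEGER PRIMARY KEY, name_text TEXT, nid INTEGER)"
    )
    conn.execute("CREATE TABLE nids_mv (name_text TEXT PRIMARY KEY, nid INTEGER)")
    conn.executemany(
        "INSERT INTO nids_mv VALUES (?, ?)",
        [("Homo sapiens", 7), ("Mus musculus", 9)],
    )
    conn.executemany(
        "INSERT INTO name (name_id, name_text) VALUES (?, ?)",
        [(101, "Mus musculus"), (102, "Homo sapiens"), (103, "Caenorhabditis elegans")],
    )
    assign_nids(conn)
    result = dict(conn.execute("SELECT name_text, nid FROM name"))
    conn.close()
    return result
```

The point of the design is that name_id may change on every reload, while the NID stays attached to the name text and so remains usable as a stable external identifier.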

Tcl-DB: Utility and Further Work
• Computing interesting stats:
• How much overlap is there between ITIS and NCBI?
• How many names are unique to NCBI?
• How many of these are binomials vs. ‘environmental sample 256’?
• How many of these names can be matched allowing for 1–3 letter mismatches?
• NCBI taxonomy: data quality, integrity and usability?
• Transitively closing the Synonyms table and Vernacular table
• Building an interface
• Spell checkers
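The question about matching names with 1 to 3 letter mismatches could be approached with edit distance. The sketch below uses Levenshtein distance, which is one reasonable choice and not something the slides specify; the function names and the sample name list are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-letter insertions, deletions
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def near_matches(query, names, max_mismatches=3):
    """Return (distance, name) pairs within max_mismatches, closest first."""
    q = query.lower()
    scored = [(edit_distance(q, n.lower()), n) for n in names]
    return sorted((d, n) for d, n in scored if d <= max_mismatches)


# The misspelling from the Alternative Spellings slide, checked against
# an invented list of known names.
known = ["Caenorhabditis elegans", "Homo sapiens"]
hits = near_matches("Caenorabditis elegans", known)
```

Comparing every unmatched name against every known name is quadratic, so a real run over NCBI-sized data would need some pre-filtering, for example by name length or genus prefix.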

Lots of Questions?
How do we use this to build taxonomically aware databases?
How about updates to the data?
Database links, Web services, simple DB cross-references?
Use the GenBank model?
Open to suggestions/ideas!
Do we need to think about: PhyloCode? Type specimens?
