
InstantJChem: a flexible chemical database system




  1. InstantJChem: a flexible chemical database system • G. Marcou, D. Horvath • Laboratoire d’Infochimie, Université de Strasbourg, 1 rue Blaise Pascal, 67000 Strasbourg

  2. Introduction • The goal is to present InstantJChem for the storage and manipulation of chemical information • General presentation • Database search • Creation of a database from scratch

  3. What is a database? • A database stores data on a specific subject in an organized form. • A relational database stores information in tables that reference one another. • A relational database management system (RDBMS) is software that manages relational databases. • InstantJChem is neither a database nor an RDBMS.
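
To make "tables that reference one another" concrete, here is a minimal sketch using SQLite from the Python standard library. It is not how InstantJChem lays out its own tables: the table names (`compound`, `measurement`), the column names, and the two example rows are assumptions chosen only for illustration.

```python
import sqlite3

# In-memory database; SQLite plays the role of the RDBMS in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces references when asked

# Two hypothetical tables: measurement rows point back to compound rows.
conn.execute(
    "CREATE TABLE compound ("
    " id INTEGER PRIMARY KEY, name TEXT NOT NULL, smiles TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE measurement ("
    " id INTEGER PRIMARY KEY,"
    " compound_id INTEGER NOT NULL REFERENCES compound(id),"
    " property TEXT NOT NULL, value REAL NOT NULL)"
)

conn.executemany(
    "INSERT INTO compound VALUES (?, ?, ?)",
    [(1, "benzene", "c1ccccc1"), (2, "pyridine", "c1ccncc1")],
)
conn.executemany(
    "INSERT INTO measurement (compound_id, property, value) VALUES (?, ?, ?)",
    [(1, "boiling_point_C", 80.1), (2, "boiling_point_C", 115.2)],
)

# The inter-reference is what lets a join reassemble the information.
rows = conn.execute(
    "SELECT c.name, m.value FROM compound c"
    " JOIN measurement m ON m.compound_id = c.id ORDER BY c.id"
).fetchall()

# A measurement that points at a non-existent compound is rejected by the RDBMS.
try:
    conn.execute(
        "INSERT INTO measurement (compound_id, property, value)"
        " VALUES (99, 'boiling_point_C', 0.0)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The point of the sketch is the division of labour: the RDBMS stores the rows and enforces the references, while a tool on top of it only has to present them.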

  4. What is InstantJChem? • InstantJChem is a user-friendly interface between an RDBMS, chemical information, and the user. (Diagram labels: User, RDBMS, Chemical Information.)

  5. Key concepts of InstantJChem • Projects • Schema • Databases and Tables • Entities • Data Trees • Views

  6. Exercise 1 Create a new project named IJCExercises…

  7. Key concept: Project • A project contains resources and connections to one or more databases.

  8. Exercise 1 …and import the file SC100.SDF into it…

  9. Key concept: Schema • A schema (database) contains the connection to a database and special tables (JChemProperties).

  10. Key concept: Database and Tables • The database and its tables are managed by the RDBMS. • They are what actually stores the information.

  11. What can be stored

  12. Key concept: Entities • An entity is a representation of data. • It is a single interface to conceptually different types of tables (Standard, Chemical, SQL, Extractions, etc.).

  13. Key concept: Data Trees • A data tree is a collection of entities and views. • It organizes information as a hierarchy (parent-child relationships between entities).

  14. Exercise 1 …customize a browser for it.

  15. Key concept: Views • A view is an interface to data. • For simple data, a spreadsheet view is suitable. • For complex relational data, a form is required.

  16. Exercise 2 In the SC100 database, search for molecules containing fluorobenzene and for molecules containing pyridine. Use Substructure or Similarity search.

  17. Exercise 2 (results) • Hit counts shown on the slide, in order: substructure search 20 hits, similarity search 0 hits; substructure search 14 hits, similarity search 0 hits. • Similarity search uses the Chemical Hashed Fingerprints defined at database creation.
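
The slides do not say why similarity search returns no hits here. One plausible reading is that a small query fragment shares only a small fraction of its fingerprint bits with a much larger molecule. The sketch below illustrates that effect with the Tanimoto coefficient, assuming Tanimoto is the similarity metric in use; the bit positions and the 0.7 threshold are invented for illustration and do not come from the SC100 data.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / len(union)


# Hypothetical fingerprints: a small query fragment and a larger molecule
# that contains every pattern of the query plus many others.
query_bits = {1, 2, 3}
molecule_bits = set(range(1, 21))

# Substructure screening asks: are all query bits also set in the molecule?
passes_substructure_screen = query_bits <= molecule_bits

# Similarity asks: what fraction of all set bits do the two share?
score = tanimoto(query_bits, molecule_bits)

# Hypothetical similarity threshold, not a documented InstantJChem default.
THRESHOLD = 0.7
is_similarity_hit = score >= THRESHOLD
```

With these toy values the molecule passes the substructure screen but scores only 3/20 on similarity, so it would not be reported as a similarity hit.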

  18. Chemical Hashed Fingerprints (CHF) • Pattern length: number of bonds in a pattern • Fingerprint length: total number of bits used to store the fingerprint • Bits per pattern: number of bits each pattern sets • CHF are an efficient annotation that accelerates structure search. (Source: www.chemaxon.com)
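
The three parameters can be illustrated with a toy hashed fingerprint. This is a sketch of the general technique, not ChemAxon's implementation: the molecule representation (a list of atom symbols plus a list of bonds), the use of SHA-256 as the hash, and the function names `linear_paths` and `hashed_fingerprint` are all assumptions, and bond orders and aromaticity are ignored.

```python
import hashlib


def linear_paths(atoms, bonds, max_bonds):
    """Enumerate linear atom paths with at most max_bonds bonds, as strings."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    patterns = set()

    def walk(path):
        labels = [atoms[i] for i in path]
        # A path and its reverse are the same pattern: keep one spelling.
        patterns.add(min("-".join(labels), "-".join(reversed(labels))))
        if len(path) - 1 == max_bonds:  # pattern length limit reached
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return patterns


def hashed_fingerprint(patterns, fp_length, bits_per_pattern):
    """Fold patterns into an integer bit field of fp_length bits."""
    bits = 0
    for pattern in patterns:
        for k in range(bits_per_pattern):
            digest = hashlib.sha256(f"{k}:{pattern}".encode()).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % fp_length)
    return bits


# Toy graphs: a six-membered carbon ring, and the same ring with one F attached.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
benzene = (["C"] * 6, ring)
fluorobenzene = (["C"] * 6 + ["F"], ring + [(0, 6)])

PATTERN_LENGTH, FP_LENGTH, BITS_PER_PATTERN = 4, 512, 2
query_patterns = linear_paths(*benzene, PATTERN_LENGTH)
target_patterns = linear_paths(*fluorobenzene, PATTERN_LENGTH)
query_fp = hashed_fingerprint(query_patterns, FP_LENGTH, BITS_PER_PATTERN)
target_fp = hashed_fingerprint(target_patterns, FP_LENGTH, BITS_PER_PATTERN)

# Screening: a target can only contain the query if every query bit is set in it.
may_contain_query = (query_fp & target_fp) == query_fp
```

The screen can only rule candidates out: because different patterns may collide on the same bit, a target that passes still needs an exact graph match. That cheap bitwise rejection is what makes the fingerprint an accelerator for structure search.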

  19. Exercise 3 Combine molecules 25 and 89 into a pseudo-molecule to perform a superstructure query.

  20. Exercise 4 Use compound 46 as a Full and Full fragment query to search the database. Repeat after removing the bromide from the query.

  21. Structure Searches www.chemaxon.com

  22. Exercise 5 Search for benzene-containing compounds whose name contains “pyrimidin” and which are annotated as “Good” for aqueous solubility.

  23. Exercise 6 Search for compounds with at least one aromatic ring containing at least one nitrogen atom.

  24. Exercise 7 Search for compounds with MolWeight > 200 that do not contain a benzene ring.

  25. Exercise 8 Search for compounds with MolWeight > 200, then for compounds without a benzene ring, and then take the union of the two hit lists.
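
Exercises 7 and 8 differ only in how the two conditions are combined: one query with both conditions is an intersection, while merging two separate hit lists is a union. A minimal sketch with Python sets follows; the four records and their values are invented toy data, not compounds from SC100.

```python
# Hypothetical records: compound id -> (MolWeight, contains a benzene ring)
compounds = {
    1: (250.3, True),
    2: (180.2, False),
    3: (310.4, False),
    4: (120.1, True),
}

# Two independent hit lists, one per condition.
heavy = {cid for cid, (mw, _) in compounds.items() if mw > 200}
no_benzene = {cid for cid, (_, has_ring) in compounds.items() if not has_ring}

# Exercise 7: both conditions in one query (logical AND).
exercise_7_hits = heavy & no_benzene

# Exercise 8: union of the two hit lists (logical OR).
exercise_8_hits = heavy | no_benzene
```

With this toy data the combined query keeps only compound 3, while the union keeps compounds 1, 2, and 3, so the union is always at least as large as the combined query.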

  26. Exercise 9 Search for compounds possessing more than 4 microspecies at pH = 4.0…

  27. Exercise 9 … Export your hit list.

  28. Exercise 10 Import the file ISICCRsm.RDF into your project…

  29. Exercise 10 … Create a Browser for this database

  30. Exercise 11 Search for reactions with an imidazole ring in their reactants, then in their products.

  31. Exercise 12 Add to your Schema a new data tree and structure entity named AlkanBoilingPoint…

  32. Exercise 12 … and add a floating point value field named BoilingPoint.

  33. Exercise 13 Add to the AlkanBoilingPoint entity the following data.

  34. Exercise 14 Add to the AlkanBoilingPoint entity a new date field named Date and fill it.

  35. Exercise 15 Add to the AlkanBoilingPoint entity a calculated LogP value using a Chemical Terms field.

  36. Summary • Create a project and schema • Import data • Search by substructure, superstructure, similarity, and exact match • Search by keyword • Combine queries and result lists • Export query results • Create a new database

  37. Conclusion • InstantJChem is a chemoinformatics layer on top of a standard DBMS. • It provides many more chemoinformatics services (database overlap, QSPR modeling, plots, enumeration, scripting). (Diagram labels: DBMS, InstantJChem.)
