This presentation explores the transformation of data sharing practices in crystallography, emphasizing the importance of data description standards and quality control in publications. It highlights various curated databases, such as the Cambridge Structural Database and Protein Data Bank, alongside the increasing prevalence of data publication at the source. The role of the Crystallographic Information Framework and the significance of data validation tools are discussed to enhance transparency and accessibility of data for researchers. Insights into IUCr journals' publishing standards and community-driven dictionary initiatives are also shared.
Changing methods of data sharing in crystallography
Professor John R Helliwell, The University of Manchester (john.helliwell@manchester.ac.uk)
Editor-in-Chief, Acta Crystallographica, and Chair of the IUCr Journals Commission 1996-2005; IUCr Delegate to ICSTI 2005-
Imperial College, June 28th, 2006
Content of presentation • Data description standards • Quality control in publication • Responsibility for quality control • Data quality standards • Data publication at source
Crystal structures ‘published’
• Curated databases
  • Cambridge Structural Database (small organic/metal-organic structures): 335,280 entries, growing by about 29,000 per year
  • Protein Data Bank (biological macromolecules): 34,506 entries, growing by about 5,500 per year
  • Inorganic Crystal Structure Database (82,676 entries), CrystMet (99,893 entries), Powder Diffraction File (240,050 entries)
• IUCr journals
  • Acta Crystallographica Sections C, E (small-molecule, inorganic): 2357 articles/year
  • Acta Crystallographica Sections D, F (biological macromolecules): ~120+ structural articles/year
Standard description of data
• Crystallographic Information Framework
  • International Tables for Crystallography (2005). Vol. G, Definition and exchange of crystallographic data, edited by S. R. Hall & B. McMahon, 1st ed. Berlin: Springer.
• CIF file structure
  • Hall, S. R., Allen, F. H. & Brown, I. D. (1991). The Crystallographic Information File (CIF): a new standard archive file for crystallography. Acta Cryst. A47, 655-685.
• Dictionary definition language
  • Hall, S. R. & Cook, A. P. F. (1995). STAR dictionary definition language: initial specification. J. Chem. Inf. Comput. Sci. 35, 819-825.
• Data dictionaries
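As a rough illustration of the CIF file structure, the sketch below reads the simplest case only: a `data_` block header followed by one data name and one value per line. The function name `parse_simple_cif` and the sample values are hypothetical, and the sketch deliberately ignores `loop_` constructs, multi-line text fields and the full quoting rules, so it is not a substitute for a complete CIF parser.

```python
def parse_simple_cif(text):
    """Collect single-line '_name value' pairs for each data block.

    Simplifying assumptions: no loop_ tables, no semicolon-delimited
    text fields, and quotes are only stripped from the ends of a value.
    """
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("data_"):
            # A data block header opens a new set of items.
            current = blocks.setdefault(line[5:], {})
        elif line.startswith("_") and current is not None:
            parts = line.split(None, 1)
            if len(parts) == 2:
                name, value = parts
                current[name] = value.strip().strip("'\"")
    return blocks


# Hypothetical sample; the numbers are made up for illustration.
SAMPLE = """\
data_example
_cell_length_a             10.5
_refine_ls_R_Fsqd_factor   0.045
_chemical_name_common      'test compound'
"""

parsed = parse_simple_cif(SAMPLE)
print(parsed["example"]["_refine_ls_R_Fsqd_factor"])
```

Because every data name is defined in a dictionary, a reader of such a file can look up exactly what each value means.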
Data dictionary definition: _refine_ls_R_Fsqd_factor
Name: '_refine_ls_R_Fsqd_factor'
Definition: Residual factor R(Fsqd), calculated on the squared amplitudes of the observed and calculated structure factors, for significantly intense reflections (satisfying _reflns_threshold_expression) and included in the refinement. The reflections also satisfy the resolution limits established by _refine_ls_d_res_high and _refine_ls_d_res_low.

           sum | F(obs)^2^ - F(calc)^2^ |
R(Fsqd) = -------------------------------
                  sum F(obs)^2^

F(obs)^2^ = squares of the observed structure-factor amplitudes, F(calc)^2^ = squares of the calculated structure-factor amplitudes, and the sum is taken over the specified reflections.
The permitted range is 0.0 to infinity.
Type: numb
Category: refine
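The residual defined in this dictionary entry can be computed directly from paired lists of squared amplitudes. The sketch below is a minimal illustration: the function name `r_fsqd` and the example numbers are hypothetical, and it assumes the reflections passed in have already been filtered by the threshold and resolution limits mentioned in the definition.

```python
def r_fsqd(f_obs_sq, f_calc_sq):
    """Return sum|F(obs)^2 - F(calc)^2| / sum F(obs)^2.

    Both arguments are sequences of squared structure-factor amplitudes,
    one entry per reflection, in the same order.
    """
    if len(f_obs_sq) != len(f_calc_sq):
        raise ValueError("observed and calculated lists must have equal length")
    denominator = sum(f_obs_sq)
    if denominator == 0:
        raise ValueError("sum of observed squared amplitudes is zero")
    numerator = sum(abs(obs - calc) for obs, calc in zip(f_obs_sq, f_calc_sq))
    return numerator / denominator


# Made-up values for three reflections: (10 + 5 + 0) / 175
example_r = r_fsqd([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])
print(round(example_r, 4))
```

A value of 0.0 means perfect agreement between observed and calculated data; there is no upper bound, which matches the permitted range given in the definition.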
Quality control at source checkCIF: http://checkcif.iucr.org • Free public service • Sponsored by publishers and databases • Over 340 separate tests Described at http://journals.iucr.org/services/cif/datavalidation.html
Data publication increasingly ‘at source’ • Small-molecule crystallography is often ‘high throughput’, so only a subset of results (an estimated 5 to 10%) gets into the literature • Local and national laboratory ‘data repositories’ are on the rise • Examples: eBank (Southampton, UK + 5 other sites); Reciprocal Net (Indiana, USA + 18 other sites)
eBank • ePrints repository • OAI-PMH • Standard metadata • All data • Links to publication • Rights • Quality
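To make the OAI-PMH item concrete, the sketch below builds a harvesting request of the kind a repository exposing OAI-PMH would accept. The `verb`, `metadataPrefix` and `from` parameters come from the OAI-PMH protocol itself; the base URL and the function name `list_records_url` are placeholders chosen for illustration and are not eBank's actual endpoint. No network request is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the repository's OAI-PMH endpoint (hypothetical here);
    from_date optionally restricts harvesting to records changed on or
    after that date (YYYY-MM-DD).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, for illustration only.
url = list_records_url("https://repository.example.org/oai", from_date="2006-01-01")
print(url)
```

Standard metadata served this way lets aggregators discover datasets and link them back to the corresponding publications.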
Online Dictionary Project • Uses a wiki approach (à la Wikipedia) to arrive at community-agreed dictionary terms • Pilot stage started September 2005 • Led by Emeritus Professor Andre Authier, Chair of the IUCr Nomenclature Commission
Summary • Quality of scientific argument depends on • Quality of data • Critical appraisal • Accessibility of relevant data • Precision of definitions • Rigorous analysis • IUCr publications strive to provide the highest quality in all these areas so as to inform the editorial process, including peer review
Acknowledgements • Peter Strickland, Managing Editor at IUCr, Chester. • Brian McMahon, R&D Technical Development Officer at IUCr, Chester.