Progress report on Crank: Experimental phasing • Biophysical Structural Chemistry • Leiden University, The Netherlands
Crank developments available in CCP4 6.1 • “Greatly enhanced” – better tested • Underlying programs haven’t changed (much), but Crank has been almost completely rewritten since the version in 6.0.2 • Better ccp4i interface • Support for more programs (PIRATE, BUCCANEER, RESOLVE, COOT) • Faster substructure detection • Using BP3 to (quickly) check trials and looking at deviations between different CRUNCH2 trials significantly decreases the time required for successful substructure detection.
Preliminary substructure detection results from JCSG test cases • 144 mostly MAD Se-Met data sets • Defaults only: the only user input was the number of Se-Met per monomer (the number of monomers was guessed), along with the mtz files, f’ and f”. • Some data sets had f” < 1 (solved by MR) • Some data sets had X-PLOR files incorrectly labelled as mtz. • DISCLAIMER: first log files produced and analyzed yesterday after dinner (until 4 a.m.).
AFRO/CRUNCH2 vs SHELXC/D (both run in CRANK) • Of the 79 jobs in common, CRUNCH2 was faster in 20 jobs, while SHELXD was faster in 59.
Comparison not fair • The same BP3-based algorithm for identifying a solution could also be used with SHELXD • SHELXD uses much better FA values (i.e. it uses the MAD data – at the moment, AFRO just uses ΔF from the data set with the greatest anomalous signal).
Improving FA values • An early step in solving a structure by SAD/MAD or SIRAS is to determine FA values. • FA is the structure factor amplitude of the substructure, which is the input to direct methods and/or Patterson programs (i.e. SHELXD or CRUNCH2)
Current FA estimation • FA is currently estimated by | |F+| - |F-| | for SAD data in most programs. • Direct methods programs are very sensitive to FA values. • Improving the estimates can improve the hit rates of direct methods and solve substructures that could not previously be solved.
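The ΔF estimate on this slide can be sketched in a few lines. This is only an illustration: the function name, the sample amplitudes, and the sigma propagation (which assumes uncorrelated errors on F+ and F-) are assumptions, not code from any of the programs named here.

```python
import math

def delta_f_estimate(f_plus, f_minus, sig_plus=0.0, sig_minus=0.0):
    """Return (| |F+| - |F-| |, sigma) for one Bijvoet pair.

    The sigma assumes the two measurements are uncorrelated, which is a
    simplification made for this sketch only.
    """
    fa = abs(abs(f_plus) - abs(f_minus))
    sigma = math.hypot(sig_plus, sig_minus)
    return fa, sigma

# Hypothetical Bijvoet pairs: (|F+|, sigma(F+), |F-|, sigma(F-))
pairs = [
    (120.0, 3.0, 112.5, 4.0),
    (87.0, 2.0, 91.0, 2.0),
    (45.5, 1.5, 45.5, 1.5),
]

estimates = [delta_f_estimate(fp, fm, sp, sm) for fp, sp, fm, sm in pairs]
for fa, sigma in estimates:
    print(f"FA estimate = {fa:.2f}, sigma = {sigma:.2f}")
```

Note that the estimate is zero whenever |F+| equals |F-|, regardless of how large the substructure contribution really is, which is one reason a better estimate can help.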
Multivariate SAD equation • $E(|F_A|, |F^+|, |F^-|) = \int |F_A| \, P(|F_A|, \alpha_A, |F^+|, \alpha^+, |F^-|, \alpha^-) \, d|F_A| \, d\alpha_A \, d\alpha^+ \, d\alpha^-$ • Giacovazzo previously proposed multivariate FA estimation, with an implementation assuming the Bijvoet phases are equal. • An equation can be obtained without the equal-phase assumption that requires only one numerical integration. • This equation has been implemented; it reduces to Giacovazzo’s equation if the Bijvoet phases are equal.
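To illustrate what "an expected value obtained by one numerical integration" looks like in practice, here is a one-dimensional toy. It is not the multivariate SAD equation: as a stand-in density it uses the acentric Wilson distribution for a normalized amplitude, and the function names, the integration limit and the step count are all assumptions chosen for the sketch.

```python
import math

def expected_value(density, upper, steps=20000):
    """Trapezoid-rule estimate of E[x] = int x p(x) dx / int p(x) dx on [0, upper]."""
    h = upper / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        p = density(x)
        numerator += weight * x * p
        denominator += weight * p
    # The step size h cancels in the ratio, so it is not applied to either sum.
    return numerator / denominator

def wilson_acentric(e):
    """Acentric Wilson density for a normalized amplitude |E|: 2|E| exp(-|E|^2)."""
    return 2.0 * e * math.exp(-e * e)

mean_e = expected_value(wilson_acentric, upper=8.0)
print(f"numerical <|E|> = {mean_e:.6f}")
print(f"analytical sqrt(pi)/2 = {math.sqrt(math.pi) / 2:.6f}")
```

Dividing by the integral of the density keeps the estimate correct even if the density is only known up to a constant factor, which is often the case for conditional distributions.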
Covariance matrix properties • The covariance matrix considers experimental sigmas and correlations between F+, F- and FA. • Problem: The covariance matrix also depends on the (overall) substructure occupancy and B-factor. • Solution: Obtain a multivariate likelihood estimate for the unknown parameters.
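Why the overall occupancy and B-factor matter can be seen from a simplified model of the expected substructure intensity. The sketch below assumes a single atom type with a resolution-independent scattering contribution and uses the standard Debye-Waller factor exp(-B / (4 d^2)); the function names and numbers are hypothetical, and this is not the covariance matrix used in the actual implementation.

```python
import math

def debye_waller(b_factor, d_spacing):
    """Amplitude attenuation exp(-B sin^2(theta) / lambda^2) = exp(-B / (4 d^2))."""
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))

def expected_substructure_intensity(n_atoms, occupancy, f_scatter, b_factor, d_spacing):
    """Expected |FA|^2 for n_atoms identical, randomly placed atoms (acentric case)."""
    amplitude = occupancy * f_scatter * debye_waller(b_factor, d_spacing)
    return n_atoms * amplitude ** 2

# Hypothetical substructure: 4 sites, scattering contribution of 3 electrons each.
for occ, b in [(1.0, 0.0), (0.5, 0.0), (1.0, 20.0)]:
    value = expected_substructure_intensity(4, occ, 3.0, b, d_spacing=2.0)
    print(f"occupancy={occ}, B={b}: expected |FA|^2 = {value:.2f}")
```

Halving the occupancy quarters the expected intensity, and a larger B-factor suppresses it more strongly at high resolution, so both parameters directly scale the variance terms that involve FA.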
Refining overall substructure parameters • Initial guess of number of substructure atoms per monomer obtained from user. • Initial guess of B-factor obtained from likelihood estimate of overall B-factor of data set. • Result: Refinement is stable and maximizes correlation with calculated final E’s. • Another possible application: Use refined overall occupancy and B-factor for anomalous signal estimation.
More robustness in difficult cases with CRUNCH2 • Using default parameters (resolution cutoff of 0.5 Å from the high resolution limit). • * Can be solved with ΔE by using data to 1.5 Å
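The default resolution cutoff mentioned on this slide can be pictured as a simple filter. The sketch assumes the 0.5 is an offset in Å added to the high resolution limit of the data; the reflection list, its layout and the function name are hypothetical and do not reflect the file handling in CRANK.

```python
def apply_resolution_cutoff(reflections, offset=0.5):
    """Keep reflections with d-spacing >= (high resolution limit + offset).

    reflections: list of (hkl, d_spacing_in_angstrom) tuples.
    Returns (kept_reflections, cutoff).
    """
    d_min = min(d for _, d in reflections)  # high resolution limit of the data
    cutoff = d_min + offset
    kept = [(hkl, d) for hkl, d in reflections if d >= cutoff]
    return kept, cutoff

# Hypothetical reflections with their d-spacings in Angstrom.
data = [
    ((1, 0, 0), 10.0),
    ((2, 1, 0), 3.2),
    ((5, 3, 1), 2.0),
    ((7, 4, 2), 1.6),
]

kept, cutoff = apply_resolution_cutoff(data)
print(f"cutoff = {cutoff:.1f} A, kept {len(kept)} of {len(data)} reflections")
```

Passing a smaller `offset` keeps more high resolution data, at the cost of including weaker, noisier anomalous differences.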