Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc.

Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc. Protein Structure Determination Lecture 9

Selenomethionine MAD Selenomethionine is the amino acid methionine with the Sulfur replaced by a Selenium. Selenium is a heavy atom that also has the propery of "anomalous scatter" at some wavelengths, and not at others. Proteins grown in sMet will incorporate teh Se atoms. These crystals can be solved by Multiwavelength Anomalous Dispersion (MAD). 3 data sets are collected on a single crystal (sometimes at the same time!) F1PH, F2+PH , F2+PH

bound electrons free electrons Heavy atom Anomalous dispersion Inner electrons scatter with a phase shift relative to the phase of the free electrons. An anomalous scatter at the Origin scatters with phase . ∆Fi phase shift of anomalous scatter imaginary part of anomalous scatter  ∆Fr real part of anomalous scatter

Reminder: Friedel's Law F(h k l) F(-h -k -l) F+(h k l) Friedel mates have same amplitude, opposite sign.   F-(h k l)

Anomalous dispersion violates Friedel's Law F(h k l) F(-h -k -l) ∆F+(h k l)  + and - contributions for just a heavy atom with anomalous scattering.  ∆F-(h k l)

MAD experiment Protein is grown in sMet (or another anomalous scatterer ispresent, such as Iron). Three data seta are collected on 1 crystal at multiple (usually 2) wavelengths. 1 is a wavelength where Se does not absorb, normal diffraction. 2 is a wavelength where Se does absorb, anomalous ∆F is added. F-1(h k l) Friedel mates (same amplitude, merged) collectednon-anomalous wavelength. F+1(h k l) F-2(h k l) Friedel mates (different, not merged) collectedanomalous dispersion wavelength. F+2(h k l)

Friedel mates with anomaloushave different amplitudes F-2(h k l) F-1(h k l) F+2(h k l) F+1(h k l)

Friedel mates with anomaloushave different amplitudes F+2(h k l) F-2(h k l)* F1(h k l)

Harker diagram with anomalous F1(h k l) F-2(h k l)*

Harker diagram with anomalous F+2(h k l)

Harker diagram with anomalous F1(h k l) F+2(h k l) F-2(h k l)*

Phase probability distribution Radii are Fp and k*Fph Width are p and k*ph The red area are the places in Argand space where both FP and FPH-FH can be

Most probable versus best phase The most probable phase is not necessarily the “best” for computing the first e-density map. weighted average, best phase Shaded regions are possible Fp and Fph solutions.

Figure of merit Figure of merit “m” is a measure of how good the phases are. C is the “center of mass” of a ring of phase probabilities (probability is the “mass”). Assume the radius of the ring is 1. If the probabilities are sharply distributed, m≈1. If they are distributed widely, m is smaller. Fbest(hkl) = F(hkl)*m*e-iabest

Centric reflections • If the crystal has centrosymmetric symmetry, all reflections are centric, maving phase = 0° or 180° • If the crystal has 2-fold, 4-fold or 6-fold rotational symmetry, then the reflections in the 0-plane are centric. (Because the projection of the density is centrosymmetric) • For centric reflections: • |Fph| = |Fp| ± |Fh| This means the amplitude |Fh| is exact* for centrosymmetric reflections. *assuming perfect scaling.

F's that are Syms AND Friedel mates: centric reflections Example: F(0 k l) and F(0 -k -l) if a is a 2-fold. R R Draw any set of Bragg planes parallel to the 2-fold. Project the density onto a line. Notice: The projected density is centrosymmetric. Therefore, phase can only be 0 or 180°.

Heavy atom phasing methods SIR = single isomorphous replacement, without anomalous. Fourier transform uses: Figure-of-merit weighted amplitudes, alpha-best phases and centric reflections. MIR = multiple isomorphous replacement, without anomalous. Same Fourier terms, but Figure-of-merit is generally better than for SIR.

Heavy atom phasing methods SAD = single wavelength anomalous dispersion. Phases from F+ and F- from one crystal. Fourier transform uses: Figure-of-merit weighted amplitudes, alpha-best phases and centric reflections. MAD = multi-wavelength anomalous dispersion. Phases from three datasets from one crystal at 2 wavelengths (or more). F+, F- at anomalous wavelength, and F at non-anomalous wavelength.

In class exercise: phase error FP=5.00 s=0.5 FPH1=5.50 s=0.8 FH1=2.23 aH1=-63.4° FPH2=4.50 s=0.9 FH2=0.50 aH2=-164° (1) Draw three circles separated by vectors FH1 and FH2. (2) Draw circular “error bars” of width 2s. (3) Draw circle plot of Fp phase probabilities. (4) Estimate the centroid c of probabilit. (5) What is the Figure of Merit, m?

Is the initial map good enough? (1) The map is calculated using abest. (2) The map is contoured and displayed using {InsightII, MIDAS, XtalView, FRODO, O, ...} (3) A “trace” is attempted.

Model building e- density cages (1 s contours) displayed using InsightII

Information used to build the first model: Sequence and Stereochemistry ...plus assorted disulfide and ligand information. Models are built initially by identifying characteristic sidechains (by their shape) then tracing forward and backward along the backbone density until all amino acids are in place. Alpha-carbons can be placed by hand, and numbered, then an automated program will add the other atoms (MaxSprout).

Class exercise: Tracing an electron density map sequence: AGDLLEHEIFGMPPAGGA Can you locate the density above in the sequence?

In a web browser window on Linux goto: www.rcsb.orgSearch for: 1dfnClick on Save as: 1dfn.pdb DeepView In a web browser on Linux goto: eds.bmc.uu.se/eds/Enter PDB code: 1dfnDownload, MapsFormat-->CCP4, Type-->2mFo-DFc, Generate Map, Download linked file.In Linux terminal window: gunzip 1dfn.ccp4.gz In a Linux terminal window type (make an alias!):/afs/rpi.edu/dept/bio/pub/SPDBV/bin/spdbv.LinuxOpen PDB file: 1dfn.pdbOpen electron density map (CCP4): 1dfn.ccp4Rotate using mouse.

R-factor: How good is the model? Calculate Fcalc’s based on the model. Compute R-factor Depending on the space group, an R-factor of ~55% would be attainable by scaled random data. The R-factor must be < ~50%. Note: It is possible to get a high R-factor for a correct model. What kind of mistake would do this?

What can you do if the phases are not good enough? 1. Collect more heavy atom derivative data 2. Try density modification techniques. initial phases Density modification : Fo’s and (new) phases Fc’s and new phases Map Modified map

Density modification techniques Solvent Flattening: Make the water part of the map flat. (1) Draw envelope around protein part (2) Set solvent r to <r> and back transform.

Solvent flattening Requires that the protein part can be distinguished from the solvent part. BC Wang’s method: Smooth the map using a 10Å Guassian. Then take the top X% of the map, where X is calculated from the crystal density.

Skeletonization (1) Calculate map. (2) Skeletonize it (draw ridge lines) (3) Prune skeleton so that it is “protein-like” (4) Back transform the skeleton to get new phases. Protein-like means: (a) no cycles, (b) no islands

R R R R Non-crystallographic symmetry If there are two molecules in the ASU, there is a matrix and vector that rotate one to the other: Mr1 + v = r2 (1) Using Patterson Correlation Function, find M and v. (2) Calculate initial map. (3) Set r(r1) and r(r2) to (r(r1) + r(r2) )/2 (4) Back transform to get new phases.

What does a good map look like? plexiglass stack brass parts model Before computers, maps were contoured on stacked pieces of plexiglass. A “Richards box” was used to build the model. half-silvered mirror

Low-resolution At 4-6Å resolution, alpha helices look like sausages.

Medium resolution ~3Å data is good enough to se the backbone with space inbetween.

The program BONES traces the density automatically, if the phases are good.

BONES models need to be manually connected and sidechains attached. MaxSPROUT converts a fully connected trace to an all-atom model.

Errors in the phases make some connections ambiguous.

Contouring at two density cutoffs sometimes helps

Holes in rings are a good thing Seeing a hole in a tyrosine or phenylalanine ring is universally accepted as proof of good phases. You need at least 2Å data.

Can you see in stereo? Try this at home. In 3D, the density is much easier to trace.

New rendering programs “CONSCRIPT: A program for generating electron density isosurfaces for presentation in protein crystallography.” M. C. Lawrence, P. D. Bourke

Great map: holes in rings

Superior map: Atomicity Rarely is the data this good. 2 holes in Trp. All atoms separated.

Only small molecule structures are this good Atoms are separated down to several contours. Proteins are never this well-ordered. But this is what the density really looks like.

Refinement • The gradient* of the R-factor with respect to each atomic position may be calculated. • Each atom is moved down-hill along the gradient. • “Restraints” may be imposed. *

What is a restraint? A restraint is a function of the coordinates that is lowest when the coordinates are “ideal”, and which increases as the coordinates become less ideal.. Stereochemical restraints also... planar groups B’s bond lengths bond angles torsion angles

Calculated phases, observed amplitudes = hybrid F's • Fc’sare calculated from the atomic coordinates • A new electron density map calculated from the Fc's would only reproduce the model. (of course!) • Instead we use the observed amplitudes |Fo|, and the model phases, ac. Hybrid back transform: Hybrid maps show places where the current model is wrong and needs to be changed.

Difference map: Fo-Fc amplitudes The Fo “native” map r(Fo) differs from the Fc map r(Fc) in places where the model is wrong. So we take the difference. In the difference map: Missing atoms? r(Fo-Fc) > 0.0 Wrongly placed atoms? r(Fo-Fc) < 0.0 Correctly modeled atoms? r(Fo-Fc) = 0.0 Q: Subtracting densities (real space) is the same as subtracting amplitudes (reciprocal space) and transforming. T or F?

2Fo-Fc amplitudes The Fo map plus the difference map is Fowhere the differences are zero (the atoms are correct) Less than Fowhere the model has wrong atoms. Greater than Fowhere the model is missing atoms. Fo + (Fo-Fc) = 2Fo-Fc

The “free R-factor”: cross-validation The free R-factor is the test set residual, calculated the same as the R-factor, but on the “test set”. Free R-factor asks “how well does your model predict the data it hasn’t seen?” Note: the only difference is which hkl are used to calculate.

Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc.

Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc.

Presentation Transcript

Molecular Replacement

Anomalous

anomalous

Etc. etc. etc.

Multiwavelength Sky

ANOMALOUS

MOLECULAR ENDOCRINOLOGY AND IMMUNOLOGY ACTH, MSHs etc

Molecular Replacement: “Practical” Application

Population Ecology Density and Dispersion Dispersion Spatial distribution of organisms

Improving molecular replacement with morphing

Molecular Replacement

A stochastic approach to Molecular Replacement.

Dispersion:

Body Modification, tattooes, piercing, scarification etc

anomalous

Scaling, Phasing, Anomalous, Density modification, Model building, and Refinement

Automated Molecular Replacement

Density and Dispersion Information

Dispersion

Auto molecular replacement