Pseudocontact shift analysis

From Spinach Documentation Wiki
Revision as of 04:06, 6 January 2017 by Suturina (talk | contribs) (Distributed paramagnetic centre model)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

This is a tutorial page that goes through the different types of pseudocontact shift analysis that are available in Spinach. It also summarises basic facts from the theory of pseudocontact shift. Please refer to our recent publications on the subject for the technical details of the three-dimensional PCS analysis process, and to standard reviews (from Florence and Otting groups) for the point PCS model description.

Introduction

The presence of rapidly relaxing unpaired electrons generates extra nuclear chemical shifts (called paramagnetic shifts) because the unpaired electrons respond nearly instantaneoulsy to the changes in the applied magnetic field and therefore provide additional nuclear shielding that is much stronger then what would be available from a closed-shell electronic structure. Paramagnetic chemical shift has two contributions:

  1. Contact shift, which arises from the isotropic part of the hyperfine coupling between the electrons and the nuclei located in their close vicinity (typically less than 5 Angstrom).
  2. Pseudocontact shift, which is generated by the long-range (up to 50 Angstroms) electron-nuclear dipolar interaction.

Both contributions may be obtained from classical physics expressions if the electron-nuclear hyperfine coupling tensor and the electron magnetic susceptibiity tensors are known. The contact shift depends on the isotropic part of the magnetic susceptibility tensor, and the pseudocontact shift on the anisotropic part.

Point dipole approximation

This is historically the first and the simplest approach. It is based on the point dipole approximation for the electron. The expression for the pseudocontact shift for a nucleus at position [math]\mathbf{r}[/math] relative to the electron with a magnetic susceptibility tensor [math]\mathbf{\chi }[/math] is:


[math]\sigma =\frac{1}{4\pi {{r}^{3}}}\text{Tr}\left[ \left( \frac{\mathbf{r}\otimes {{\mathbf{r}}^{\text{T}}}}{{{r}^{2}}}-\frac{1}{3} \right)\cdot \mathbf{\chi } \right][/math]


This approach is valid for the distances at which the paramagnetic centre may be viewed as a point - typically at least 10 Angstroms away. The PCS of any nuclei located at closer distances would not fit this model. The Spinach function that calculates pseudocontact shifts from user-specified coordinates and electron susceptibility tensors is ppcs.m. The function that extracts the paramagnetic centre position and the susceptibility tensor from user-specified nuclear coordinates and pseudocontact shifts using the point electron model is ippcs.m.

Note that the magnetic susceptibility tensor may be estimated from the g-tensor for S=1/2 using the following expression:


[math]{\chi _i} = {\mu _0}\mu _B^2\frac{{g_i^2}}{{4{k_B}T}} \approx 1.9571\frac{{g_i^2}}{{T[K]}}[{{Å}^3}][/math]


The function that performs this estimation for an arbitary g-tensor is g2chi.m. The resulting estimate is only valid at high temperature.

Example 1a. PCS calculation using the point dipole approximation
Alt text
Figure 1. Structure of metal porphyrin. Colour code: metal - orange, nitrogen - blue, carbon - grey, hydrogen - light grey

In this example we compute the PCS on the protons of a Co(II)/Cu(II) porphyrin and see how magnetic anisotropy affects PCS. The following steps should be taken:

  • Specify the coordinates of all protons:
    % Porphyrin ring proton coordinates
    nxyz=[ 4.551635888      2.658552774      0.000000000
           2.658552774      4.551635889      0.000000000
          -2.658552774      4.551635888      0.000000000
          -4.551635889      2.658552774      0.000000000
          -4.551635888     -2.658552774      0.000000000
          -2.658552774     -4.551635889      0.000000000
           2.658552774     -4.551635888      0.000000000
           4.551635889     -2.658552774      0.000000000
           4.533874147      0.000000000      0.000000000
           0.000000000     -4.533874147      0.000000000
          -4.533874147      0.000000000      0.000000000
           0.000000000      4.533874147      0.000000000];
  • Specify (or compute from the g-tensor) the susceptibility tensor (cubic Angstrom) in the same coordinate system as the one used for the proton coordinates above:
    % Co(II) g-tensor
    g_co=diag([3.0 3.0 2.0]);

    % Cu(II) g-tensor
    g_cu=diag([2.0 2.0 2.2]);

    % Curie susceptibility tensors
    chi_co=g2chi(g_co,298);
    chi_cu=g2chi(g_cu,298);
  • Specify the metal position
    % Metal position 
    mxyz=[0 0 0];
    % PCS calculation
    point_pcs_co=ppcs(nxyz,mxyz,chi_co);
    point_pcs_cu=ppcs(nxyz,mxyz,chi_cu);
  • Print the output to the console
    % Output
    disp('Pseudocontact shifts [Co, Cu], ppm');
    disp([point_pcs_co point_pcs_cu]);

In the output you see that there are two groups of protons: four with larger absolute shift and eight with smaller shift. These correspond to two symmetry unique protons in the D4h symmetric molecule:

    PCS [Co, Cu], ppm
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    5.9471   -0.9991
    9.3458   -1.5701
    9.3458   -1.5701
    9.3458   -1.5701
    9.3458   -1.5701
Example 1b. PCS fitting using the point dipole approximation
Alt text
Figure 2. Comparison of the point dipole model PCS with the experiment for Tm(III) substituted calbindin D9K.

In this example we will process the experimental PCS data from [Balayssac, S.; Jiménez, B.; Piccioli, M. Journal of Biomolecular NMR 2006, 34, 63] for calbindin D9K, in which one of the two calcium ions has been replaced by a thulium ion. The corresponding example file is located in nmr_paramag/calbindin directory of the example set supplied with Spinach.

The calculation proceeds in the following stages:

  • Import the data. It may be pyped in explicitly, as above. In this case, atomic coordinates and pseudocontact shifts are supplied in a Matlab file:
    % Load experimental data
    load('tm_1igv_pcs.mat');
  • Call the point model fit function ippcs.m with the initial guess of the thulium metal position set to [-5 5 -15]:
    % Solve the inverse problem
    [mxyz,chi,pred_pcs]=ippcs([x y z],[-5 5 -15],expt_pcs);

The other variables are contained in the file loaded in the previous line. In practice, several attempts with different paramagnetic centre location guesses may be necessary.

  • Plot the results
    % Plot experimental vs predicted PCS
    figure(); plot(expt_pcs,pred_pcs,'bo'); hold on; grid on;
    plot([min(expt_pcs) max(expt_pcs)],[min(expt_pcs) max(expt_pcs)],'r-');
    xlabel('Experimental PCS, ppm'); ylabel('Predicted PCS, ppm');
  • Print the position of Tm and the traceless part of the magnetic susceptibility tensor in molecular frame
    % Report the parameters
    disp('Susceptibility tensor:'); disp(chi);
    disp('Point electron location:'); disp(mxyz);

The output appears below:

    Susceptibility tensor:
        0.0585   -0.0878    0.0729
       -0.0878   -0.0773    0.0341
        0.0729    0.0341    0.0189
    
    Point electron location:
        3.6610   17.7074   15.4140

It is clear from the figure that the point dipole model does fit the data, but does not fit it particularly well.

Magnetic multipole approximation

The next level of theory that improves on the point dipole approximation is based on the spherical harmonic expansion of the electron probability density:


[math]\rho \left( \mathbf{r} \right)=\sum\limits_{l,m}{\alpha _{l}^{m}\left( r \right)Y_{l}^{m}\left( \theta ,\varphi \right)}[/math]


It works for nuclei positioned outside the bounding sphere of the electron probability density. PCS calculation is performed using lpcs.m function and the fitting is performed using ilpcs.m function. In both cases, the list of [math]l[/math] ranks to be included into the expansion must be supplied by the user.

Example 2a. Computing PCS using the multipole model

The spin density in the copper porphyrin from Example 1a is clearly not a point object; some density is located on the nitrogen atoms because the corresponding isotropic hyperfine couplings are clearly visible in ESR spectra. If the spin populations on all relevant atoms are known (for example, from electronic structure theory calculations, or as fitting parameters), the multipole moments of the overall distribution may be estimated using points2mult.m function. If the spin density is given on a 3D grid, the same function would calculate its multipole moments by numerical integration. A typical calculation would proceed in the following stages:

  • Specify nuclear coordinates and the corresponding Mulliken spin populations:
    % Nuclear coordinates
    xyz=[ 0.000000000000      0.000000000000      0.000000000000
         -1.452280156504      1.452280156496      0.000000000000
          1.452280156504     -1.452280156496      0.000000000000
         -1.452280156496     -1.452280156504      0.000000000000
          1.452280156496      1.452280156504      0.000000000000];

    % Multipole centre coordinates
    mxyz=[0 0 0];
 
    % Mulliken spin populations
    rho=[0.6 0.1 0.1 0.1 0.1]';
  • Compute multipole moments relative to the centre, in this case up to spherical harmonic rank 14
    % Compute multipole moments
    Ilm=points2mult(xyz,mxyz,rho,0:14,'points');
  • Get the susceptibility tensor from the g-tensor
    % Cu(II) g-tensor
    g_cu=diag([2.0 2.0 2.2]);

    % Curie susceptibility tensor
    chi_cu=g2chi(g_cu,298);
  • Compute PCS using the multipole moments
    % Get the pseudocontact shifts
    theor_pcs=lpcs(nxyz,mxyz,0:14,Ilm,chi_cu)

A visual inspection of the resulting PCS values shows that the effect of the spin delocalisation on the PCS is significant: the point dipole model predicts shifts at -1.00 and -1.57 ppm, and the multipole model at -1.21 and -1.81 ppm, respectively. This example is located in nmr_paramag directory of the example set supplied with Spinach.

Example 2b. Fitting PCS to the multipole model
Alt text
Figure 3. Comparison of the multipole model PCS with the experiment for Tm(III) substituted calbindin D9K.

Fitting experimental data to the multipole model can account for the possible mobility of the lanthanide centre; this is often the case in proteins. If we repeat Example 1b identically, except for the solver line:

    % Solve the inverse problem
    [mxyz,chi,~,pred_pcs]=ilpcs([x y z],expt_pcs,[0 1 2],[-5 5 -15]);

where now the fit is to the multipole model with up to the second spherical rank in the probability density distribution, we would see a much better fit - the residual drops by almost a factor of three. Specifying [math]l=\left[ \begin{matrix} 0 & 1 & 2 \end{matrix} \right][/math] is typically sufficient; [math]l=0[/math] makes the calculation equivalent to the point dipole approximation. Spherical ranks higher than 4 are not well constrained by fitting.

Using hyperfine tensors from electronic structure theory

Another way to account for the paramagnetic centre delocalisation is to use hyperfine tensors obtained from electronic structure theory calculations. Those are integrals over the unpaired electron probability density and therefore include all delocalisation effects. Only anisotropic part of the hyperfine tensor affects the PCS; for light nuclei it is just the dipolar hyperfine interaction, for heavy elements the orbital part also contributes.

Spinach has two electronic structure theory package parsers: gparse.m for outputs from Gaussian and oparse.m for outputs from ORCA. Both extract hyperfine tensors from the logs produced by the program. The function hfc2pcs.m then computes PCS by multiplying the traceless part of the hyperfine tensor by the traceless part of the susceptibility tensor and taking the trace of the result. To keep the traces (and therefore the contact part of the full paramagnetic shift), use hfc2pms.m function. The general equation is:


[math]\sigma _{\text{CS}}^{\left( k \right)}+\sigma _{\text{PCS}}^{\left( k \right)}=-\frac{1}{3}\text{Tr}\left[ \frac{{{\mathbf{A}}^{\left( k \right)}}\cdot \boldsymbol{\chi }}{{{\mu }_{0}}{{\gamma }_{\text{e}}}\gamma _{\text{n}}^{\left( k \right)}\hbar } \right][/math]


See http://dx.doi.org/10.1039/C4CP03106G for further information. Note that this method essentially requires the molecule to be rigid; realistically, it is only applicable to small molecules, for which it is strongly recommended.

Example 3a. Computing PCS using hyperfine tensors from DFT

You should refer to ORCA and Gaussian manuals to find out how to calculate hyperfine tensors. An example ORCA output file is located in the nmr_paramag folder of the example set supplied with Spinach. The calculation proceeds in the following stages:

  • ORCA log file is parsed by Spinach:
   % Parse the ORCA log
   props=oparse('cu_porph_hfc.out');
  • Relevant hyperfine tensors are extracted:
   % Extract hyperfine tensors
   hfcs=props.hfc.full.matrix(26:37);
  • PCS values are computed
   % Compute DFT PCS
   for n=1:numel(hfcs)
        hfc_pcs_cu(n)=hfc2pcs(gauss2mhz(hfcs{n}),chi_cu,'1H');
   end

The same susceptibility tensor as in Examples 1a and 2a is used here. The point model in Example 1a was predicting PCS at -1.00 and -1.57 ppm; the multipole model in Example 2a put the PCS at -1.21 and -1.81 ppm; DFT hyperfines give even larger PCS values: -1.53 and -1.90 ppm.

Distributed paramagnetic centre model

The most general approach that works for any paramagnetic centre probability distribution is based on the partial differential equation obeyed by the pseudocontact shift, described in http://dx.doi.org/10.1039/C4CP03106G and implemented in the functions kpcs.m for the direct problems, and ipcs.m for the inverse problems. This approach is fully numeric and operates with the probability density defined on three-dimensional grids.

Example 4a. Computing PCS using DFT spin density
Figure 4. DFT spin density of Cu(II) porphyrin.
Figure 5. PCS field of Cu(II) porphyrin.

You should refer to ORCA and Gaussian manuals to find out how to calculate spin density cubes. An example ORCA cube is located in the nmr_paramag folder of the example set supplied with Spinach. The calculation proceeds in the following stages (some of the parameters have already been computed in Examples 2a and 3a):

  • An ORCA spin density cube is parsed by Spinach and padded
   % Parse ORCA cube and pad the density with zeros to avoid PBC effects 
   pad_size=2; [density,ext]=ocparse('cu_porph_sd80.spindens.3d',pad_size);
  • PCS from distributed density is computed:
   % Solve Kuprov equation
   [pde_pcs_cu,pcs_cube]=kpcs(density,chi_cu,ext,nxyz,'fft');

The partial differential equation is solved using the Fourier transform, and the spin density cube therefore requires padding by zeros on all sides to avoid self-interaction artefacts from the periodic boundary conditions - hence the pad argument in ocparse.m function.

Comparison of the PCS from the DFT hyperfine and from the partial differential equation (PDE) with imported DFT spin density defined on 80x80x80 grid shows the good agreement (-1.53 -1.90 ppm from DFT and -1.43 -1.84 ppm from PDE). Using this method one can compute DFT spin density for a small subsystem (i.e. protein active site) and recompute PCS in full system (entire protein) using kpcs.m without a need to do DFT for the full system.

Example 4b. Reconstructing PCS tag distributions in proteins

When pseudocontact shifts and the corresponding nuclear coordinates are available, it is possible to approximately reconstruct the three-dimensional probability density of the paramagnetic centre that generated the pseudocontact shifts. The particulars are given in the forthcoming paper from the IK group; multiple examples are available in nmr_paramag/calbindin and nmr_paramag/carb_anh directories of the standard example set supplied with Spinach. The primary function involved is ipcs.m; this section contains some practical advice on using it. A typical reconstruction run would proceed in the following stages:

1. Compile the primary data. You will need the following: (a) Cartesian coordinates of the nuclei for which the PCS is measured, as Nx3 array in Matlab - call this array 'xyz'; (b) Cartesian coordinates of all atoms in the molecule, as Mx3 array in Matlab - call this array 'xyz_all'; (c) Pseudocontact shifts, in the same order as the coordinates in 'xyz' variable, as Nx1 vector. Include all pseudocontact shifts that you managed to measure, including those in the immediate vicinity of the paramagnetic centre.

2. Locate the paramagnetic centre. You likely have an approximate idea of where your paramagnetic centre is located. Use that location as the initial guess for the point dipole model fit, for example:

    [mxyz,chi,pred_pcs]=ippcs(xyz,[-27.0  13.0  18.0],expt_pcs);

Optionally, plot the theoretical against the experimental PCS, for example:

    figure(); plot(expt_pcs,pred_pcs,'bo'); hold on; grid on;
    plot([min(expt_pcs) max(expt_pcs)],[min(expt_pcs) max(expt_pcs)],'r-');
    xlabel('Experimental PCS, ppm'); ylabel('Predicted PCS, ppm');

If the fit is very poor, re-run from a different initial guess for the metal position. Even the best point model fit probably isn't particularly good, or you wouldn't be reading this, but it would give you a better idea of where your paramagnetic centre roughly is.

3. Run the multipolar fit. Use the paramagnetic centre position from the previous stage as the initial guess, for example:

    [mxyz,chi,~,pred_pcs]=ilpcs(xyz,expt_pcs,[0 1 2],[-27.0  13.0  18.0]);

Optionally, plot the theoretical against the experimental PCS again, for example:

    figure(); plot(expt_pcs,pred_pcs,'bo'); hold on; grid on;
    plot([min(expt_pcs) max(expt_pcs)],[min(expt_pcs) max(expt_pcs)],'r-');
    xlabel('Experimental PCS, ppm'); ylabel('Predicted PCS, ppm');

The fit would likely be much improved - Figure 6 below provides an illustration.

Figure 6. Comparison of the point model fit (left) with the multipolar model fit (right) for a mutant of human carbonic anhydrase.

The important parameters to keep at this point are the effective magnetic susceptibility tensor ('chi') and the magnetic multipole centre ('mxyz').

4. Run the draft three-dimensional reconstruction. The ipcs.m function performing three-dimensional reconstructions is very rich in options; it is recommended that you read its manual section carefully. The following options are recommended:

    % Set inverse problem parameters
    parameters.plot={'diagnostics','density','molecule','tightzoom','box'};
    parameters.box_cent=[### ### ###];
    parameters.box_size=[ 25.0 25.0 25.0];
    parameters.margins=50*ones(1,6);
    parameters.confine=[2.0 12.0];
    parameters.sharpen=0.0;
    parameters.xyz_all=xyz_all;
    parameters.expt_pcs=expt_pcs;
    parameters.xyz=xyz;
    parameters.chi=chi;
    parameters.gpu=0;
  • The first line tells Spinach to produce maximum visual output - see ipcs.m options for further details.
  • The second one specifies the coordinates of the centre of the source box in Angstrom - use the multipole coordinates from the previous stage here.
  • The third line specifies the size of the source box in Angstrom - set the three numbers to the suspected width of your distribution; 25 Angstrom is usually sufficient.
  • The fourth line deals with preventing the periodic boundary condition effects from appearing - it tells Spinach to run the solver with the box that has been padded with 50 Angstroms of zeros on each side; that is normally sufficient.
  • The fifth line tells Spinach to confine the solution space to include all points that are at least 2.0 Angstroms away from any atom in 'xyz_all' and at most 12.0 Angstrom from the nearest atom in 'xyz_all'. This corresponds to the set of physically realistic positions of a lanthanide atom in a spin label attached to the protein surface. If you are reconstructing the distribution of the lanthanide that is internal to the structure, you would need to reduce the first number.
  • The sixth line switches off source sharpening - a penalty functional that drives values below average to zero and values above average to infinity. Increase the coefficient if you have minor low-amplitude clouds in your solution, but remember that this algorithm is hard to justify on physical grounds.
  • Lines 7-10 simply provide the information we already have from the previous sections.
  • The last line may be used to enable GPU processing if you have an NVidia GPU. It assumes that you have no NVidia GPUs in the system. If you do, set the value to 1; the calculations would go faster on any serious gaming card and massively faster if you have a Tesla coprocessor installed.

Run a rough reconstruction with these options;

    [source_cube,ranges]=ipcs(parameters,64,0.125);

where 64 is the number of points in each direction of the grid and 0.125 is a fairly typical value of the Tikhonov regularisation parameter. Visually inspect the answer (see Figure 7 below for an example).

Figure 7. Centering the variational cube during the draft reconstruction process. Left: the density is smeared along one of the cube sides, the cube must be moved right. Right: correctly positioned variational cube.

If the red cloud is smudged against any side of the cube, move it in that direction. Increase and shrink the cube as necessary until the red cloud sits comfortably within it.

5. Run the L-curve analysis. Double the size of the cube on every direction and run the reconstruction process multiple times, recording the value of the least squares penalty and the Tikhonov regularisation penalty for a range of regularisation parameters. There are multiple examples in the standard example set, the following is typical:

    % Regularisation parameter array
    lam=10.^linspace(-2.0,2.0,30);

    % Result arrays
    err=zeros(30,1); reg=zeros(30,1);

    % Run a parallel loop
    parfor n=1:30 %#ok<*PFOUS,*PFBNS>
        [~,~,~,err(n),~,reg(n)]=ipcs(parameters,64,lam(n));
        reg(n)=reg(n)/lam(n);
    end

    % L-curve analysis
    figure(); S=lcurve(lam,err,reg,'log');
    disp(['Optimum smoothing parameter: ' num2str(S)]);

The optimum value of the regularisation parameter corresponds to the point of maximum curvature on the L-curve. A typical example appears in Figure 8 below.

Figure 8. A typical L-curve with the maximum curvature point indicated by a red circle.

6. Run the accurate reconstruction. Use the optimal value of the Tikhonov regularisation parameter to run the reconstruction on a finer grid. An iterative approach is suggested, for example:

    % Solve and refine the grid
    for n=[64 128 256]
        [source_cube,ranges]=ipcs(parameters,n,0.125);
        parameters.guess=source_cube;
    end

At the end of the calculation (very long if you don't have a Tesla card) you will obtain a high quality reconstruction.

7. Refine the effective susceptibility tensor. Optionally, Spinach can now use your probability density to update the susceptibility tensor that you have obtained from the multipolar fit. Run the following call:

    [chi,~]=chi_eff(source_cube,ranges,xyz,expt_pcs);

to obtain your new susceptibility tensor. If it is significantly different from the previous one, repeat Steps 6 and 7 until self-consistency is achieved.


Version 1.9, authors: Elizaveta Suturina, Ilya Kuprov