From Spinach Documentation Wiki
Revision as of 13:43, 20 July 2018 by Worswick (talk | contribs) (Created page)

Generates a library of distance distributions and corresponding simulated primary DEER traces using the parameters supplied, in a four-step process:

  1. A batch of simulated spin label distance distributions is generated, each built from a randomly selected number of skew normal distributions:
\[p(r)=\frac{2}{\sigma \sqrt{2 \pi}}e^{- \frac{(r-r_0)^2}{2 \sigma ^2}} \int_{- \infty}^{ \alpha \left (\frac{r-r_0}{\sigma} \right )} e^{- \frac{t^2}{2}}dt \]
  2. The DEER form factor d(t) is computed from the distributions using the kernel for DEER in the presence of exchange coupling (Equation 2 in our paper):
\[\gamma(r,t)=\sqrt{\frac{\pi}{6Dt}} \left [ \cos [(D+J)t] \, \mathrm{FrC} \left [ \sqrt{\frac{6Dt}{\pi}} \right ] + \sin [(D+J)t] \, \mathrm{FrS} \left [ \sqrt{\frac{6Dt}{\pi}} \right ] \right ] \]
  3. Simulated background b(t) and additive noise tracks n(t) are mixed with the DEER form factor:
\[s(t)=[1-\lambda+\lambda d (t)]b(t)+n(t) \]
    • the background signal is generated as a stretched exponential function,
\[b(t)=\exp \left [ -(kt)^{n/3} \right ] \]
    • the noise track is uncorrelated, representing the instrumental noise expected during the indirect acquisition in the DEER method.
  4. The distance distributions and DEER traces are then scaled to the neural network activation range:
    • distributions are uniformly scaled to 0.75,
    • DEER traces are scaled and shifted to make the first and last points equal to 1 and 0 respectively.
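
The four steps above can be sketched in a few lines of MATLAB. This is a minimal illustration under stated assumptions, not the code inside deer_lib_gen: the grids and all parameter values (r0, sigma, alpha, J, lambda, k, n, noise level) are arbitrary; a single peak is used; the dipolar constant of 2π × 52.04 MHz·nm³ assumes two electron spins with g ≈ 2.0023; the noise is taken to be Gaussian; and "scaled to 0.75" is read as a peak value of 0.75. The kernel is evaluated by numerical orientation averaging of cos[(D(1-3z²)+J)t] over z from 0 to 1, which is the integral that the Fresnel expression for γ(r,t) evaluates in closed form.

```matlab
% Illustrative grids; sizes and ranges are arbitrary choices
t=linspace(0,2e-6,256);        % time grid, seconds (row vector)
r=linspace(20,60,128);         % distance grid, Angstroms (row vector)

% Step 1: a single skew normal peak (assumed parameters)
r0=40; sigma=4; alpha=2;       % position, width, skewness
x=(r-r0)/sigma;
% integral of exp(-u^2/2) from -inf to alpha*x, in closed form
cdf_part=sqrt(pi/2)*(1+erf(alpha*x/sqrt(2)));
p=(2/(sigma*sqrt(2*pi)))*exp(-x.^2/2).*cdf_part;

% Step 2: form factor from the kernel with exchange coupling
J=0;                           % exchange coupling, rad/s (assumed)
D=2*pi*52.04e6./(r/10).^3;     % dipolar coupling, rad/s (r/10 is in nm)
z=linspace(0,1,1001)';         % cos(theta) grid for the orientation average
d=zeros(size(t));
for m=1:numel(r)
    % kernel gamma(r_m,t): average of cos[(D(1-3z^2)+J)t] over z
    gam=trapz(z,cos((D(m)*(1-3*z.^2)+J)*t),1);
    d=d+p(m)*gam;
end
d=d/sum(p);                    % normalise so that d(0)=1

% Step 3: background, noise, and mixing
lambda=0.4;                    % modulation depth (assumed)
k=2e5; n=3;                    % decay rate, 1/s, and dimensionality (assumed)
b=exp(-(k*t).^(n/3));          % stretched exponential background
noise=0.02*randn(size(t));     % uncorrelated Gaussian noise (assumed level)
s=(1-lambda+lambda*d).*b+noise;

% Step 4: scale to the network activation range
p_scaled=0.75*p/max(p);              % assumes "scaled to 0.75" means peak value 0.75
s_scaled=(s-s(end))/(s(1)-s(end));   % first point 1, last point 0
```

Calling plot(t,s_scaled) and plot(r,p_scaled) after running the sketch shows one trace/distribution pair of the kind stored in the library.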

For each pair in the training set, the variables controlling the shape of the distance distribution, the amount of exchange coupling to include, the form of the background contribution to the signal, and the level of noise are randomly selected from the ranges provided in the parameters structure.
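
As a rough illustration of drawing per-trace variables from ranges, the sketch below samples a few quantities for one trace. The structure name ranges, its field names, the numerical limits, and the use of a uniform distribution are all assumptions made for this example; the actual fields of the parameters structure are described on its own page.

```matlab
% Hypothetical ranges; the field names below are illustrative only
ranges.npeaks=[1 3];           % min and max number of peaks
ranges.sigma=[2 10];           % peak width limits, Angstroms
ranges.exchange=[0 5e6];       % exchange coupling limits
ranges.noise=[0 0.05];         % noise standard deviation limits

% Uniform draw of n values from the interval lim=[low high]
draw=@(lim,n) lim(1)+(lim(2)-lim(1))*rand(1,n);

npeaks=randi(ranges.npeaks);         % integer number of peaks
sigma=draw(ranges.sigma,npeaks);     % one width per peak
J=draw(ranges.exchange,1);           % one exchange coupling per trace
noise_lvl=draw(ranges.noise,1);      % one noise level per trace
```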


    [output arguments]=deer_lib_gen(file_name,parameters)


    file_name      - name of the output *.mat file; include
                     a full path to specify the output location
    parameters     - training set parameters, with fields
                     described here.


The function output arguments are listed below, in the order expected:

    time_grid      - the time axis for the DEER traces, in
		     seconds. A row vector.

    dist_grid      - distance grid for the distributions,
                     in Angstroms. A row vector.

    dist_distr_lib - all distance distributions, a horizontal
		     array of column vectors.

    deer_ffact_lib - all DEER form factors, a horizontal
		     array of column vectors.

    background_lib - all background signals, a horizontal
		     array of column vectors.

    deer_trace_lib - all primary DEER traces, a horizontal
                     array of column vectors.

    noise_line_lib - all noise tracks, a horizontal array
                     of column vectors.

    exchange_lib   - exchange coupling used in generating
                     each trace, a row vector.

    parameters     - the parameters structure, unchanged.

The function may be called with a specified file name (including a full path), in which case the output arguments are also saved in a .mat file at that location. If the file name input is left empty then the database is not saved.
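
The sketch below illustrates the documented array layout (row-vector grids, one trace or distribution per column) and a save/reload round trip. It does not call deer_lib_gen: the arrays are random placeholders, the temporary file stands in for the saved database, and the assumption that the variables inside the .mat file carry the same names as the output arguments is ours.

```matlab
% Placeholder arrays with the documented layout (values are random)
ntraces=5;
time_grid=linspace(0,2e-6,64);                 % row vector, seconds
dist_grid=linspace(15,80,32);                  % row vector, Angstroms
dist_distr_lib=rand(numel(dist_grid),ntraces); % one distribution per column
deer_trace_lib=rand(numel(time_grid),ntraces); % one trace per column

% Save and reload through a temporary .mat file
fname=[tempname '.mat'];
save(fname,'time_grid','dist_grid','dist_distr_lib','deer_trace_lib');
lib=load(fname);
delete(fname);

% Extract the third trace/distribution pair as column vectors
idx=3;
p_idx=lib.dist_distr_lib(:,idx);
s_idx=lib.deer_trace_lib(:,idx);
```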



The example below first loads the netset parameters from the ensemble optimised for all peak widths, and then generates a library of 1000 trace/distribution pairs. The example may be run from the examples/deernet/ directory.

	% Load the training set parameters

	% Specify number of traces to produce

	% Set the training database name

	% Generate the training library

This example will add the output libraries to the MATLAB workspace, as well as saving them in the working directory as "dlg_example_set.mat".


  • As the dipolar modulation frequency is inversely proportional to the cube of the inter-spin distance, a scaling relationship exists between the distance range and the duration of the DEER signal.

\[\frac{t_A}{r_A^3}=\frac{t_B}{r_B^3} \]

  • An important factor when generating data for training is the dynamic range, i.e. the ratio between the longest and shortest distances represented in the training set.
  • The training set DEER traces should be sufficiently discretised (parameters.npoints) to reproduce all frequencies present.
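
The notes above can be turned into a quick numerical check. The sketch first rescales a trace duration with the cubic relationship, then estimates a lower bound on parameters.npoints from the Nyquist criterion. All numerical values are illustrative; the dipolar constant of 52.04 MHz·nm³ assumes two electron spins with g ≈ 2.0023; exchange coupling is taken as zero, so that the fastest oscillation is the parallel-orientation frequency at the shortest distance; and treating the Nyquist limit as the meaning of "sufficiently discretised" is our reading, so the result should be regarded as a minimum rather than a recommendation.

```matlab
% Scaling a reference trace duration to a longer distance range
r_A=50; t_A=2e-6;               % reference distance (Angstroms) and duration (s)
r_B=80;                         % new longest distance, Angstroms
t_B=t_A*(r_B/r_A)^3;            % 8.192e-6 s

% Sampling needed to resolve the fastest dipolar oscillation (J=0 assumed)
r_min=20;                       % shortest distance, Angstroms
f_max=2*52.04e6/(r_min/10)^3;   % parallel-orientation frequency, Hz
dt_max=1/(2*f_max);             % Nyquist limit on the time step, s
npoints_min=ceil(t_B/dt_max)+1; % lower bound for parameters.npoints
```

Because the required number of points grows with both the trace duration (set by the longest distance) and the highest frequency (set by the shortest distance), a large dynamic range drives the point count up quickly.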

See also

netset_curate.m, train_one_net.m

Version 2.2, authors: Ilya Kuprov, Steve Worswick, Gunnar Jeschke