# deer_lib_gen.m

Generates a library of distance distributions and corresponding simulated primary DEER traces using the parameters supplied, in a four-step process:

1. A batch of simulated spin label distance distributions is generated, each composed of a randomly selected number of skew normal distributions:

   $p(r)=\frac{2}{\sigma \sqrt{2 \pi}}e^{- \frac{(r-r_0)^2}{2 \sigma ^2}} \int_{- \infty}^{\alpha \left( \frac{r-r_0}{\sigma} \right)} e^{- \frac{t^2}{2}}dt$

2. The DEER form factor d(t) is computed from the distributions using the kernel for DEER in the presence of exchange coupling (Equation 2 in our paper):

   $\gamma(r,t)=\sqrt{\frac{\pi}{6Dt}} \left[ \cos[(D+J)t] \, \mathrm{FrC} \left[ \sqrt{\frac{6Dt}{\pi}} \right] + \sin[(D+J)t] \, \mathrm{FrS} \left[ \sqrt{\frac{6Dt}{\pi}} \right] \right]$

3. Simulated background b(t) and additive noise track n(t) are mixed with the DEER form factor:

   $s(t)=[1-\lambda+\lambda d(t)]b(t)+n(t)$

   • the background signal is generated as a stretched exponential function,

     $b(t)=\exp \left[ -(kt)^{n/3} \right]$

   • the noise track is uncorrelated, representing the instrumental noise expected during the indirect acquisition in the DEER method.
4. The distance distributions and DEER traces are then scaled to the neural network activation range:
   • distributions are uniformly scaled to 0.75,
   • DEER traces are scaled and shifted to make the first and last points equal to 1 and 0 respectively.
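
The four steps above can be sketched for a single trace as follows. This is a minimal illustration under stated assumptions, not the code used inside deer_lib_gen: all numerical values are hypothetical, the dipolar coupling is assumed to be 2π × 52.04 MHz·nm³ / r³ (two g≈2 electron spins), the noise is assumed Gaussian, and "scaled to 0.75" is read as setting the peak height to 0.75. The form factor is obtained by numerical orientation averaging rather than through Fresnel functions; the two routes agree because the closed-form kernel above equals the integral of cos[(D(1−3x²)+J)t] over x from 0 to 1.

```matlab
% Illustrative sketch; all numerical values below are hypothetical
r=linspace(20,80,256);             % distance grid, Angstrom (row vector)
t=linspace(0,3e-6,256)';           % time grid, seconds (column vector)

% Step 1: a single skew normal component; the integral in p(r) is
% evaluated as sqrt(pi/2)*erfc(-alpha*z/sqrt(2)) with z=(r-r0)/sigma
r0=35; sigma=4; alpha=2;
z=(r-r0)/sigma;
p=(2/(sigma*sqrt(2*pi)))*exp(-z.^2/2).*sqrt(pi/2).*erfc(-alpha*z/sqrt(2));
p=p/trapz(r,p);                    % normalise to unit area

% Step 2: form factor by numerical orientation averaging
D=2*pi*52.04e6./(r/10).^3;         % assumed dipolar coupling, rad/s
J=0;                               % exchange coupling, rad/s
x=linspace(0,1,501);               % grid of cos(theta) values
d=zeros(size(t));
for m=1:numel(r)
    freq=D(m)*(1-3*x.^2)+J;        % orientation-dependent frequency
    d=d+p(m)*trapz(x,cos(t*freq),2);
end
d=d*(r(2)-r(1));                   % rectangle rule over the distance grid

% Step 3: stretched exponential background, noise, and mixing
lambda=0.4; k=2e5; bg_dim=3;       % modulation depth, decay rate, exponent n
b=exp(-(k*t).^(bg_dim/3));
n_t=0.02*randn(size(t));           % uncorrelated Gaussian noise track
s=(1-lambda+lambda*d).*b+n_t;

% Step 4: scaling to the network activation range
p_scaled=0.75*p/max(p);            % assumption: peak height set to 0.75
s_scaled=(s-s(end))/(s(1)-s(end)); % first point 1, last point 0
```

The orientation grid `x` has to be fine enough for the shortest distances that carry significant probability; the closed-form kernel avoids that concern.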

For each pair in the training set, the variables controlling the shape of the distance distribution, the amount of exchange coupling, the form of the background contribution to the signal, and the noise level are randomly selected from the ranges provided in the parameters structure.
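
Random selection from ranges can be sketched as below. The structure `ranges`, its field names, the numerical limits, and the use of uniform sampling are all assumptions made for illustration; they are not the fields expected by deer_lib_gen, which are described on the parameters page.

```matlab
% Hypothetical parameter ranges, each given as [lower upper]
ranges.n_peaks=[1 3];          % number of skew normal components
ranges.centre=[20 60];         % peak position, Angstrom
ranges.width=[2 10];           % peak width sigma, Angstrom
ranges.exchange=[0 5e6];       % exchange coupling, rad/s
ranges.bg_rate=[0 5e5];        % background decay rate k, 1/s
ranges.noise=[0 0.05];         % noise standard deviation

% Uniform draw from a [lower upper] interval
draw=@(lim)lim(1)+(lim(2)-lim(1))*rand();

% One random parameter set for a single trace/distribution pair
n_peaks=randi(ranges.n_peaks);
centres=arrayfun(@(idx)draw(ranges.centre),1:n_peaks);
widths=arrayfun(@(idx)draw(ranges.width),1:n_peaks);
J=draw(ranges.exchange);
bg_rate=draw(ranges.bg_rate);
noise_lvl=draw(ranges.noise);
```

Repeating the draw once per library entry gives every trace/distribution pair its own combination of peak shapes, exchange coupling, background, and noise level.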

## Syntax

    deer_lib_gen(file_name,parameters)

    [output arguments]=deer_lib_gen(file_name,parameters)


## Arguments

    file_name      - name of the output *.mat file; include
                     a full path to specify the output
                     location.

    parameters     - training set parameters, with fields
                     described here.


## Outputs

The function output arguments are listed below, in the order expected:

    time_grid      - the time axis for the DEER traces, in
                     seconds. A row vector.

    dist_grid      - distance grid for the distributions,
                     in Angstroms. A row vector.

    dist_distr_lib - all distance distributions, a horizontal
                     array of column vectors.

    deer_ffact_lib - all DEER form factors, a horizontal
                     array of column vectors.

    background_lib - all background signals, a horizontal
                     array of column vectors.

    deer_trace_lib - all primary DEER traces, a horizontal
                     array of column vectors.

    noise_line_lib - all noise tracks, a horizontal array
                     of column vectors.

    exchange_lib   - exchange coupling scalar used in
                     generating each trace, a row vector.

    parameters     - the parameters structure, unchanged.


The function may be called with a specified file name (including a full path), in which case the output arguments are also saved in a .mat file at that location. If the file name input is left empty then the database is not saved.

        file_name=[];


## Examples

The example below first loads the netset parameters from the ensemble optimised for all peak widths, and then generates a library of 1000 trace/distribution pairs. The example may be run from the examples/deernet/ directory.

    % Load the training set parameters
    run('net_set_any_peaks/netset_params.m');

    % Specify number of traces to produce
    parameters.ntraces=1000;

    % Set the training database name
    file_name='dlg_example_set.mat';

    % Generate the training library
    [time_grid,dist_grid,dist_distr_lib,...
     deer_ffact_lib,background_lib,deer_trace_lib,...
     noise_line_lib,exchange_lib,parameters]=deer_lib_gen(file_name,parameters);


This example adds the output libraries to the MATLAB workspace and also saves them in the working directory as "dlg_example_set.mat".

## Notes

• As the dipolar modulation frequency is inversely proportional to the cube of the inter-spin distance, a scaling relationship exists between the distance range and the duration of the DEER signal.

$\frac{t_A}{r_A^3}=\frac{t_B}{r_B^3}$

• An important factor when generating data for training is the dynamic range - the ratio between the longest and shortest distances represented in the training set.
• The training set DEER traces should be sufficiently discretised (parameters.npoints) to reproduce all frequencies present.
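
The three notes above can be tied together numerically. The sketch below is an illustration under stated assumptions: a dipolar constant of 52.04 MHz·nm³ (two g≈2 electron spins), a hypothetical distance range, exchange coupling ignored, and a trace duration chosen to hold one full dipolar period at the longest distance, which is a rule of thumb adopted here rather than something prescribed by deer_lib_gen.

```matlab
% Assumed dipolar constant for two g~2 electron spins, Hz*nm^3
nu_dd=52.04e6;

% Hypothetical distance range of the training set, Angstrom
r_min=15; r_max=80;

% Scaling relation t_A/r_A^3 = t_B/r_B^3: a 3 us trace at 50 Angstrom
% maps onto a 24 us trace at 100 Angstrom
t_A=3e-6; r_A=50; r_B=100;
t_B=t_A*(r_B/r_A)^3;

% Dynamic range of the distance axis
dyn_range=r_max/r_min;

% Highest frequency present: twice the dipolar frequency at r_min
% (parallel orientation), with exchange coupling ignored
f_max=2*nu_dd/(r_min/10)^3;          % Hz

% Assumed trace duration: one full dipolar period at r_max
t_max=(r_max/10)^3/nu_dd;            % seconds

% Nyquist criterion: sampling step no longer than 1/(2*f_max)
npoints=ceil(2*f_max*t_max)+1;
```

Under these assumptions the required number of points grows as roughly four times the cube of the dynamic range, which is why a wide distance range quickly becomes expensive to discretise.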