guess_j_pro.m
A reasonable attempt at a function that guesses and assigns all J-couplings in a protein.
This function assigns J-couplings in proteins from atomic coordinates using semiempirical estimates. It is a graph-theoretical estimator with the following stages:
- The molecular bonding graph is partitioned into connected subgraphs of size two, and one-bond J-couplings are assigned from a complete database of atom pairs. Our experience indicates that there are fewer than 100 unique connected atom pairs in regular proteins, and that most one-bond J-couplings within those pairs can be found in the literature, measured in individual amino acids, or estimated with sufficient accuracy using electronic structure theory software.
- The molecular bonding graph is partitioned into connected subgraphs of size three, and two-bond J-couplings are assigned from a complete database of connected atom triples. The number of unique connected atom triples is also manageable: fewer than 150 in regular proteins, a small enough number for an exhaustive list to be compiled from experiments, literature and electronic structure theory estimates.
- The molecular bonding graph is partitioned into sequentially connected subgraphs of size four and dihedral angles are computed from atomic coordinates, allowing three-bond J-couplings to be assigned from a complete database of Karplus curves. Karplus curves are a well-researched topic, with specific data available for the backbone and less accurate generic curves available for the rest of the structure. The number of unique sequentially connected atom quartets found in proteins (fewer than 300, many belonging to similar structural types) is sufficiently small for a complete database of Karplus curves to be compiled from literature data, experiments, and electronic structure theory estimates.
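The three-bond stage can be illustrated with a minimal sketch that computes a dihedral angle from four sets of coordinates and feeds it into a Karplus-type curve. This is not the code inside guess_j_pro.m: the coordinates, the variable names, and the Karplus coefficients A, B, C below are made-up placeholders chosen only to show the arithmetic, and the curve form J = A*cos(phi)^2 + B*cos(phi) + C is the generic textbook one.

```matlab
% Hypothetical coordinates (Angstrom) of four sequentially bonded atoms
% a-b-c-d, arranged here in a planar trans geometry
r_a=[0  1 0];
r_b=[0  0 0];
r_c=[1  0 0];
r_d=[1 -1 0];

% Bond vectors along the chain
b1=r_b-r_a;
b2=r_c-r_b;
b3=r_d-r_c;

% Signed dihedral angle in radians from the standard atan2 formula
phi=atan2(norm(b2)*dot(b1,cross(b2,b3)),...
          dot(cross(b1,b2),cross(b2,b3)));

% Generic Karplus curve; the coefficients are placeholders in Hz,
% NOT values taken from the database inside guess_j_pro.m
karplus=@(phi,A,B,C) A*cos(phi).^2+B*cos(phi)+C;
A=7.0; B=-1.0; C=1.0;
J=karplus(phi,A,B,C);

fprintf('dihedral = %.1f degrees, J = %.2f Hz\n',phi*180/pi,J);
```

In a real assignment, the coefficients would be looked up according to the types of the four atoms in the quartet.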
J-couplings across more than three bonds are ignored.
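The partitioning described in the stages above can be sketched as follows. This is an illustrative enumeration of bonded pairs, triples and sequentially connected quartets on a toy graph, under the assumption that the bonding graph is available as a symmetric logical adjacency matrix; the variable names (bonds, conmatrix, pairs, triples, quartets) are hypothetical and the actual function may organise this differently.

```matlab
% Toy bonding graph (hypothetical): bonds 1-2, 2-3, 3-4 and 3-5
bonds=[1 2; 2 3; 3 4; 3 5];
natoms=5;

% Symmetric logical adjacency matrix
conmatrix=false(natoms);
for k=1:size(bonds,1)
    conmatrix(bonds(k,1),bonds(k,2))=true;
    conmatrix(bonds(k,2),bonds(k,1))=true;
end

% Size two: every bond, counted once via the upper triangle
[p,q]=find(triu(conmatrix));
pairs=[p q];

% Size three: two distinct neighbours of a common central atom
triples=zeros(0,3);
for centre=1:natoms
    nb=find(conmatrix(centre,:));
    for i=1:numel(nb)
        for j=(i+1):numel(nb)
            triples(end+1,:)=[nb(i) centre nb(j)]; %#ok<SAGROW>
        end
    end
end

% Size four, sequential: a-b-c-d built around each central bond b-c
quartets=zeros(0,4);
for k=1:size(pairs,1)
    b=pairs(k,1); c=pairs(k,2);
    nb=setdiff(find(conmatrix(b,:)),c);   % neighbours of b other than c
    nc=setdiff(find(conmatrix(c,:)),b);   % neighbours of c other than b
    for a=nb
        for d=nc
            if a~=d
                quartets(end+1,:)=[a b c d]; %#ok<SAGROW>
            end
        end
    end
end

fprintf('%d pairs, %d triples, %d quartets\n',...
        size(pairs,1),size(triples,1),size(quartets,1));
```

Each row of pairs, triples and quartets would then be converted into an atom-type descriptor and looked up in the corresponding database; couplings between the end atoms of longer paths are simply never generated.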
aa_num - nspins x 1 vector giving the number of the amino acid to which each spin belongs
aa_typ - nspins x 1 cell array of strings giving the PDB identifier of the amino acid to which each spin belongs (e.g. 'TYR')
pdb_id - nspins x 1 cell array of strings giving the PDB identifier of the protein atom type to which each spin belongs (e.g. 'HE2')
coords - nspins x 1 cell array of 3-vectors giving Cartesian coordinates of each spin in Angstrom
jmatrix - nspins x nspins sparse matrix of J-couplings in Hz
The function accepts, for example, the output of read_pdb_pro.m:
    % Parse the PDB file
    [pdb_aa_num,pdb_aa_typ,pdb_atom_id,pdb_coords]=read_pdb_pro('1D3Z.pdb',1);

    % Guess the J-couplings
    jmatrix=guess_j_pro(pdb_aa_num,pdb_aa_typ,pdb_atom_id,pdb_coords);
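Because the result is a sparse matrix, the assigned couplings can be listed with the built-in find function. The sketch below uses a small made-up matrix as a stand-in for the output of guess_j_pro.m, so that it runs on its own; the coupling values are placeholders, and whether the real output stores each coupling once or symmetrically is not assumed here.

```matlab
% Stand-in for the returned jmatrix: three spins, two couplings (Hz);
% the values are placeholders, not output of guess_j_pro.m
jmatrix=sparse([1 2],[2 3],[140.0 -11.0],3,3);

% Row index, column index and value of every non-zero element
[rows,cols,vals]=find(jmatrix);

% Print one line per stored coupling
for k=1:numel(vals)
    fprintf('spins %d and %d: J = %.1f Hz\n',rows(k),cols(k),vals(k));
end
```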
- You can extend or modify the J-coupling database by editing the function text.
- Atoms in the subgraph descriptors are listed alphabetically to make the descriptors unique.
- The four numbers in the subgraph descriptors refer to the bonding order, e.g. [1, 3, 2, 4] means that the first atom in the descriptor is bonded to the third, which is bonded to the second, which is bonded to the fourth. The coupling in this case is between atom 1 and atom 4 in the descriptor.
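The descriptor convention in the notes above can be illustrated with a short sketch. It takes a hypothetical chain of four bonded atoms, sorts the atom names alphabetically to form the descriptor, and derives the bonding-order vector that recovers the chain from the sorted names. The atom names and variable names are chosen for illustration only, and the choice of which end of the chain to start from is an assumption, since the reverse traversal describes the same subgraph.

```matlab
% Hypothetical chain of four bonded atoms, listed in bonding order
chain={'HA','CA','CB','HB2'};

% Alphabetical sorting gives the unique descriptor
[descriptor,sort_idx]=sort(chain);

% Invert the sorting permutation: bond_order(j) is the position in the
% descriptor of the j-th atom along the chain
bond_order=zeros(1,numel(chain));
bond_order(sort_idx)=1:numel(chain);

% The coupled atoms are the first and the last atom along the chain
coupled=descriptor(bond_order([1 end]));

fprintf('descriptor: %s\n',strjoin(descriptor,' '));
fprintf('bonding order: [%s]\n',num2str(bond_order));
fprintf('coupling between %s and %s\n',coupled{1},coupled{2});
```

Here the descriptor is CA, CB, HA, HB2 and the bonding order is [3, 1, 2, 4]: the third descriptor atom is bonded to the first, which is bonded to the second, which is bonded to the fourth.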
- J-couplings produced by this function are rough estimates. For accurate protein work you must supply your own J-couplings.
- This is an auxiliary function that is called by protein.m; direct calls are discouraged.