Cover Figure (left; higher resolution on right).
Caption: Systematic Search for Diacylglycerol Kinase Structure.
We present a general method for structure determination of protein
homoöligomers and demonstrate the method on Diacylglycerol kinase
(DAGK). We conclude that the differences in the published NMR and
crystal structures are due to limitations of current NMR structure
determination methodology. We overcame these limitations by using a
new Fold-Operator Theory to systematically search the space of folds
and predict distinct fold topologies. The image illustrates the
concept of a large number of different folds of DAGK, which all
plausibly satisfy the NMR data. The toroidal configuration space
illustrates how Fold-Operator Theory obtains a systematic search
algorithm over the different possible folds. We report 7 new folds
that are topologically distinct, and differ by up to 12 Å
(transmembrane helix backbone RMSD) from either the previously
published NMR structure or the crystal structure.
Image created for this paper by Lei Chen and Yan
Liang (L2Molecule.com).
Abstract. Protein structure determination by NMR has predominantly relied on simulated annealing-based conformational search for a converged fold using primarily distance constraints, including constraints derived from nuclear Overhauser effects (NOEs), paramagnetic relaxation enhancement (PRE), and cysteine crosslinking. Although there is no guarantee that the converged fold represents the global minimum of the conformational space, it is generally accepted that good convergence is synonymous with reaching the global minimum. Here, we show that this criterion breaks down in the presence of large numbers of ambiguous constraints from NMR experiments on homoöligomeric protein complexes. A systematic evaluation of the conformational solutions that satisfy the NMR constraints of a trimeric membrane protein, Diacylglycerol kinase (DAGK), reveals 9 distinct folds, including the reported NMR and crystal structures. This result highlights the fundamental limitation of global fold determination for homoöligomeric proteins using ambiguous distance constraints and provides a systematic solution for exhaustive enumeration of all satisfying solutions.
Summary.
Simulated annealing is a primary method for structure determination of
proteins by nuclear magnetic resonance (NMR) spectroscopy. NMR
restraints and biophysical principles are encoded into an energy
function whose minimization results in models of the protein structure
that satisfy the restraints. If the method consistently returns
similar structures that adequately satisfy the restraints, the
structural ensemble is considered well-converged and the structure
determination successful, although low restraint violation
and convergence do not necessarily mean the structure is
accurate. The main strength of simulated annealing is its ability to
transform a coarse structural model into a more refined structure with
improved restraint satisfaction. Where the method falls short is its
inability to exhaustively sample topologically distinct structural
models. Therefore, it can become trapped in local minima of the
energy landscape, thus missing genuine fold(s) with similar or
lower energies. Further complicating the situation, even if the
global minimum structure of the energy function could be obtained,
small inaccuracies in the energy function (e.g. due to approximation
of complex physical phenomena or misinterpretation of even a few
experimental distance constraints) could cause a genuine fold to be
incorrectly ranked with a higher energy than the erroneous
folds. Although such a situation is considered rare when all distance
constraints are uniquely assigned, the odds increase significantly in
the presence of ambiguous distance restraints for structure
determination of homoöligomeric protein complexes.
Ambiguous distance restraints (ADRs) refer to distance information (such as NOEs) that cannot be uniquely attributed to a single pair of atoms. Since the chemical shifts of equivalent atoms in all subunits in a homoöligomeric complex are identical and thus indistinguishable, ADRs are unavoidable for distance measurements in trimers and higher-order homoöligomers. We refer to this phenomenon as subunit ambiguity. For dimers, separating intra- vs inter-subunit NOEs using X-filtered NOESY is sufficient to resolve subunit ambiguity. For trimers and higher-order oligomers, even after a distance restraint has been classified as inter-subunit, it still has at least two possible assignments and therefore remains ambiguous. ADRs account for degenerate atom pairs by using an averaging function derived from a mean field approximation. Although it has been demonstrated that genuine interactions can be extracted from ADRs, these methods are prone to becoming trapped in local minima since they rely heavily on the initial fold to remove assignment ambiguity. The energy landscapes for homoöligomers contain a large number of minima with similarly low energy, so when simulated annealing methods using ADRs become trapped in local minima, they can fail to report satisfying folds from other minima.
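To make subunit ambiguity concrete, here is a minimal Python sketch. It assumes the averaging function is the r^-6 sum commonly used for ambiguous NOE restraints; the text above does not state the exact form, so that choice is an assumption. The function names are hypothetical and chosen only for illustration.

```python
def adr_effective_distance(distances, power=6):
    """Effective distance for an ambiguous restraint.

    Assumption: the r^-6 sum, d_eff = (sum_i d_i^-6)^(-1/6), taken over all
    candidate atom pairs. The shortest candidate distance dominates the sum.
    """
    if not distances:
        raise ValueError("need at least one candidate distance")
    return sum(d ** -power for d in distances) ** (-1.0 / power)


def inter_subunit_partners(n_subunits, source=0):
    """Subunits that could hold the partner atom of an inter-subunit restraint."""
    return [s for s in range(n_subunits) if s != source]


def count_joint_assignments(n_subunits, n_restraints):
    """Number of ways to assign every inter-subunit restraint to one partner.

    Each restraint independently has (n_subunits - 1) candidate partners.
    """
    return (n_subunits - 1) ** n_restraints


# A dimer leaves one candidate partner; a trimer leaves two.
print(inter_subunit_partners(2))   # [1]
print(inter_subunit_partners(3))   # [1, 2]

# The close candidate pair (4 A) dominates the distant one (20 A).
print(adr_effective_distance([4.0, 20.0]))

# Ten ambiguous inter-subunit restraints in a trimer.
print(count_joint_assignments(3, 10))  # 1024
```

Because the short candidate distance dominates the sum, an ADR is satisfied as soon as any one candidate pair is close. This is why many different joint assignments, and hence different folds, can satisfy the same restraint set.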
This situation is further exacerbated in the case of homoöligomeric membrane proteins, for which dense restraint collection is often impractical. In the case of Diacylglycerol Kinase from Escherichia coli (henceforth, simply DAGK), a membrane-associated homotrimer, two different structures have been published. The solution NMR structure of DAGK, determined using ambiguously assigned distance restraints, possesses a domain-swapped subunit interface, while the crystal structure has a subunit with a more compact conformation and without domain-swapping.
Here we show that the difference between the two structures is due to the local minimum limitations of current methodology for NMR structure determination. We demonstrate that this limitation can be mitigated by searching over topologically distinct folds using a systematic approach called Fold-Operator Theory. Once an initial satisfying fold is discovered, mathematical operators transform the fold into alternate folds. The operators define a group action on the configuration space of protein folds. These alternative folds can be subsequently refined using traditional simulated annealing methods and evaluated for restraint satisfaction. Using this systematic approach, we found 48 distinct folds of DAGK, of which 9, including the published NMR and crystal folds, satisfied the experimental restraints after energy minimization.
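The idea of generating alternate folds by repeatedly applying operators to a seed fold can be sketched as an orbit enumeration. The Python below is a toy illustration only. The fold encoding (a tuple of binary choices) and the `flip` operators are hypothetical stand-ins, not the operators of Fold-Operator Theory; only the breadth-first orbit search is meant to carry over.

```python
from collections import deque


def enumerate_orbit(seed, operators):
    """Return every state reachable from `seed` by applying `operators`.

    Breadth-first search over the orbit of the seed; each state is visited
    once, so the search terminates whenever the orbit is finite.
    """
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue:
        fold = queue.popleft()
        for op in operators:
            candidate = op(fold)
            if candidate not in seen:
                seen.add(candidate)
                order.append(candidate)
                queue.append(candidate)
    return order


def flip(index):
    """Hypothetical operator: toggle one binary choice in the toy fold."""
    def op(fold):
        return fold[:index] + (1 - fold[index],) + fold[index + 1:]
    return op


# Toy fold: three independent binary choices (an assumption for illustration).
seed_fold = (0, 0, 0)
operators = [flip(i) for i in range(3)]

candidates = enumerate_orbit(seed_fold, operators)
print(len(candidates))  # 8
```

In a real workflow, each enumerated candidate would then be refined and scored for restraint satisfaction, as the paragraph above describes for DAGK.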
Significance.
We have demonstrated our method on DAGK, showing how to find a
remarkable variety of satisfying folds, but the method can also be
applied to other homoöligomeric proteins where ambiguous restraints
necessarily hinder structure determination with simulated annealing.
In some cases, structures seeded from one fold changed to another fold during refinement. There are eight such switches in total, which are shown in our paper. When viewed as a dynamical system, the network of fold switches has two prominent attractors. One is at fold O (the NMR fold) and the other is at fold B, which is not related to any published structure. See the blue letters in our paper to find the names of the folds. Six out of the top seven structures by Xplor total energy and six out of the top seven structures by RMS violation index were either seeded from, or switched to, one of these two attractor folds.
Surprisingly, the best fold by Xplor total energy was neither the fold of the NMR structure nor the fold of the crystal structure. Fold B has the lowest Xplor total energy and the second-lowest RMS violation index. It is topologically distinct from both the NMR and the crystal folds, and its three refined structures differ by 12.31-12.87 Å transmembrane helix backbone RMSD from the published NMR structure and by 12.77-12.83 Å from the published crystal structure. It also satisfies different subunit assignments of the intermolecular distance restraints than either published structure, which shows that Fold-Operator Theory was able to find previously unknown solutions to the restraint satisfaction problem for DAGK.
It is not yet known whether this new putative fold B has biological significance for DAGK. However, it must be emphasized that currently, based on all NMR measurements to date, (1) fold B is vastly different from the published structures, (2) it cannot be excluded as a possible structure, and, moreover, (3) it fits the NMR restraints as well as or better than the two published folds.
Conclusions.
We have presented a general method for structure determination of
protein homoöligomers and demonstrated the method on DAGK. We
conclude that the differences in the published NMR and crystal
structures are due to limitations of current NMR structure
determination methodology. We overcame these limitations by using
a new fold-operator theory to explicitly search the space of folds and
predict distinct fold topologies for further investigation. These
folds were used to reduce (and in some cases eliminate) ambiguity in
restraint assignments, which lessened the difficulty of subsequent
refinement of seed structures in Xplor-NIH. By explicitly performing a
search over topologically distinct folds, we avoided the implicit fold
search performed by local minimization methods, which can become
trapped in local minima and therefore fail to report satisfying
solutions. Using explicit fold-space search methods to address the
limitations of local minimization techniques such as simulated
annealing enables robust structure determination for difficult
homoöligomeric systems, particularly membrane-associated systems
hindered by the availability of only sparse and ambiguous restraints.