Cover Article

"Protein loop closure using orientational restraints from NMR data,"
by C. Tripathy, J. Zeng, P. Zhou and B. R. Donald.*
Duke University
PROTEINS Volume 80, Issue 2, pages 433-453, February 2012

DOI: 10.1002/prot.23207

Abstract. Protein loops often play important roles in biological functions. Modeling loops accurately is crucial to determining the functional specificity of a protein. Despite the recent progress in loop prediction approaches, which led to a number of algorithms over the past decade, few rigorous algorithmic approaches exist to model protein loops using global orientational restraints, such as those obtained from residual dipolar coupling (RDC) data in solution NMR spectroscopy. In this article, we present a novel, sparse-data, RDC-based algorithm, which exploits the mathematical interplay between RDC-derived sphero-conics and protein kinematics, and formulates the loop structure determination problem as a system of low-degree polynomial equations that can be solved exactly, in closed form. The polynomial roots, which encode the candidate conformations, are searched systematically, using provable pruning strategies that triage the vast majority of conformations, to enumerate or prune all possible loop conformations consistent with the data; therefore, completeness is ensured. Results on experimental RDC datasets for four proteins, including human ubiquitin, FF2, DinI and GB3, demonstrate that our algorithm can compute loops with higher accuracy, a 3- to 6-fold improvement in backbone RMSD, versus those obtained by traditional structure determination protocols on the same data. Excellent results were also obtained on synthetic RDC datasets for protein loops of length 4, 8 and 12 used in previous studies. These results suggest that our algorithm can be successfully applied to determine protein loop conformations, and hence will be useful in high-resolution protein backbone structure determination, including loops, from sparse NMR data.
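
The "RDC-derived sphero-conics" mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It only shows the underlying geometric fact, under the assumptions that the alignment tensor is already known and diagonalized (principal order frame, traceless diagonal `s_diag`) and that the RDC follows the standard form r = D_max * v^T S v for a unit internuclear vector v. Combining that equation with |v| = 1 gives a cone intersected with the unit sphere, which is a sphero-conic curve. The function names `rdc` and `sphero_conic_points`, the sampling scheme, and the numeric values are all hypothetical and chosen for illustration.

```python
import math


def rdc(v, s_diag, d_max=1.0):
    """RDC of unit vector v in the principal order frame (assumed form)."""
    x, y, z = v
    sxx, syy, szz = s_diag
    return d_max * (sxx * x * x + syy * y * y + szz * z * z)


def sphero_conic_points(r, s_diag, d_max=1.0, n=201):
    """Sample unit vectors (x >= 0, y >= 0 branch) whose RDC equals r.

    Subtracting r/d_max * (x^2 + y^2 + z^2) from the RDC equation gives
    a*x^2 + b*y^2 + c*z^2 = 0, which together with x^2 + y^2 + z^2 = 1
    is linear in (x^2, y^2) once z is fixed.
    """
    sxx, syy, szz = s_diag
    rp = r / d_max
    a, b, c = sxx - rp, syy - rp, szz - rp
    if abs(a - b) < 1e-12:
        raise ValueError("degenerate case a == b is not handled in this sketch")
    points = []
    for i in range(n):
        z = -1.0 + 2.0 * i / (n - 1)
        z2 = z * z
        x2 = (-c * z2 - b * (1.0 - z2)) / (a - b)
        y2 = (a * (1.0 - z2) + c * z2) / (a - b)
        if x2 < 0.0 or y2 < 0.0:
            continue  # no real solution at this z
        points.append((math.sqrt(x2), math.sqrt(y2), z))
    return points


if __name__ == "__main__":
    # Illustrative traceless tensor and RDC value (not from the paper).
    tensor = (-0.3, -0.5, 0.8)
    for p in sphero_conic_points(0.2, tensor):
        print(p, rdc(p, tensor))
```

Flipping the signs of x and y yields the remaining branches of the curve. A single RDC therefore leaves a one-dimensional family of orientations, which is why additional restraints and the chain's kinematics are needed to pin down a conformation.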