Abstract
Defining the strength and geometry of the hydrogen bond in protein structures has been a challenging task since the early days of structural biology. We apply a novel statistical machine learning technique, known as contrastive divergence, in the context of efficient Monte Carlo sampling to estimate both the hydrogen bond strength and the geometric characteristics of strong hydrogen bonds from a dataset of structures representing a variety of protein folds. In good agreement with earlier experimental estimates, we determine the strength of the hydrogen bond to be between 1.1 and 1.5 kcal/mol. The geometry of strong hydrogen bonds features an almost linear arrangement of all four atoms involved in hydrogen bond formation. We estimate that about a quarter of all hydrogen bond donors and acceptors participate in strong intra-peptide hydrogen bonds.